[okfn-discuss] Submitting comments to the Library of Congress?
Jonathan Gray
jonathan.gray at okfn.org
Wed Dec 12 05:59:59 UTC 2007
Hi,
After meeting with Aaron Swartz of The Open Library, we thought having a
shorter version might be good to get more signatories before Friday.
Aaron's prepared something here:
http://www.okfn.org/wiki/OpenBibliographicData
Again - comments appreciated.
Do people think it should be more/less strong?
Regards,
Jonathan
Jonathan Gray wrote:
> Hi all,
>
> Below is a (rather long) draft response to the LoC draft report:
>
> http://www.okfn.org/wiki/FutureOfBibliographicControl
>
> Any comments would be much appreciated!
>
> (It's also inline below.)
>
> Regards,
>
> Jonathan
>
>
> = Open bibliographic data? - comments on draft report by the Working
> Group on the Future of Bibliographic Control at the Library of Congress =
>
> 14th December 2007
>
>
> '''Rufus Pollock, The Open Knowledge Foundation''' [[BR]]
> '''Jonathan Gray, The Open Knowledge Foundation''' [[BR]]
> '''Peter Suber, The Scholarly Publishing and Academic Resources
> Coalition''' [[BR]]
> '''Aaron Swartz, The Open Library''' [[BR]]
>
>
> == Introduction ==
>
> This document is a response to the call for comments on a draft released
> by the Working Group on the Future of Bibliographic Control on 30th
> November 2007 [1].
>
> We think it is laudable that the Working Group have recommended that the
> Library of Congress takes a more active role in leading the library world
> into the 21st century. Their vision of a bibliographic control ecosystem
> which is "collaborative, decentralized, international in scope and
> web-based" (p. 1) is timely.
>
> However, we are concerned that there is no explicit mention of the
> potential benefits of open licensing for bibliographic data. Over the
> past few years, open licensing has facilitated the explosive growth of a
> 'knowledge commons'. To give a few prominent examples: Open Access
> journals, Open Educational Resources and Open Data in scientific
> research [2] have all been enabled by licenses which permit material to
> be freely re-used and re-distributed [3].
>
> We believe open licensing would strongly help to catalyse the
> flourishing of an information ecology for bibliographic data - by
> allowing and encouraging anyone to share, modify and build on it. Openly
> licensed bibliographic data would allow users and developers to:
> * improve the quality of the data by correcting errors, and adding
> ancillary information;
> * attempt to harmonise and integrate data that is from multiple
> sources, in different formats and which adheres to different sets of
> standards;
> * use technologies such as wikis and versioning systems to facilitate
> the collaborative development of data [4];
> * host bibliographic data and experiment with distributed data
> provision and access;
> * combine bibliographic datasets with other material - such as
> user-contributed reviews, images and 'tags';
> * build innovative (web) applications to explore and represent the
> wealth of information contained in bibliographic records, e.g. through
> datamining and/or visualization technologies [5];
> * extract structured, machine-readable data from bibliographic records
> and to link this to other open datasets in the emerging semantic web of
> data [6].
>
> New kinds of technologies are emerging very rapidly - and we think that
> one of the best ways for the library community to see the fruits of
> these developments applied to bibliographic data is to permit greater
> experimentation with the data by the wider technical community - and the
> general public. Placing restrictions on how bibliographic records may be
> re-used effectively inhibits community-led development and innovative
> 'tinkering'. One of the implicit principles of more 'open' models of
> development is that 'the most interesting thing to be done with your
> material will be thought of by someone else'. This kind of thought
> resonates strongly with the "decentralised", "dynamic", "collaborative"
> ethos propagated in the report, in which users and third party
> organisations are encouraged to play a more active role in bibliographic
> control.
>
> == Summary of key comments ==
>
> * The potential benefits of open licensing should be mentioned in the
> draft. We've identified several places where such mention may be
> appropriate.
> * The draft should strive to acknowledge a broad spectrum of parties
> who may contribute to an ecosystem of bibliographic control, and who
> benefit from shared bibliographic data - including individual technical
> developers, enthusiasts and a diverse variety of third party
> organisations - rather than simply libraries, library users and
> commercial contractors. (Cf. comments on p. 1, par. 1)
> * Open licensing can help to lower or remove transaction costs. (Cf.
> comments on p. 1, par. 1)
> * We urge that even if value-added data products or services are sold
> in order to recover costs, openly licensing 'raw' bibliographic data is
> still considered. (Cf. comments on p. 4, par. 3)
> * The LC should take into account short and long term opportunities to create
> 'public value' as well as opportunities for market growth when
> considering making alterations to its pricing structure. (Cf. comments
> on p. 8, par. 1; p. 13, sect. 1.1.4)
> * The report should explicitly acknowledge significant work by
> non-profit organisations in the areas of digitisation and bibliographic
> control as well as contributions of commercial vendors. (Cf. comments on
> p. 8, par. 2)
> * The Library of Congress should take a leading role in encouraging
> bibliographic data to be shared - encouraging other individual libraries
> to make their data available under an open license where possible. (Cf.
> comments on p. 8, par. 5)
> * Open bibliographic data would encourage relevant groups to improve
> and build on each other's work rather than doubling up effort in
> parallel development. (Cf. comments on p. 9, par. 1)
> * A strong culture of sharing bibliographic information may help
> libraries avoid becoming over-dependent on third party contractors to
> replace work currently done by the Library of Congress. (Cf. comments on p.
> 15, sect. 1.2)
> * The products of digitizing material that is in the public domain
> should be made available under an open license where possible. (Cf.
> comments on pp. 19-20, sect. 2; p. 21, sect 2.4)
> * The Library of Congress should implement changes in metadata
> standards such that there is a field within each bibliographic record
> to specify the license the record is available under. (Cf. comments on
> pp. 21-26, sect. 3.)
>
>
> == Comments on the Draft Report ==
>
> N.B. We take 'bibliographic data' to refer to metadata concerning
> library holdings - primarily in the form of bibliographic records.
>
> === Introduction ===
>
> p. 1, par. 1
>
> "Its realization will occur in cooperation with the private sector, and
> with the active collaboration of library users."
> * The implied distinction - between formal cooperation with the private
> sector and input from ordinary library users - may become increasingly
> blurred. We think it would be valuable to recognise the potential
> for a broad spectrum of collaborators ranging
> between these two poles - including individual technical developers and
> smaller groups who might wish to re-use or add value to bibliographic
> data without necessarily, e.g., contracting with the relevant producer.
>
> "Data will be gathered from multiple sources; change will happen
> quickly; and bibliographic control will be dynamic, not static."
> * Open licensing would help to ensure that bibliographic control is
> dynamic and that change happens quickly by eradicating the requirement
> that every user asks permission from every data producer for each new
> application of bibliographic information.
>
> "Libraries must continue the transition to this future without delay in
> order to retain their relevance as information providers."
> * As mentioned above, openly licensing bibliographic material would
> help to accelerate this transition by allowing third parties to
> experiment with innovative ways of re-using it and building on it -
> including the development of new kinds of applications, services,
> plugins, and so on.
>
>
> === Background ===
>
> p. 4, par. 3
>
> "According to current congressional regulations, LC is permitted to
> recover only direct costs for services provided to others. As a result,
> the fees that the Library charges do not cover the most expensive aspect
> of cataloging: namely, the cost of the intellectual work. . The
> economics of creating LC's products have changed dramatically since the
> time when the Library was producing cards for library catalogs. It is
> now time to reevaluate the pricing of LC's product line in order to
> develop a business model that allows LC to more substantially recoup its
> actual costs."
>
> * Reevaluating product pricing is arguably only one of several routes
> to cost recovery. Also, while the LC might recoup costs through
> revenue from value-added products and services, we hope
> this does not preclude any effort to encourage the circulation of its
> raw data.
>
> === Guiding Principles ===
>
> p. 7, par. 3
>
> "Different communities of bibliographic practice have grown up around
> different resource types: library collections of books and journals,
> archives, journal articles, and museum objects and images. As these
> resources and others become increasingly accessible through the Web,
> separation of the communities of practice that manage them is no longer
> desirable, sustainable, or functional. Bibliographic control is
> increasingly a matter of managing relationships—among works, names,
> concepts, and object descriptions—across communities. Consistency of
> description within any single environment, such as the library catalog,
> is becoming less significant than the ability to make connections
> between environments: Amazon to WorldCat to Google to PubMed to
> Wikipedia, with library holdings serving as but one node in this web of
> connectivity. In today's environment, bibliographic control cannot
> continue to be seen as limited to library catalogs."
>
> * Again, open licensing could be mentioned here, given this projected
> decentralisation and the importance of widespread collaboration among
> many different parties.
>
> p. 8, par. 1
>
> "Once considered a public good, information access is today a commodity
> in a rapidly-growing marketplace. Many information resources formerly
> managed in the not-for-profit sector are now the objects of a
> significant for-profit economy. Entities in this latter economy have
> financial capabilities far beyond those of libraries. Further, they have
> the resources to engage in large scale research and development."
>
> * We think it's crucial to strike a balance here between
> encouraging public benefit and market growth. Openly licensed
> bibliographic data would allow the general public to benefit from being
> able to freely re-use and re-distribute it, as well as commercial
> organisations to benefit from being able to re-use it in their products
> and services. Increased commercial exploitation would also arguably
> indirectly generate more revenue for government organisations such as LC
> through an increase in taxable profits. Open licensing also allows
> community driven development, which may in some cases yield similar or
> even preferable results to well funded closed models of development.
> Also, open licensing is becoming increasingly popular among large
> for-profit entities, who may, for example, charge for associated services.
>
> p. 8, par. 2
>
> "Libraries of today need to recognize that they are but one group of
> players in a vast field, and that market conditions necessitate that
> libraries interact increasingly with the commercial sector. One example
> of such interaction can be found in the various mass digitization
> projects in which for-profit organizations are making use of library
> resources and library metadata."
>
> * It is also important to recognise new partnerships with non-profit
> organisations in this area - such as the important digitisation work
> being carried out by the Internet Archive and by The Open Library with
> members of the Open Content Alliance.
>
> p. 8, par. 5
>
> "Sharing, however, is not a strategy for LC alone. The entire library
> community and its many partners must also be part of it."
>
> * Again, by advocating liberal licensing practices on a wide scale -
> the LC could effectively encourage libraries to scale their
> bibliographic control operations by sharing their data.
>
> p. 9, par. 1
>
> "Is there duplicate effort being expended? Are there possible
> partnerships that could reduce the burden on the Library?"
>
> * Open licensing in this area would encourage relevant groups to
> improve and build on each other's work rather than doubling up effort in
> parallel development.
>
> p. 9, par. 4
>
> "In addition, the standards landscape in the library field is murky,
> with many different organizations working on similar standards in a
> non-coordinated fashion."
>
> * See comments on p. 9, par. 1, above.
>
> === Findings and Recommendations ===
>
> p. 11, sect. 1.1
>
> "The Working Group identified three primary areas of redundancy in the
> bibliographic production process:
> 1. the supply chain, wherein some data are created by publishers and
> vendors and later re-created by library catalogers;
> 2. the modification of records within the library community, wherein
> such modifications are not shared, even though they could be useful to
> others; and
> 3. the expenses that are incurred when individual libraries must
> purchase records because the sharing of those records is prohibited or
> restricted."
>
> * This whole section on increased sharing and eliminating redundancies
> is an opportune place to allude to the potential of open licensing.
>
> p. 12, sect. 1.1.1.1 & 1.1.2.1
>
> "1.1.1.1 All: Be more flexible in accepting bibliographic data from
> others (e.g., publishers, foreign libraries) that do not conform
> precisely to U.S. library standards."
>
> "1.1.2.1 All: Develop workflow and mechanisms to use data and metadata
> from network resources, such as abstracting and indexing services,
> Amazon, IMDb, etc., where those can enhance the user's experience in
> seeking and using information."
>
> * It's likely that some form of liberal licensing is requisite for
> utilising third party data (1.1.1.1) and in re-purposing existing
> metadata (1.1.2.1) on a large scale.
>
> p. 13, sect. 1.1.4
>
> "1.1.4 Re-Examine the Current Economic Model for Data Sharing in the
> Networked Environment
> 1.1.4.1 LC: Convene a representative group consisting of libraries
> (large and small), vendors, and OCLC members to address costs, barriers
> to change, and the value of potential gains arising from greater sharing
> of data, and to develop recommendations for change.
> 1.1.4.2 LC: Promote widespread discussion of barriers to sharing data.
> 1.1.4.3 LC: Reevaluate the pricing of LC's product line with a view to
> developing a business model that enables more substantial cost recovery."
>
> * We strongly suggest that the public good (or the economic notion of
> 'social welfare'), in addition to cost recovery, should be taken into
> account in the analysis of these issues, particularly given the
> trend-setting role that the report suggests LC should take in the wider
> world of bibliographic control.
>
> p. 15, sect. 1.2
>
> "Long-term dependence on Library of Congress bibliographic services
> leaves the users of those services increasingly vulnerable to any
> changes in them.
>
> Long-term reliance on Library of Congress leadership and on its
> provision of cataloging records leads libraries—even some large
> libraries with relatively plentiful staff—to think that they bear no
> responsibility, individually or collectively, for sharing substantively
> in the work of bibliographic control."
>
> * Note the same would be true if, for example, more libraries
> outsourced bibliographic work to 'closed' private contractors to replace
> core functions that had previously been fulfilled by LC. It seems that a
> stronger culture of sharing and exchanging data between libraries
> (perhaps in addition to third party contractors and contributions) is a
> more sustainable strategy that would leave libraries in a better
> position in the longer term - and able to do at least some work 'in house'.
>
> p. 16, sect. 1.2
>
> "All types of libraries will contribute to the best of their abilities
> and resources to the "public good" that comes from bibliographic control
> and resource sharing."
>
> * Again, we strongly suggest that this is factored into the kinds of
> discussions and analyses recommended in 1.1.4 (cf. comments on p. 13,
> sect. 1.1.4).
>
> p. 18, sect. 1.3
>
> "There will be increased sharing of authority data between libraries and
> between library systems and systems from other communities, with library
> authority data available to anyone working with bibliographic data.
> Economies will be realized by minimizing the number of times the same
> entity needs to be researched. Exchange of information about the same
> name from one system to another will be made simpler and more reliable.
> Access to data will be unimpeded and barriers to using data will be
> minimized."
>
> * This is another opportune moment to mention the potential benefits of
> open licensing.
>
> pp. 19-20, sect. 2 & p. 21, sect 2.4
>
> "2.4 Encourage Digitization to Allow Broader Access"
>
> * Though, as stated above, our primary interest in these comments is in
> bibliographic metadata, we also advocate making the digitised images of
> material that is in the public domain available under an open license
> where possible.
>
> pp. 21-26, sect. 3
>
> * It would be extremely valuable if LC encouraged all library records
> to have a standard metadata field that included information on the
> license of the library record itself.
>
> p. 23, sect. 3.1
>
> "Library bibliographic data will move from the closed database model to
> the open Web-based model wherein records are addressable by programs and
> are in formats that can be easily integrated into Web services and
> computer applications. This will enable libraries to make better use of
> networked data resources and to take advantage of the relationships that
> exist (or could be made to exist) among various data sources on the Web."
>
> * Open licensing could greatly help to facilitate the emergence of such
> an 'open' model.
>
> p. 28, sect. 4.1
>
> "Library bibliographic data will be used in a wide variety of
> environments, and interoperability between library and non-library
> bibliographic applications will increase/improve.
>
> Library catalogs are seen as valuable components in an interlocking
> array of discovery tools."
>
> * Again, this is a particularly opportune place to mention the
> possibility of using a liberal license.
>
> p. 31, sect. 4.3.1.2
>
> "4.3.1.2 LC: Provide LCSH openly for use by library and non-library
> stakeholders."
>
> * Ditto.
>
> pp. 33-4, sect. 5.1
>
> * We strongly advocate that the 'public good' be taken into account
> while building an evidence base. (Cf. p. 13, sect. 1.1.4)
>
> == References ==
>
> All page numbers refer to the Draft Final Report of the Working Group
> <http://www.loc.gov/bibliographic-future/news/lcwg-report-draft-11-30-07-final.pdf>.
>
> [1] Letter from the Working Group – November 30, 2007
> <http://www.loc.gov/bibliographic-future/news/lcwg-report-memo-11-30-07.pdf>
>
> [2] According to the Directory of Open Access Journals
> <http://www.doaj.org/> there are now just under 3000 Open Access
> journals with over 160,000 articles. See Open Access News
> <http://www.earlham.edu/~peters/fos/fosblog.html> for more on open
> projects in scholarly publishing and research. OER (Open Educational
> Resources) Commons <http://www.oercommons.org/> is a major portal for
> open course content. Science Commons <http://sciencecommons.org/> is a
> significant proponent of open licensing for scientific research data.
>
> [3] Creative Commons and Talis both maintain open licenses such as the
> Creative Commons Attribution license
> <http://creativecommons.org/licenses/by/2.0/> and the Open Database
> License
> <http://www.opencontentlawyer.com/open-data/open-database-licence/>.
> Another frequently used open license is the GFDL, which Wikipedia's
> content is licensed under. For a more comprehensive list see
> <http://www.opendefinition.org/licenses>.
>
> [4] The Open Library is a prominent project that is currently
> experimenting with versioning in bibliographic data
> <http://www.openlibrary.org/>.
>
> [5] To give an example, many developers are exploring different uses of
> the open-source suite of tools from MIT's Simile project
> <http://simile.mit.edu/>, which allows large datasets to be represented
> on a timeline.
>
> [6] The W3C Community Project 'Linking Open Data'
> <http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData>,
> which includes Tim Berners-Lee, is currently pioneering work in this area.
>
>
> Rufus Pollock wrote:
>> Jonathan Gray wrote:
>>> Hi all,
>>>
>>> The Library of Congress has asked for comments on a draft produced by
>>> a Working Group they initiated on the 'Future of Bibliographic Control'.
>>>
>>> As some of you may have seen I recently blogged about this:
>>>
>>> http://blog.okfn.org/2007/12/06/the-future-of-bibliographic-control-and-licensing-policies-for-bibliographic-data/
>>>
>>>
>>> The deadline for public comments is 15th December. I think it would
>>> be great if we could submit some brief notes on the potential
>>> benefits of openly licensing bibliographic data!
>> We should definitely draft something. Would you be happy to put
>> something together and then post it to the list (or on the wiki with a
>> link for the list)?
>>
>> Looking through the PR from the LC in your mail (not included here)
>> some potential points to make would be:
>>
>> * Best way to achieve sharing and deliver value for a publicly funded
>> ORG such as LC is to make biblio metadata *openly* available. Why?
>> * More bugfixing, possibilities for 'wiki-like' management of data
>> etc => better quality data
>> * Allows for possibility of distributed data provision and access
>> (reducing load, reducing latency, risk of downtime etc etc)
>> * Better access both in terms of multiple forms/formats and others
>> designing a better interface (see comments in a recent blog post [1])
>> * Possibilities for reuse and recombination with other data sources
>> * Must emphasize we are not talking about their content at this point
>> just the *metadata*
>> * Might also want to point out that for content in which there are no
>> rights problems (i.e. public domain) should make that stuff openly
>> available for same reason.
>> * Overall: for publicly funded bodies open approaches maximize social
>> welfare!
>>
>> [1]:<http://blog.okfn.org/2007/10/31/british-history-online-why-the-restrictions/>
>>
>>
>>
>>> Does anyone know if any groups or individuals have already submitted
>>> comments along these lines?
>> No-one to my knowledge but that might not be saying much ...
>>
>>> Can anyone think of any organisations/individuals who might be
>>> interested in helping out with this?
>> Should obviously contact archive.org/openlibrary. Paul from Talis has
>> already posted so it looks like something is happening there. Might
>> want to also try contacting CC (perhaps Jon Phillips) though the time
>> constraints might be a little tight.
>>
>> ~rufus
>
>
> _______________________________________________
> okfn-discuss mailing list
> okfn-discuss at lists.okfn.org
> http://lists.okfn.org/cgi-bin/mailman/listinfo/okfn-discuss