[okfn-discuss] Submitting comments to the Library of Congress?

Jonathan Gray jonathan.gray at okfn.org
Wed Dec 12 05:59:59 UTC 2007


After meeting with Aaron Swartz of The Open Library, we thought having a 
shorter version might be good to get more signatories before Friday.

Aaron's prepared something here:


Again - comments appreciated.

Do people think it should be more/less strong?



Jonathan Gray wrote:
> Hi all,
> Below is a (rather long) draft response to the LoC draft report:
>   http://www.okfn.org/wiki/FutureOfBibliographicControl
> Any comments would be much appreciated!
> (It's also inline below.)
> Regards,
> Jonathan
> = Open bibliographic data? - comments on draft report by the Working 
> Group on the Future of Bibliographic Control at the Library of Congress =
> 14th December 2007
> '''Rufus Pollock, The Open Knowledge Foundation''' [[BR]]
> '''Jonathan Gray, The Open Knowledge Foundation''' [[BR]]
> '''Peter Suber, The Scholarly Publishing and Academic Resources 
> Coalition''' [[BR]]
> '''Aaron Swartz, The Open Library'''[[BR]]
> == Introduction ==
> This document is a response to the call for comments on a draft released 
> by the Working Group on the Future of Bibliographic Control on 30th 
> November 2007 [1].
> We think it is laudable that the Working Group have recommended that the 
> Library of Congress take a more active role in leading the library world 
> into the 21st century. Their vision of a bibliographic control ecosystem 
> which is "collaborative, decentralized, international in scope and 
> web-based" (p. 1) is timely.
> However, we are concerned that there is no explicit mention of the 
> potential benefits of open licensing for bibliographic data. Over the 
> past few years, open licensing has facilitated the explosive growth of a 
> 'knowledge commons'. To give a few prominent examples: Open Access 
> journals, Open Educational Resources and Open Data in scientific 
> research [2] have all been enabled by licenses which permit material to 
> be freely re-used and re-distributed [3].
> We believe open licensing would strongly help to catalyse the 
> flourishing of an information ecology for bibliographic data - by 
> allowing and encouraging anyone to share, modify and build on it. Openly 
> licensed bibliographic data would allow users and developers to:
>  * improve the quality of the data by correcting errors, and adding 
> ancillary information;
>  * attempt to harmonise and integrate data that is from multiple 
> sources, in different formats and which adheres to different sets of 
> standards;
>  * use technologies such as wikis and versioning systems to facilitate 
> the collaborative development of data [4];
>  * host bibliographic data and experiment with distributed data 
> provision and access;
>  * combine bibliographic datasets with other material - such as 
> user-contributed reviews, images and 'tags';
>  * build innovative (web) applications to explore and represent the 
> wealth of information contained in bibliographic records, e.g. through 
> datamining and/or visualization technologies [5];
>  * extract structured, machine-readable data from bibliographic records 
> and to link this to other open datasets in the emerging semantic web of 
> data [6].
> New kinds of technologies are emerging very rapidly - and we think that 
> one of the best ways for the library community to see the fruits of 
> these developments applied to bibliographic data is to permit greater 
> experimentation with the data by the wider technical community - and the 
> general public. Placing restrictions on how bibliographic records may be 
> re-used effectively inhibits community-led development and innovative 
> 'tinkering'. One of the implicit principles of more 'open' models of 
> development is that 'the most interesting thing to be done with your 
> material will be thought of by someone else'. This kind of thought 
> resonates strongly with the "decentralised", "dynamic", "collaborative" 
> ethos propagated in the report, in which users and third party 
> organisations are encouraged to play a more active role in bibliographic 
> control.
> == Summary of key comments ==
>  * The potential benefits of open licensing should be mentioned in the 
> draft. We've identified several places where such mention may be 
> appropriate.
>  * The draft should strive to acknowledge a broad spectrum of parties 
> who may contribute to an ecosystem of bibliographic control, and who 
> benefit from shared bibliographic data - including individual technical 
> developers, enthusiasts and a diverse variety of third party 
> organisations - rather than simply either libraries, library users and 
> commercial contractors. (Cf. comments on p. 1, par. 1)
>  * Open licensing can help to lower or remove transaction costs. (Cf. 
> comments on p. 1, par. 1)
>  * We urge that even if value-added data products or services are sold 
> in order to recover costs, openly licensing 'raw' bibliographic data is 
> still considered. (Cf. comments on p. 4, par. 3)
>  * The LC should take into account short and long term opportunities to create 
> 'public value' as well as opportunities for market growth when 
> considering making alterations to its pricing structure. (Cf. comments 
> on p. 8, par. 1; p. 13, sect. 1.1.4)
>  * The report should explicitly acknowledge significant work by 
> non-profit organisations in the areas of digitisation and bibliographic 
> control as well as contributions of commercial vendors. (Cf. comments on 
> p. 8, par. 2)
>  * The Library of Congress should take a leading role in encouraging 
> bibliographic data to be shared - encouraging other individual libraries 
> to make their data available under an open license where possible. (Cf. 
> comments on p. 8, par. 5)
>  * Open bibliographic data would encourage relevant groups to improve 
> and build on each other's work rather than doubling up effort in 
> parallel development. (Cf. comments on p. 9, par. 1)
>  * A strong culture of sharing bibliographic information may help 
> libraries not become over-dependent on third party contractors to 
> replace work currently done by the Library of Congress. (Cf. comments on p. 
> 15, sect. 1.2)
>  * The products of digitizing material that is in the public domain 
> should be made available under an open license where possible. (Cf. 
> comments on pp. 19-20, sect. 2; p. 21, sect 2.4)
>  * The Library of Congress should implement changes in metadata 
> standards such that there is a field within each bibliographic record 
> to specify the license the record is available under. (Cf. comments on 
> pp. 21-26, sect. 3.)
> == Comments on the Draft Report ==
> N.B. We take 'bibliographic data' to refer to metadata concerning 
> library holdings - primarily in the form of bibliographic records.
> === Introduction ===
> p. 1, par. 1
> "Its realization will occur in cooperation with the private sector, and 
> with the active collaboration of library users."
>  * The implied distinction - between formal cooperation with the private 
> sector and input from ordinary library users - may become increasingly 
> blurred. We think it would be valuable to recognise that there is a 
> broad spectrum of potential collaborators ranging 
> between these two poles - including individual technical developers and 
> smaller groups who might wish to re-use or add value to bibliographic 
> data without necessarily, e.g., contracting with the relevant producer.
> "Data will be gathered from multiple sources; change will happen 
> quickly; and bibliographic control will be dynamic, not static."
>  * Open licensing would help to ensure that bibliographic control is 
> dynamic and that change happens quickly by eradicating the requirement 
> that every user asks permission from every data producer for each new 
> application of bibliographic information.
> "Libraries must continue the transition to this future without delay in 
> order to retain their relevance as information providers."
>  * As mentioned above, openly licensing bibliographic material would 
> help to accelerate this transition by allowing third parties to 
> experiment with innovative ways of re-using it and building on it - 
> including the development of new kinds of applications, services, 
> plugins, and so on.
> === Background ===
> p. 4, par. 3
> "According to current congressional regulations, LC is permitted to 
> recover only direct costs for services provided to others. As a result, 
> the fees that the Library charges do not cover the most expensive aspect 
> of cataloging: namely, the cost of the intellectual work. . The 
> economics of creating LC's products have changed dramatically since the 
> time when the Library was producing cards for library catalogs. It is 
> now time to reevaluate the pricing of LC's product line in order to 
> develop a business model that allows LC to more substantially recoup its 
> actual costs."
>  * Reevaluating product pricing is arguably one way among several 
> towards cost recovery. Also, while the LC might recoup costs through 
> revenue generated through value-added products and services - we hope 
> this does not preclude any effort to encourage the circulation of its 
> raw data.
> === Guiding Principles ===
> p. 7, par. 3
> "Different communities of bibliographic practice have grown up around 
> different resource types: library collections of books and journals, 
> archives, journal articles, and museum objects and images. As these 
> resources and others become increasingly accessible through the Web, 
> separation of the communities of practice that manage them is no longer 
> desirable, sustainable, or functional. Bibliographic control is 
> increasingly a matter of managing relationships—among works, names, 
> concepts, and object descriptions—across communities. Consistency of 
> description within any single environment, such as the library catalog, 
> is becoming less significant than the ability to make connections 
> between environments: Amazon to WorldCat to Google to PubMed to 
> Wikipedia, with library holdings serving as but one node in this web of 
> connectivity. In today's environment, bibliographic control cannot 
> continue to be seen as limited to library catalogs."
>  * Again, open licensing could be mentioned here, given this projected 
> decentralisation and the importance of widespread collaboration among 
> many different parties.
> p. 8, par. 1
> "Once considered a public good, information access is today a commodity 
> in a rapidly-growing marketplace. Many information resources formerly 
> managed in the not-for-profit sector are now the objects of a 
> significant for-profit economy. Entities in this latter economy have 
> financial capabilities far beyond those of libraries. Further, they have 
> the resources to engage in large scale research and development."
>  * We think it's crucial to strike a balance here between 
> encouraging public benefit and market growth. Openly licensed 
> bibliographic data would allow the general public to benefit from being 
> able to freely re-use and re-distribute it, as well as commercial 
> organisations to benefit from being able to re-use it in their products 
> and services. Increased commercial exploitation would also arguably 
> indirectly generate more revenue for government organisations such as LC 
> through an increase in taxable profits. Open licensing also allows 
> community driven development, which may in some cases yield similar or 
> even preferable results to well funded closed models of development. 
> Open licensing is also becoming increasingly popular among large 
> for-profit entities, who may, for example, charge for associated services.
> p. 8, par. 2
> "Libraries of today need to recognize that they are but one group of 
> players in a vast field, and that market conditions necessitate that 
> libraries interact increasingly with the commercial sector. One example 
> of such interaction can be found in the various mass digitization 
> projects in which for-profit organizations are making use of library 
> resources and library metadata."
>  * It is also important to recognise new partnerships with non-profit 
> organisations in this area - such as the important digitisation work 
> being carried out by the Internet Archive and by The Open Library with 
> members of the Open Content Alliance.
> p. 8, par. 5
> "Sharing, however, is not a strategy for LC alone. The entire library 
> community and its many partners must also be part of it."
>  * Again, by advocating liberal licensing practices on a wide scale - 
> the LC could effectively encourage libraries to scale their 
> bibliographic control operations by sharing their data.
> p. 9, par. 1
> "Is there duplicate effort being expended? Are there possible 
> partnerships that could reduce the burden on the Library?"
>  * Open licensing in this area would encourage relevant groups to 
> improve and build on each other's work rather than doubling up effort in 
> parallel development.
> p. 9, par. 4
> "In addition, the standards landscape in the library field is murky, 
> with many different organizations working on similar standards in a 
> non-coordinated fashion."
>  * See comments on p. 9, par. 1, above.
> === Findings and Recommendations ===
> p. 11, sect. 1.1
> "The Working Group identified three primary areas of redundancy in the 
> bibliographic production process:
>   1. the supply chain, wherein some data are created by publishers and 
> vendors and later re-created by library catalogers;
>   2. the modification of records within the library community, wherein 
> such modifications are not shared, even though they could be useful to 
> others; and
>   3. the expenses that are incurred when individual libraries must 
> purchase records because the sharing of those records is prohibited or 
> restricted."
>  * This whole section on increased sharing and eliminating redundancies 
> is an opportune place to allude to the potential of open licensing.
> p. 12, sect. &
> " All: Be more flexible in accepting bibliographic data from 
> others (e.g., publishers, foreign libraries) that do not conform 
> precisely to U.S. library standards."
> " All: Develop workflow and mechanisms to use data and metadata 
> from network resources, such as abstracting and indexing services, 
> Amazon, IMDb, etc., where those can enhance the user's experience in 
> seeking and using information."
>  * It's likely that some form of liberal licensing is requisite for 
> utilising third party data ( and in re-purposing existing 
> metadata ( on a large scale.
> p. 13, sect. 1.1.4
> "1.1.4 Re-Examine the Current Economic Model for Data Sharing in the 
> Networked Environment
> LC: Convene a representative group consisting of libraries 
> (large and small), vendors, and OCLC members to address costs, barriers 
> to change, and the value of potential gains arising from greater sharing 
> of data, and to develop recommendations for change.
> LC: Promote widespread discussion of barriers to sharing data.
> LC: Reevaluate the pricing of LC's product line with a view to 
> developing a business model that enables more substantial cost recovery."
>  * We strongly suggest that the public good (or the economic notion of 
> 'social welfare'), in addition to cost recovery, should be taken into 
> account in the analysis of these issues, particularly given the 
> trend-setting role it is suggested that LC take in the wider world of 
> bibliographic control.
> p. 15, sect. 1.2
> "Long-term dependence on Library of Congress bibliographic services 
> leaves the users of those services increasingly vulnerable to any 
> changes in them.
> Long-term reliance on Library of Congress leadership and on its 
> provision of cataloging records leads libraries—even some large 
> libraries with relatively plentiful staff—to think that they bear no 
> responsibility, individually or collectively, for sharing substantively 
> in the work of
> bibliographic control."
>  * Note the same would be true if, for example, more libraries 
> outsourced bibliographic work to 'closed' private contractors to replace 
> core functions that had previously been fulfilled by LC. It seems that a 
> stronger culture of sharing and exchanging data between libraries 
> (perhaps in addition to third party contractors and contributions) is a 
> more sustainable strategy that would leave libraries in a better 
> position in the longer term - and able to do at least some work 'in house'.
> p. 16, sect. 1.2
> "All types of libraries will contribute to the best of their abilities 
> and resources to the "public good" that comes from bibliographic control 
> and resource sharing."
>  * Again, we strongly suggest that this is factored into the kinds of 
> discussions and analyses recommended in 1.1.4 (cf. comments on p. 13, 
> sect. 1.1.4).
> p. 18, sect. 1.3
> "There will be increased sharing of authority data between libraries and 
> between library systems and systems from other communities, with library 
> authority data available to anyone working with bibliographic data. 
> Economies will be realized by minimizing the number of times the same 
> entity needs to be researched. Exchange of information about the same 
> name from one system to another will be made simpler and more reliable. 
> Access to data will be unimpeded and barriers to using data will be 
> minimized."
>  * This is another opportune moment to mention the potential benefits of 
> open licensing.
> pp. 19-20, sect. 2 & p. 21, sect 2.4
> "2.4 Encourage Digitization to Allow Broader Access"
>  * Though, as stated above, our primary interest in these comments is in 
> bibliographic metadata, we also advocate making the digitised images of 
> material that is in the public domain available under an open license 
> where possible.
> pp. 21-26, sect. 3
>  * It would be extremely valuable if LC encouraged all library records 
> to have a standard metadata field that included information on the 
> license of the library record itself.
> p. 23, sect. 3.1
> "Library bibliographic data will move from the closed database model to 
> the open Web-based model wherein records are addressable by programs and 
> are in formats that can be easily integrated into Web services and 
> computer applications. This will enable libraries to make better use of 
> networked data resources and to take advantage of the relationships that 
> exist (or could be made to exist) among various data sources on the Web."
>  * Open licensing could greatly help to facilitate the emergence of such 
> an 'open' model.
> p. 28, sect. 4.1
> "Library bibliographic data will be used in a wide variety of 
> environments, and interoperability between library and non-library 
> bibliographic applications will increase/improve.
> Library catalogs are seen as valuable components in an interlocking 
> array of discovery tools."
>  * Again, this is a particularly opportune place to mention the 
> possibility of using a liberal license.
> p. 31, sect.
> " LC: Provide LCSH openly for use by library and non-library 
> stakeholders."
>   * Ditto.
> pp. 33-4, sect. 5.1
>  * We strongly advocate that the 'public good' be taken into account 
> while building an evidence base. (Cf. p. 13, sect. 1.1.4)
> == References ==
> All page numbers refer to the Draft Final Report of the Working Group 
> <http://www.loc.gov/bibliographic-future/news/lcwg-report-draft-11-30-07-final.pdf>.
> [1] Letter from the Working Group – November 30, 2007 
> <http://www.loc.gov/bibliographic-future/news/lcwg-report-memo-11-30-07.pdf>
> [2] According to the Directory of Open Access Journals 
> <http://www.doaj.org/> there are now just under 3000 Open Access 
> journals with over 160,000 articles. See Open Access News 
> <http://www.earlham.edu/~peters/fos/fosblog.html> for more on open 
> projects in scholarly publishing and research. OER (Open Educational 
> Resources) Commons <http://www.oercommons.org/> is a major portal for 
> open course content. Science Commons <http://sciencecommons.org/> is a 
> significant proponent of open licensing for scientific research data.
> [3] Creative Commons and Talis both maintain open licenses such as the 
> Creative Commons Attribution license 
> <http://creativecommons.org/licenses/by/2.0/> and the Open Database 
> License 
> <http://www.opencontentlawyer.com/open-data/open-database-licence/>. 
> Another frequently used open license is the GFDL, which Wikipedia's 
> content is licensed under. For a more comprehensive list see 
> <http://www.opendefinition.org/licenses>.
> [4] The Open Library is a prominent project that is currently 
> experimenting with versioning in bibliographic data 
> <http://www.openlibrary.org/>.
> [5] To give an example, many developers are exploring different uses of 
> the open-source suite of tools from MIT's Simile project 
> <http://simile.mit.edu/>, which allows large datasets to be represented 
> on a timeline.
> [6] The W3C Community Project 'Linking Open Data' 
> <http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData>, 
> which includes Tim Berners-Lee, is currently pioneering work in this area.
> Rufus Pollock wrote:
>> Jonathan Gray wrote:
>>> Hi all,
>>> The Library of Congress has asked for comments on a draft produced by 
>>> a Working Group they initiated on the 'Future of Bibliographic Control'.
>>> As some of you may have seen I recently blogged about this:
>>> http://blog.okfn.org/2007/12/06/the-future-of-bibliographic-control-and-licensing-policies-for-bibliographic-data/ 
>>> The deadline for public comments is 15th December. I think it would 
>>> be great if we could submit some brief notes on the potential 
>>> benefits of openly licensing bibliographic data!
>> We should definitely draft something. Would you be happy to put 
>> something together and then post it to the list (or on the wiki with a 
>> link for the list)?
>> Looking through the PR from the LC in your mail (not included here) 
>> some  potential points to make would be:
>> * Best way to achieve sharing and deliver value for a publicly funded 
>> ORG such as LC is to make biblio metadata *openly* available. Why?
>>   * More bugfixing, possibilities for 'wiki-like' management of data 
>> etc => better quality data
>>   * Allows for possibility of distributed data provision and access 
>> (reducing load, reducing latency, risk of downtime etc etc)
>>   * Better access both in terms of multiple forms/formats and others 
>> designing a better interface (see comments in a recent blog post [1])
>>   * Possibilities for reuse and recombination with other data sources
>> * Must emphasize we are not talking about their content at this point 
>> just the *metadata*
>> * Might also want to point out that for content in which there are no 
>> rights problems (i.e. public domain) should make that stuff openly 
>> available for same reason.
>> * Overall: for publicly funded bodies open approaches maximize social 
>> welfare!
>> [1]:<http://blog.okfn.org/2007/10/31/british-history-online-why-the-restrictions/> 
>>> Does anyone know if any groups or individuals have already submitted 
>>> comments along these lines?
>> No-one to my knowledge but that might not be saying much ...
>>> Can anyone think of any organisations/individuals who might be 
>>> interested in helping out with this?
>> Should obviously contact archive.org/openlibrary. Paul from Talis has 
>> already posted so it looks like something is happening there. Might 
>> want to also try contacting CC (perhaps Jon Phillips) though the time 
>> constraints might be a little tight.
>> ~rufus
> _______________________________________________
> okfn-discuss mailing list
> okfn-discuss at lists.okfn.org
> http://lists.okfn.org/cgi-bin/mailman/listinfo/okfn-discuss
