[openbiblio-dev] [bibserver] changing handling of collections from frontend to get them working properly again now that collections are stored as separate objects (937d3e2)

Jim Pitman pitman at stat.Berkeley.EDU
Sun Oct 9 14:18:41 UTC 2011

> On Sun, 9 Oct 2011 11:31:45 +0100, Rufus Pollock <rufus.pollock at okfn.org> said:
>     rufus> Why? Why would collections belong to another collection?
>     rufus> (I'm not saying this is a bad idea just I'd like to see the
>     rufus> underlying user story for this ...)

William Waites <ww at styx.org> wrote:
> I don't know Mark's motivation, but I think that journals,
> anthologies, etc are kinds of collections. Since they can themselves
> be part of collections it makes sense. 

Strong agreement from me. 

> Whether this is a special case two-level thing, or we allow users to build matrioshka dolls is another question.

Collections MUST be allowed to have more than two levels. The most obvious example is journal articles, which sit in a
publisher/journal/volume/issue hierarchy.

Especially if the publisher is a small professional society, the publisher-level collection is very important: it establishes
the identity of the society, and is a collection the society might be proud to host on its own website in association with BibSoup.
Such collections must be respected, both to accommodate the needs of publishers for recognition and to engage them in contributing
data to BibSoup.
Each level in the publisher/journal/volume/issue hierarchy demands a metadata record, with attributes depending on the level in the hierarchy.
Each such metadata record should be required either to contain a list of at least the identifiers and human-readable titles of its children, or
to provide a pointer to such a list as a BibJSON dataset, or both.
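
A minimal sketch of that requirement in Python, for illustration only. The field names ("level", "children", "children_url") and the example identifiers and URL are assumptions made up for this sketch; they are not defined by BibJSON or BibSoup.

```python
from typing import Any, Dict, List

# Hypothetical level names and field names; not part of any BibJSON spec.
LEVELS = ["publisher", "journal", "volume", "issue"]


def check_collection_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if record.get("level") not in LEVELS:
        problems.append("unknown level: %r" % record.get("level"))
    children = record.get("children")      # inline list of child stubs
    pointer = record.get("children_url")   # pointer to a dataset listing them
    if children is None and not pointer:
        problems.append("needs a 'children' list, a 'children_url', or both")
    for child in children or []:
        # Each child stub must carry at least an identifier and a title.
        if not child.get("id") or not child.get("title"):
            problems.append("child lacks id or title: %r" % (child,))
    return problems


# A journal that lists its volumes inline.
journal = {
    "id": "example-journal",
    "level": "journal",
    "title": "Example Journal",
    "children": [
        {"id": "example-journal/v1", "title": "Volume 1"},
        {"id": "example-journal/v2", "title": "Volume 2"},
    ],
}

# A publisher that only points to a separate dataset of its journals.
publisher = {
    "id": "example-society",
    "level": "publisher",
    "title": "Example Society",
    "children_url": "http://example.org/example-society/journals.json",
}

# A record that provides neither form of child listing.
bad = {"id": "orphan", "level": "issue", "title": "Issue with no child info"}

for rec in (journal, publisher, bad):
    print(rec["id"], check_collection_record(rec))
```

Allowing either an inline list or a pointer keeps small records self-contained while letting a large collection refer to a separate dataset.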

OAI-ORE (http://www.openarchives.org/ore/) defines standards for the description and exchange of aggregations of Web resources.
This is too heavy for the needs of BibSoup, but it is something we must accommodate, at least to the extent that it provides useful
input of collections metadata to BibSoup. Providing OAI-ORE compliant export from BibSoup is something we might be able to get grant support for,
but not something we should attempt without additional funding.
I am not sure how much uptake of OAI-ORE there has been, but I think some major nested collections, e.g. part or
all of JSTOR, have been mapped to it.  This should be investigated further.

JSTOR provides a major example of a secondary collection which cross-cuts primary publishers, and for each journal may only contain
parts of the journal, but typically contains whole volumes and issues.

Whatever standard BibSoup adopts for collections MUST be able to accommodate the structure of these existing large, high-quality nested collections.

We won't immediately be able to drop all of the JSTOR metadata into BibSoup, but I know I can do this for parts of JSTOR, especially the
metadata of particular publishers like IMS, Bernoulli, and some others where I have connections.  I am starting to work on IMS data,
and this could be available within weeks as publisher/journal/volume/issue metadata for upload to BibSoup.  This would provide a good test of collections capability.
See http://imstat.org/publications/ for the top level of the hierarchy with the list of journals. Note also the further structure of
    IMS Journals and Publications
    IMS Co-sponsored Journals and Publications
    IMS Supported Journals
    IMS Affiliated Journals
each of which defines a collection. These vary in their integrity and in the level of interest in supporting them as a collection in BibSoup.
But there should be no technical obstacle to providing such support.
Other collections I am aware of: departmental collections, which naturally split by type (e.g. techreport, book, thesis, article, ...) and by author.
Collections are not always nested. The collection of all works of an author (or even all such works known to some source or collector)
is an important case which cross-cuts all the other collections mentioned above.
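
One way to picture nested and cross-cutting collections side by side is to keep the nesting as a parent link between collections and treat membership as many-to-many. This is only a sketch under that assumption; all identifiers below are hypothetical and nothing here reflects how BibServer actually stores collections.

```python
from collections import defaultdict

# Hypothetical identifiers, for illustration only.
# Nested hierarchy: each collection has at most one parent collection.
parent_of = {
    "issue:1": "volume:1",
    "volume:1": "journal:J",
    "journal:J": "publisher:P",
}

# Many-to-many membership: collection id -> ids of records placed directly in it.
members = defaultdict(set)


def add(record_id, *collection_ids):
    """Place one record directly into any number of collections."""
    for cid in collection_ids:
        members[cid].add(record_id)


def collections_of(record_id):
    """All collections containing the record, directly or via nesting."""
    found = set()
    for cid, records in members.items():
        if record_id in records:
            # Walk up the parent chain from the direct collection.
            while cid is not None and cid not in found:
                found.add(cid)
                cid = parent_of.get(cid)
    return found


# An article sits in the nested hierarchy AND in an author collection.
add("article:a", "issue:1", "author:X")
# A tech report belongs to an author and a department, with no nesting at all.
add("techreport:t", "author:X", "dept:D")

print(sorted(collections_of("article:a")))
print(sorted(collections_of("techreport:t")))
```

In this sketch the author collection needs no place in the publisher hierarchy: it simply holds records that also belong elsewhere.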

