[openbiblio-dev] [bibserver] changing handling of collections from frontend to get them working properly again now that collections are stored as separate objects (937d3e2)

Sun Oct 9 15:10:15 UTC 2011

On 9 October 2011 15:18, Jim Pitman <pitman at stat.berkeley.edu> wrote:
>
>> On Sun, 9 Oct 2011 11:31:45 +0100, Rufus Pollock <rufus.pollock at okfn.org> said:
>>
>>     rufus> Why? Why would collections belong to another collection?
>>     rufus> (I'm not saying this is a bad idea just I'd like to see the
>>     rufus> underlying user story for this ...)
>
> William Waites <ww at styx.org> wrote:
>> I don't know Mark's motivation, but I think that journals,
>> anthologies, etc are kinds of collections. Since they can themselves
>> be part of collections it makes sense.
>
> Strong agreement from me.
>
>> Whether this is a special case two-level thing, or we allow users to build matryoshka dolls is another question.
>
> Collections MUST be allowed to have more than two levels. The most obvious example is for journal articles

I wonder if we are misusing "collection" here. For me a collection is
just like a bibliography, or even more simply, a bunch of works /
records I want to collect together.

Collection in the sense of "all issues of this journal or all articles
in this journal", while it could be represented as a collection, might
better be represented in its own right.

> publisher/journal/volume/issue
>
> Especially if the publisher is a small professional society, the publisher level collection is very important: it establishes
> the identity of the society, and is a collection the society might be proud to host on its own website in association with BibSoup.
> Respect for such collections must be provided to accommodate the needs of publishers for recognition, and to engage them in contributing
> data to BibSoup.
> Each level in the publisher/journal/volume/issue hierarchy demands a metadata record, with attributes depending on level in the hierarchy.  It should be
> required that each such metadata record either contains the list of at least identifiers and human-readable titles of its children, or
> provides a pointer to such a list as a BibJSON dataset, or both.
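
A minimal sketch of the rule described above: one metadata record per
level, each either listing its children (identifier plus title) or
pointing to such a list. The class names, field names (`id`, `title`,
`level`, `children`, `children_url`) and example values are assumptions
for illustration only, not an agreed BibJSON schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Levels of the hierarchy named in the thread.
LEVELS = ["publisher", "journal", "volume", "issue"]


@dataclass
class ChildRef:
    """Identifier and human-readable title of one child (hypothetical shape)."""
    id: str
    title: str


@dataclass
class CollectionRecord:
    """Metadata record for one level of the hierarchy (hypothetical shape)."""
    id: str
    title: str
    level: str
    children: List[ChildRef] = field(default_factory=list)
    # Assumed pointer to a BibJSON dataset that lists the children.
    children_url: Optional[str] = None

    def validate(self) -> None:
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")
        # The record must carry the child list, a pointer to it, or both.
        if not self.children and self.children_url is None:
            raise ValueError("record must list its children or point to a list")


journal = CollectionRecord(
    id="example-journal",
    title="Example Journal",
    level="journal",
    children=[ChildRef(id="example-journal/1", title="Volume 1")],
)
journal.validate()
```

Level-specific attributes (for example an ISSN on a journal, or a date
on an issue) would be added per level; they are left out here.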

Right, this sounds pretty complex :-) I strongly suggest sitting down
and writing out in detail the user stories here using our existing
spreadsheet (or creating a new document). Doing user stories would
also focus us on what people actually want to do rather than on the
details of the modelling (which would come out of the user stories
rather than the other way round).

> OAI-ORE http://www.openarchives.org/ore/ defines standards for the description and exchange of aggregations of Web resources.
> This is too heavy for the needs of BibSoup, but something we must accommodate at least to the extent that it provides useful
> input of collections metadata to BibSoup. Providing OAI-ORE compliant export from BibSoup is something we might be able to get grant support for,
> but not something we should attempt without additional funding.
> I am not sure how much uptake of OAI-ORE there has been, but I think there are some major nested collections e.g. part or
> all of JSTOR which have been mapped to it.  This should be further investigated.

Are you volunteering to do this research? :-)

> JSTOR provides a major example of a secondary collection which cross-cuts primary publishers, and for each journal may only contain
> parts of the journal, but typically contains whole volumes and issues.
>
> Whatever standard BibSoup adopts for collections MUST be able to accommodate the structure of these existing large high quality nested collections.

Again we need user stories and use cases here with sufficient detail.

> We won't immediately be able to drop all of JSTOR metadata into BibSoup, but I know I can do this for parts of JSTOR, especially the
> metadata of particular publishers like IMS, Bernoulli, and some others where I have connections.  I am starting to work on IMS data,
> and this could be available within weeks as publisher/journal/volume/issue metadata for upload to BibSoup.  This would provide a good test of collections capability.

Great.

> see
> http://imstat.org/publications/ for the top level of the hierarchy with the list of journals. Note also the further structure of
> IMS Journals and Publications
> IMS Co-sponsored Journals and Publications
> IMS Supported Journals
> IMS Affiliated Journals
> each of which defines a collection. These vary in their integrity and the level of interest for supporting them as a collection in BibSoup.
> But there should be no technical obstacle to providing such support.
> Other collections I am aware of: Departmental collections, which naturally split by type (e.g. techreport, book, thesis, article, ...) and author.
> Collections are not always nested. The collection of all works of an author (or even all such works known to some source or collector)
> is an important case which cross cuts all the other collections mentioned above.
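
To make the non-nested case concrete, here is a rough sketch in which
a collection is just a set of record identifiers, so one record can
belong to an issue and to an author's collected works at the same
time. `CollectionIndex` and all identifiers below are hypothetical and
not taken from the bibserver code.

```python
from collections import defaultdict
from typing import Dict, Set


class CollectionIndex:
    """Maps collection ids to member record ids; a record may sit in many collections."""

    def __init__(self) -> None:
        self._members: Dict[str, Set[str]] = defaultdict(set)

    def add(self, collection_id: str, record_id: str) -> None:
        self._members[collection_id].add(record_id)

    def members(self, collection_id: str) -> Set[str]:
        # Return a copy so callers cannot mutate the index.
        return set(self._members.get(collection_id, set()))

    def collections_of(self, record_id: str) -> Set[str]:
        return {cid for cid, recs in self._members.items() if record_id in recs}


index = CollectionIndex()
# Hypothetical identifiers: the same article is in an issue and an author collection.
index.add("issue:example-journal/1/2", "article-42")
index.add("author:a-hypothetical-author", "article-42")
index.add("dept:techreports", "report-7")
```

Here `index.collections_of("article-42")` returns both the issue and
the author collection, with neither nested inside the other.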

I really think we want to focus on the user stories first and then
decide whether one concept / domain object (e.g. "Collection") is
sufficient for the domain we are trying to cover.

Rufus