[openbiblio-dev] [bibserver] changing handling of collections from frontend to get them working properly again now that collections are stored as separate objects (937d3e2)
kcoyle at kcoyle.net
Sun Oct 9 15:49:28 UTC 2011
Quoting Rufus Pollock <rufus.pollock at okfn.org>:
> I wonder if we are misusing "collection" here. For me a collection is
> just like a bibliography, or even more simply, a bunch of works /
> records I want to collect together.
> Collection in the sense of "all issues of this journal or all articles
> in this journal", while they could be represented as collections, might
> better be represented in their own right.
I, too, am somewhat flummoxed by the emphasis on collections here. If a
collection is just any group of metadata from a single source, then
it's not a terribly meaningful or useful grouping. If a collection is
a set of metadata that has been chosen for some purpose, then it is
closer to my concept of collection. In the end, however, there will be
metadata records for resources, and those records may be found in
multiple collections. Will the collections be "undone" in the
database, or will the records retain their existence in a collection?
How will the database handle items that are in multiple collections?
And, as Rufus asks, what is the use of collections to users?
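To make the multiple-collections question concrete, here is a minimal sketch of one possible answer: records live on their own, and a collection is only a named set of record identifiers. Everything in it (the `Store` class, its method names, the sample identifiers) is hypothetical and for illustration only; it is not how BibServer actually stores collections.

```python
from collections import defaultdict


class Store:
    """Hypothetical store: records exist independently of collections."""

    def __init__(self):
        self.records = {}                 # record id -> metadata dict
        self.members = defaultdict(set)   # collection name -> set of record ids

    def add_record(self, record_id, metadata):
        self.records[record_id] = metadata

    def add_to_collection(self, collection, record_id):
        # A record must exist before it can be collected.
        if record_id not in self.records:
            raise KeyError(record_id)
        self.members[collection].add(record_id)

    def collections_of(self, record_id):
        # A record may belong to any number of collections.
        return sorted(name for name, ids in self.members.items()
                      if record_id in ids)

    def drop_collection(self, collection):
        # "Undoing" a collection removes the grouping, not the records.
        self.members.pop(collection, None)
```

Under this assumption a record that sits in two collections is stored once, and dropping one collection leaves both the record and its other memberships intact.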
This may be one logical way to store some data, but hierarchy tends to
constrain potential services. You do NOT want to have to know the
publisher in order to find the journal, obviously.
In addition, there are publishers and there are publishers. Some are
professional organizations like ACM, but others are mere corporations,
like Elsevier or Nature. There will be folks who have published in
Time Magazine or the New Yorker, and you don't want to exclude that.
I would tend to record this information, probably with a different
data element for professional or governmental organizations as
publishers or sponsors, but not use it as a way to organize the data.
Across different communities there are just too many different
relationships of bodies to publications to make something like this
work. Think broadly about the world of publication.
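As a rough illustration of "record it, but don't organize by it", the sketch below keeps the publishing or sponsoring body as an optional attribute on a flat record and looks journals up directly. The field names (`journal`, `issuing_body`, `kind`) and the sample records are made up for this example; they are not BibJSON keys or real data.

```python
# Hypothetical flat records: the issuing body is optional descriptive data,
# not a level that has to be traversed to reach the journal.
RECORDS = [
    {
        "title": "Example article A",
        "journal": "Communications of the ACM",
        "issuing_body": {"name": "ACM", "kind": "professional organization"},
    },
    {
        "title": "Example article B",
        "journal": "The New Yorker",
        # No issuing body recorded; the record is still perfectly usable.
    },
]


def by_journal(records, journal):
    """Find records by journal title without knowing the publisher."""
    return [r for r in records if r.get("journal") == journal]


def by_body_kind(records, kind):
    """Optionally filter on the kind of issuing body, where one is recorded."""
    return [r for r in records
            if r.get("issuing_body", {}).get("kind") == kind]
```

Nothing here requires a publisher to be known at input time, and a record from a general-interest magazine is found exactly the same way as one from a society journal.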
>> Each level in the publisher/journal/volume/issue hierarchy demands
>> a metadata record, with attributes depending on level in the
>> hierarchy. It should be
>> required that each such metadata record either contains the list of
>> at least identifiers and human-readable titles of its children, or
>> provides a pointer to such a list as a BibJSON dataset, or both.
> Right, this sounds pretty complex :-)
Oy! Let's not make requirements that will discourage or even prevent
input. What information do people actually have on hand when they are
creating the metadata? (And how accurate is it? Probably not even 95%)
BTW, if you want to create records for each level, there are library
records that contain only the publishing pattern for each journal that
has been cataloged. Those pattern records can be used to create a full
set of journal/volume/issue, but I have to warn you that there are
more levels than volume/issue -- the library data allows for 6 (!)
such levels, but they are fully defined, with their display components
(part, number, season, date, whatever) if you want. (I'll try to find
where these records are... they're kind of background data for the
issue predictor systems that allow libraries to know if they've missed
>> JSTOR provides a major example of a secondary collection which
>> cross-cuts primary publishers, and for each journal may only contain
>> parts of the journal, but typically contains whole volumes and issues.
JSTOR digitized whole journal runs, using the archives of the US
libraries. There shouldn't be many gaps in the journals they did
Also, did you note that JSTOR has announced that it is giving open
access to all of its public domain materials?
> I really think we want to focus on the user stories first and then
> decide whether one concept / domain object (e.g. "Collection") is
> sufficient for the domain we are trying to cover.
Agreed. I also think we need a wide variety of user stories from
different fields. This group tends toward math and science, and other
disciplines will have different needs. The social sciences and liberal
arts should be included, no?
kcoyle at kcoyle.net http://kcoyle.net