[openbiblio-dev] Bibliographic Metadata Guide is now on Wiki !

Jim Pitman pitman at stat.Berkeley.EDU
Tue Nov 8 15:12:11 UTC 2011


Primavera De Filippi <primavera.defilippi at okfn.org> wrote:

> Hi Jim, and thanks for the updates !
> I integrated your comments into the wikibook at
> http://en.wikibooks.org/wiki/Bibliographic_Metadata_Guide/Metadata_Elements
> (the etherpad has now been deprecated, so from now on it's better to just edit the wiki directly).

OK, thanks. I just indicated as much in the etherpad. In future, if an etherpad is deprecated
and its contents are migrated elsewhere, please leave a note to that effect at the top of the pad.

> All, do you think the list of minimum metadata elements for literary works
> is sufficiently accurate & detailed ? Shall we move on to different types of works? e.g. images, movies, sounds ?

I would stay away from these difficult and complex data types for now, and focus on end-to-end delivery of
metadata from book and journal data sources to naive users for some significant book collections. This means
mapping some native catalog standard to BibJSON (or to some other plain-text equivalent, BibX; I hardly care
which, since we would then map BibX to BibJSON) and then demonstrating that the metadata can be successfully
piped into a BibServer application for display and redistribution.
In particular, a lot of journal data will be accommodated simply by mapping the NLM DTD to BibJSON.
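
To make the mapping step concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the native field tags (TI, AU, PY, SN) and the sample values are invented, and the output key names (title, author, year, identifier) are my reading of BibJSON conventions rather than a quotation from the spec.

```python
import json

# Hypothetical native catalog record; tags and values are invented placeholders.
native_record = {
    "TI": "An Example Book",
    "AU": ["Doe, Jane", "Roe, Richard"],
    "PY": "2011",
    "SN": "0000000000",  # placeholder, not a real ISBN
}


def to_bibjson(rec):
    """Map one native record to a BibJSON-style dict.

    Output key names are assumptions for illustration, not taken from the spec.
    Fields absent from the native record are simply left out.
    """
    out = {"type": "book"}
    if "TI" in rec:
        out["title"] = rec["TI"]
    if "AU" in rec:
        out["author"] = [{"name": name} for name in rec["AU"]]
    if "PY" in rec:
        out["year"] = rec["PY"]
    if "SN" in rec:
        out["identifier"] = [{"type": "isbn", "id": rec["SN"]}]
    return out


bibjson_record = to_bibjson(native_record)
print(json.dumps(bibjson_record, indent=2))
```

A real mapping from a catalog standard or from the NLM DTD would have many more fields and edge cases, but the shape of the work is the same: one small, documented function per source format, emitting plain JSON that a BibServer can ingest.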

We should try to keep the number of BibX formats we support to a minimum. The best case for us is a direct map from the native
catalog format to BibJSON that the data provider supports. Mark and I can advise on how to do that, but it would
be good to have other volunteers willing to
-- help us develop and maintain the BibJSON spec to ensure it is capable of accommodating this use case
-- do documentation, coding and handholding for getting data out of legacy formats and database tables and into BibJSON.

Short term, the action most obviously following on from the recent metadata elements selection effort would
be to see that these are all adequately represented in standard ontologies (mostly DC and BIBO) for
1) rigorous inclusion in BibJSON with proper namespaces and
2) mapping to RDF and/or JSON-LD, if anyone has the cycles to do that.
I think it is important to execute on the (possibly challenging) step 1) for books and journals before treatment of other
doc types. Step 2) should be easy if 1) is done properly.
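
As a rough illustration of why step 2) follows easily from step 1), here is a sketch in Python. It assumes each element has already been tied to an ontology term. The particular key-to-term choices below are my own assumptions, not agreed mappings, and the record is an invented example. Once the table exists, the JSON-LD document is just the record plus a context.

```python
import json

DCTERMS = "http://purl.org/dc/terms/"
BIBO = "http://purl.org/ontology/bibo/"

# Assumed key-to-term choices, for illustration only; the real table is
# exactly what step 1) would have to settle.
CONTEXT = {
    "title": DCTERMS + "title",
    "creator": DCTERMS + "creator",
    "issued": DCTERMS + "issued",
    "isbn": BIBO + "isbn",
}


def to_jsonld(record):
    """Attach a JSON-LD @context to a flat record.

    Refuses records with keys that have no ontology term, since those are
    the elements step 1) has not yet covered.
    """
    unmapped = sorted(key for key in record if key not in CONTEXT)
    if unmapped:
        raise KeyError("no ontology term for: " + ", ".join(unmapped))
    doc = {"@context": CONTEXT}
    doc.update(record)
    return doc


jsonld_doc = to_jsonld({"title": "An Example Book", "issued": "2011"})
print(json.dumps(jsonld_doc, indent=2))
```

The function fails loudly on any key without a term, which is a cheap way to find the elements that still need work in step 1).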

There is the issue of exactly how and where these optional/recommended/... and R/NR declarations fit relative to the BibJSON spec. I think
they are something like an "application profile" relative to DC. The application we have in mind is a typical cataloger or publisher
opening up their metadata. But we have yet to formalize this. In any case, it seems, from
the perspective of general BibServer dev, that these attributes (except maybe R/NR) should not be built into the BibJSON spec
but kept somehow separate. Thoughts about how to manage this?
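
One possible way to keep them separate, sketched in Python under assumptions: the R/NR table lives with the schema and produces hard errors, while the O/MA/M table is a standalone profile that only produces warnings. The element names and the status assigned to each are invented for illustration and are not the values from the wikibook.

```python
# Structural part (R/NR), which would live with the schema.
# True means repeatable (value must be a list), False means not repeatable.
# Assignments are illustrative only.
REPEATABLE = {"title": False, "author": True, "identifier": True, "year": False}

# Separate application profile (O/MA/M), kept outside the schema.
# Assignments are illustrative only.
PROFILE = {"title": "M", "author": "M", "year": "MA", "identifier": "O"}


def structural_errors(record):
    """Return R/NR violations; a record with any of these is invalid."""
    errors = []
    for key, value in record.items():
        if key not in REPEATABLE:
            continue  # unknown keys are outside the scope of this sketch
        is_list = isinstance(value, list)
        if REPEATABLE[key] and not is_list:
            errors.append(key + ": repeatable, expected a list")
        elif not REPEATABLE[key] and is_list:
            errors.append(key + ": not repeatable, got a list")
    return errors


def profile_warnings(record):
    """Return best-practice warnings; the record stays valid regardless."""
    warnings = []
    for key, status in PROFILE.items():
        if key in record:
            continue
        if status == "M":
            warnings.append(key + ": missing (M)")
        elif status == "MA":
            warnings.append(key + ": missing (MA), check whether applicable")
    return warnings


record = {"title": ["A", "B"], "author": [{"name": "Doe, Jane"}]}
print(structural_errors(record))
print(profile_warnings(record))
```

With this split, a different application could swap in its own profile table without touching the schema, and a data provider with sparse records would get warnings rather than rejections.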


--Jim

> >
> > I added a few points too. I'm not clear on the relation of this etherpad to
> > Primavera's very professional-looking Wiki setup.
> > The structure of this page parallels closely the structure of e.g.
> > http://en.wikipedia.org/wiki/BibTeX
> > especially with respect to the required/optional nature of the elements
> > associated with various
> > types of document.
> >
> > These elements should be defined independent of type, to the greatest
> > extent possible, though often
> > meanings are understood by type, e.g. the ISBN field of a book chapter is
> > an identifier of the book not the chapter.
> >
> > It is a question exactly where the declarations:
> >
> > O - optional
> > MA - mandatory if applicable, but may be legitimately missing
> > M - Mandatory
> > R - repeatable
> > NR - not repeatable
> >
> > live relative to e.g. a JSON or XML schema that defines what machines can
> > make of such data.
> > The R/NR classification is structural, and likely part of the schema.  A
> > metadata doc which did not meet the
> > R/NR requirements would be invalid.  The O/MA/M status is not structural,
> > but part of some best practice. Machines should help us deal
> > with bad practice.
> > I think the main thing is that when data providers offer to publish metadata we
> > give them a way to easily express all that they have,
> > without undue burden.  I would be inclined to drop the M word, and go with
> > "recommended" or "strongly recommended" for most elements
> > presently marked as M.
> >
> >
> > --Jim
> >
> > ----------------------------------------------
> > Jim Pitman
> > Professor of Statistics and Mathematics
> > University of California
> > 367 Evans Hall # 3860
> > Berkeley, CA 94720-3860
> >
> > ph: 510-642-9970  fax: 510-642-7892
> > e-mail: pitman at stat.berkeley.edu
> > URL: http://www.stat.berkeley.edu/users/pitman
> >



