[Bibjson-dev] Railroad diagrams for BibJSON

Tue May 31 17:35:27 UTC 2011

Pretty diagrams. I find them hard to read, but they may help us focus on something like a BibJSON schema.

The diagrams do raise a fundamental issue, though, one I've gone back and forth on over the years: whether the
basic form of a BibJSON dataset is an array or an object.
According to the current spec, it is an object, but my recent experience is convincing me more and more this is
a mistake. It is commonplace that we deal with sources which emit a sequence of records, and we need (optionally
and preferably) a master record describing the whole dataset. My recommendation is that we adopt the convention
that every BibJSON dataset is an array of records, and optionally the first record in the array can declare itself
to be a descriptor of the whole array. For rapid assembly of datasets, I find myself over and over again dealing
with datasets whose metadata is adequately indicated by context and a filename. I want those datasets to be called BibJSON
files, even if (as usual) they lack metadata.  This parallels perfectly the structure of a BibTeX file.
If a BibJSON dataset is exposed to the web, it is reasonable to expect/require it to have a metadata record saying what it
is. But it is fundamental that if you collect a bunch of such metadata records from different places, then you would
have another BibJSON dataset, to which standard tools would apply, even if you did not have time to say exactly what
you had collected and when and why. So I see the creation of that metadata record as a potential barrier to adoption.
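
A minimal Python sketch of the array convention proposed above, under stated assumptions. The proposal does not say how a first record would "declare itself" a descriptor, so the "type": "dataset" marker, the field names, and the helper functions split_dataset and merge_datasets are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical marker: how a record declares itself a descriptor is not
# specified in the proposal, so this key/value pair is an assumption.
DESCRIPTOR_KEY = "type"
DESCRIPTOR_VALUE = "dataset"


def split_dataset(records):
    """Return (descriptor_or_None, list_of_ordinary_records)."""
    if (
        records
        and isinstance(records[0], dict)
        and records[0].get(DESCRIPTOR_KEY) == DESCRIPTOR_VALUE
    ):
        return records[0], records[1:]
    return None, list(records)


def merge_datasets(*datasets):
    """Concatenate datasets into one plain array of records.

    The input descriptors are dropped (a design choice for this sketch),
    and the result needs no descriptor of its own.
    """
    merged = []
    for dataset in datasets:
        _, recs = split_dataset(dataset)
        merged.extend(recs)
    return merged


# One dataset with a descriptor as its first record, one without.
with_meta = json.loads("""[
  {"type": "dataset", "title": "Example collection"},
  {"id": "a1", "title": "First paper"}
]""")
bare = json.loads('[{"id": "b1", "title": "Second paper"}]')

descriptor, records = split_dataset(with_meta)
merged = merge_datasets(with_meta, bare)
print(descriptor["title"], len(records), [r["id"] for r in merged])
```

Both inputs are valid under this convention, and the merged result is again just an array of records, so the same tools would apply to it without anyone first writing a metadata record.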

A related point is how to indicate the existence of a collection. One way to do this is simply by tagging, and I think
we should strongly encourage this.  A tag might or might not have a record associated with it. If it is a much-used tag,
it is good practice to provide a meta record for the tag, typically corresponding to some biblio entity. The tag is then
effectively the identifier of the entity.
I have been experimenting with various ways to express this sort of thing in the context of uploading BibJSON datasets to
delicious.com and using the delicious tools (which are extremely good) to manage tags and edit individual records.
I think it is potentially very powerful to leverage existing resources like Delicious, and possibly others like
BibSonomy, CiteULike, ..., whatever users may prefer, so that BibJSON datasets can be easily imported to and exported from these
bibdata environments.
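
The tagging idea above can be sketched in Python as follows. This is only an illustration: the field names "id" and "tags" and the helpers collection and tag_record are assumptions, not part of any BibJSON spec, and nothing here reflects an actual Delicious API.

```python
# Hypothetical field names ("id", "tags"), used for illustration only.
records = [
    # Meta record for a much-used tag: its identifier is the tag itself.
    {"id": "probability", "title": "Probability theory"},
    {"id": "r1", "title": "A paper", "tags": ["probability", "to-read"]},
    {"id": "r2", "title": "Another paper", "tags": ["probability"]},
    {"id": "r3", "title": "Unrelated", "tags": ["to-read"]},
]


def collection(records, tag):
    """Return all records carrying the given tag."""
    return [r for r in records if tag in r.get("tags", [])]


def tag_record(records, tag):
    """Return the meta record whose identifier equals the tag, or None."""
    for r in records:
        if r.get("id") == tag:
            return r
    return None


print([r["id"] for r in collection(records, "probability")])
print(tag_record(records, "probability"))
print(tag_record(records, "to-read"))
```

Here "probability" defines a collection and also has a meta record, while "to-read" defines a collection with no record behind it; both cases work with the same lookup.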

Anyway, this is just to let you know what I am thinking, and to encourage an early discussion of the basic object/array issue for BibJSON.

--Jim

----------------------------------------------
Jim Pitman
Director, Bibliographic Knowledge Network Project
http://www.bibkn.org/

Professor of Statistics and Mathematics
University of California
367 Evans Hall # 3860
Berkeley, CA 94720-3860

ph: 510-642-9970  fax: 510-642-7892
e-mail: pitman at stat.berkeley.edu
URL: http://www.stat.berkeley.edu/users/pitman