[openbiblio-dev] BibJSON vs RDF

Wed Feb 8 20:21:41 UTC 2012

Tom Morris <tfmorris at gmail.com> wrote:

> Seems like there's a variety of opinions.  It's not just an idle
> question because it influences how engineering and marketing time (money) is spent.

Agreed. These issues are important for exactly that reason.

> Consider these five possible goals for BibJSON:
>
> 1. Internal format for BibServer
> 2. API format for BibServer
> 3. New standard for personal bibliographies (ie alternative to BibTeX, RIS, etc)
> 4. New standard for library bibliographic data (ie alternative to
> MARC, MARC.next, LLD RDF)
> 5. All of the above plus format for author data, data set
> descriptions, weather forecasts, etc
> Each of these requires a different level of functionality,
> specification detail, stability, tool support, promotional activity,
> etc and the stability that you'd want for a new library bibliographic
> data format would conflict with the flexibility that one desires of an
> internal-only format.
>
> Personally, I think the right answer is #1 or #2.  

Strong agreement, except I would push just a little further. More below.

> #3 might be achievable with additional investment in specification and some
> combination of additional tooling plus evangelism, but it would
> require a concerted and focused effort.  

Agreed. I have encountered a lot of resistance to pushing early towards #3. However,
I do see #3 as the long-term goal, and if we are successful with #1 and #2 we establish
a beachhead for proceeding to #3, with the necessary further funding and support, in the
next year or two. Right now we do not have the toolkit to support a credible push to #3,
and I think that attempting to develop this toolkit too early would be a strategic mistake:
it would drain resources and alienate data providers with business models entrenched in
legacy data formats. In particular, there is no need for us to compete with potentially
friendly service providers like Zotero and BibSonomy.

> I think #4 is an unrealistic goal and investment made in achieving it (e.g. convincing 
> national libraries to publish in BibJSON) takes away from other worthwhile goals.  

I agree with this, except I don't think that convincing national libraries to publish in
BibJSON implies we are promoting BibJSON as a new standard for library bibliographic data.
We are just asking libraries to export their data in a format we can easily
pipe to BibServer and which does not lose LD features, as would happen if the data were
squashed into BibTeX or RIS. Mark has already written a converter from BNB RDF/XML data
to BibJSON. I think this is time very well spent, and the reward will be all the greater if
we can persuade BL staff to maintain the BibJSON export, rather than someone in our group
having to do so.
I have no interest in changing how libraries manage their data internally.
I have a big interest in getting hold of their datasets in a way that those datasets
can be openly processed and reused for various purposes. That's all I need BibJSON for,
to be able to process library data if I can get hold of it.
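
To illustrate the point about not losing LD features, here is a rough sketch
in Python. The input dictionary, its field names and the example.org URIs are
all made up for illustration; this is not Mark's converter and it does not
claim to follow the BibJSON conventions exactly. The idea is only that a JSON
record can keep the author as an object carrying its URI, where BibTeX or RIS
would flatten it to a string.

```python
import json

# Hypothetical input: a few facts about one book, such as one might pull
# from an RDF description. All names and URIs here are invented.
SOURCE = {
    "uri": "http://example.org/book/1",
    "title": "An Example Book",
    "creator_uri": "http://example.org/person/7",
    "creator_name": "A. N. Author",
    "issued": "2011",
}


def to_bibjson_like(src):
    """Build a BibJSON-style record that keeps the linked-data URIs."""
    return {
        "type": "book",
        "title": src["title"],
        "year": src["issued"],
        # Keeping the author as an object (not a flat string) leaves
        # room for the URI that identifies the person.
        "author": [{"name": src["creator_name"], "id": src["creator_uri"]}],
        "identifier": [{"type": "uri", "id": src["uri"]}],
    }


record = to_bibjson_like(SOURCE)
print(json.dumps(record, indent=2, sort_keys=True))
```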

> When I hear people talking about representing data sets or other stuff in BibJSON (ie #5), 
> I just cringe.

I understand why. However, it is within the scope of current BibServer capabilities and
planned efforts to use JSON (and whatever schema develops can be called BibJSON)
to represent the lists of things routinely encountered in BibServer operations,
most obviously lists of datasets, collections, journals, organizations, people, ...
BibJSON does not need to aspire to be the ultimate schema for any of these
entities. It just needs to get the job done for BibServer functionality.

> Note that none of this has anything to do with whether JSON is good or
> bad or would be an appropriate carrier for library data.  I love JSON
> and work with it every day, but JSON and BibJSON are two entirely
> different things.  

An interesting point to develop. Mark has written the current version of BibServer
so that it is essentially just a JSON server: within some limits on the structure of the
JSON, it is incidental exactly what the JSON represents. We are using JSON internally to
represent features of authors, datasets, users, ... To the extent that these are universally
understood attributes of these entities in the biblio sphere, it seems reasonable to develop
BibJSON to accommodate them.

> Whether MARC.next uses a raw JSON carrier or
> RDF/XML carrier or JSON-LD carrier is a) not very interesting and b)
> someone else's problem, in my opinion.  

All agreed.

> Plus, it's something that will
> be decided on library standards timescale (ie decades), so not
> something that's productive to waste time on now.

Agreed.

In summary, I agree with all of Tom's points, except that I claim there is
little cost and great reward in using the same kind of JSON structures to represent
records for datasets, people, journals, events, ... and whatever other kinds of things we
find in the biblio world. JSON allows us to easily deal with records
of various kinds, and link them to other records. We are primarily interested in
records of the existence and location of documents. But anything can be represented
as a document on the web, so we easily encompass records for datasets, people, journals,
events, subjects, photos, ... and I think we should be doing that within the scope of BibJSON.
I have already done so on a modest scale for most of the above-mentioned types.
This is all familiar to the library world, which has authority records of various kinds.
These records exist already; I just want to have them out in the open, and available for
import/export to and from a BibServer. The schema necessary for that purpose is what we
define to be BibJSON.
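
As a rough sketch of what "records of various kinds, linked to other records"
could mean in practice, assuming hypothetical "id" and "type" fields and
made-up identifiers (none of this is taken from BibServer or from any agreed
BibJSON schema):

```python
# Hypothetical records of several kinds held in one list. The "id", "type",
# "author" and "journal" fields are assumptions made for this sketch.
RECORDS = [
    {"id": "person:1", "type": "person", "name": "A. N. Author"},
    {"id": "journal:1", "type": "journal", "title": "Example Journal"},
    {
        "id": "article:1",
        "type": "article",
        "title": "An Example Article",
        "author": [{"id": "person:1"}],
        "journal": {"id": "journal:1"},
    },
]


def index_by_id(records):
    """Map each record's id to the record itself."""
    return {r["id"]: r for r in records}


def resolve(record, index):
    """Return a copy of record with linked ids replaced by the linked records."""
    out = dict(record)
    out["author"] = [index[a["id"]] for a in record.get("author", [])]
    if "journal" in record:
        out["journal"] = index[record["journal"]["id"]]
    return out


index = index_by_id(RECORDS)
article = resolve(index["article:1"], index)
print(article["author"][0]["name"], "/", article["journal"]["title"])
```

The person and journal entries play the part of authority records: they are
ordinary JSON records in the same list, and the article just points at them.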

--Jim

>
> On Wed, Feb 8, 2012 at 5:10 AM, Edmund Chamberlain <emc59 at cam.ac.uk> wrote:
>
> > Aligning systems and cataloguing practices so closely with a single format
> > (Marc21) has shackled us to bad practice and outdated technology and made it
> > very difficult to share data meaningfully.
> >
> > With the Library of Congress planning to move on from Marc21,  choosing just
> > one syntax and carrier format as we move forward would risk the same
> > mistake.
>
> There's no question that MARC is seriously antiquated from a
> technology point of view and that it represents a certain, very
> traditional, view of how cataloging should be done, which may not be a
> good fit with how cataloging happens in the future.
>
> Having said that, I strongly disagree that having a common standard
> for communication is part of the problem.
>
> > In the post Marc world, there will most likely be no single replacement
> > format. To my mind, that's a good thing.
>
> I think the main difference will be that all of the non-book stuff in
> MARC will come from somewhere else.  There's no need for librarians to
> be spec'ing out how to describe people or organizations or other
> common items.  However, I suspect there'll still be a single set of
> standards for describing core bibliographic data.
>
> > I've done some development work with RDF; it's hard to produce and to consume,
> > as others have noted here.
>
> I suspect that you're talking about RDF/XML here, but I'm willing to
> bet that when MARC.next is finished, that the complexity of the
> librarians' data model will dwarf any overhead due to RDF, JSON, XML
> or whatever format it's based on.
>
> All I'm proposing is that the group put a stake in the ground as to
> what the tactical use of BibJSON is in supporting the strategic goals
> of the Open Bibliography group.  Actually it's probably really a
> question for the general list, not the dev list.
>
> Tom