[openbiblio-dev] BibJSON Validator?

Jim Pitman pitman at stat.Berkeley.EDU
Thu Feb 16 15:34:44 UTC 2012

Mark MacGillivray <mark at odaesa.com> wrote:

> Last year a few of us considered various JSON formats for use in
> bibserver. JSON-LD was one of them. Another is CSL.
> If we have a strong requirement for schemas, validation, 

I think so, for all the reasons Peter gives.
I have found it quite frustrating over the last months to be working on mapping
datasets to BibJSON without knowing exactly what I was aiming at. Operationally,
the present definition is that if it uploads to bibsoup.net without error, it's
valid BibJSON. We must do better than that standard, which has wasted many hours
of my time already.
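
To make concrete what "better than upload-and-see" could look like, here is a
minimal sketch of a standalone check that reports problems in words. The RULES
and REQUIRED tables are hypothetical placeholders, not the real BibJSON
definition, and the function names are made up for illustration. A real schema
might well allow extra keys rather than flag them.

```python
import json

# Hypothetical rules, NOT the real BibJSON schema: key -> expected Python type.
RULES = {
    "title": str,
    "author": list,
    "year": str,
}
REQUIRED = {"title"}


def validate_record(record):
    """Return a list of readable problems; an empty list means no complaints."""
    if not isinstance(record, dict):
        return ["record is %s, expected a JSON object" % type(record).__name__]
    problems = []
    for key in sorted(REQUIRED - set(record)):
        problems.append("missing required key: %s" % key)
    for key, value in record.items():
        expected = RULES.get(key)
        if expected is None:
            problems.append("unknown key: %s" % key)
        elif not isinstance(value, expected):
            problems.append("%s should be %s" % (key, expected.__name__))
    return problems


def validate_text(text):
    """Syntactic check first (is it JSON at all?), then per-record checks."""
    try:
        data = json.loads(text)
    except ValueError as err:
        return ["not valid JSON: %s" % err]
    records = data if isinstance(data, list) else [data]
    problems = []
    for i, record in enumerate(records):
        problems.extend("record %d: %s" % (i, p) for p in validate_record(record))
    return problems
```

With something like this, a record that turns out to be a bare string instead
of an object gets a readable message before any upload is attempted.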

A related issue is that we should have version-controlled copies of however we are
expressing the BibJSON schema, at stable URLs, so that in discussions like this we can refer to them, and
so the development appears responsible to the outside world.
At some time recently, Mark had this in place, but I can no longer quickly find any full description of the current or
past JSON schemas at http://bibjson.org/
Mark, please can you make the version-controlled list more apparent on bibjson.org?

> then perhaps we should re-consider adopting JSON-LD or CSL.

I think so. We should aim to get the best of both: JSON-LD to ensure ongoing exchange with the LD world,
and CSL to leverage the efforts already made by the CSL community to provide display templates.
Capability for users to specify display templates for their data is essential, but as far as I am
aware it is missing from present bibserver dev, so it remains an unmet requirement.
I don't see how this requirement can be met until we stabilize on something like an extension of CSL adequate for BibJSON.
And I don't see why we should sacrifice LD-compatibility.

> Given the number of parsers we have now (bibtex and ris, perhaps marc
> (and basic json/csv import)), it would not be hard to commit to one of
> these alternatives at the moment. However if we go beyond the point at which we have wider provision of parsing, it will become harder.

Right. It seems very timely to face this issue now. 

> Is there any reason why we should create our own solutions to these
> same requirements if we can already have them by taking up one of the already available options?

I don't see any, except that we need the best of both. That might be achieved by embedding some least-common-extension
of current BibJSON and CSL. I expect JSON-LD would then be available by bijection from the BibJSON/CSL. That should be a requirement, I think.
I suggest devoting some dev hours to investigating the feasibility of that.
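
As a starting point for that investigation, here is a rough sketch of what a
bijection could look like on a very small subset of keys. Everything here is
an assumption for illustration: the BibJSON-like keys ("title", "author" with
"name", "year"), the CSL-style targets ("author" with "literal", "issued" with
"date-parts"), the choice of Dublin Core term URIs for the JSON-LD @context,
and the function names themselves. It only shows that the mapping can be
inverted for this subset, not that it extends to the full formats.

```python
# Hypothetical JSON-LD context; Dublin Core terms are one possible choice.
CONTEXT = {
    "title": "http://purl.org/dc/terms/title",
    "author": "http://purl.org/dc/terms/creator",
    "issued": "http://purl.org/dc/terms/issued",
}


def bibjson_to_csl(record):
    """Map the assumed BibJSON subset onto CSL-style keys."""
    csl = {}
    if "title" in record:
        csl["title"] = record["title"]
    if "author" in record:
        csl["author"] = [{"literal": a["name"]} for a in record["author"]]
    if "year" in record:
        # Assumes a plain four-digit year string such as "2012".
        csl["issued"] = {"date-parts": [[int(record["year"])]]}
    return csl


def csl_to_bibjson(csl):
    """Inverse of bibjson_to_csl for the same subset of keys."""
    record = {}
    if "title" in csl:
        record["title"] = csl["title"]
    if "author" in csl:
        record["author"] = [{"name": a["literal"]} for a in csl["author"]]
    if "issued" in csl:
        record["year"] = str(csl["issued"]["date-parts"][0][0])
    return record


def as_jsonld(csl):
    """Attach the context so the same keys can be read as linked data."""
    doc = {"@context": CONTEXT}
    doc.update(csl)
    return doc
```

The useful test is that csl_to_bibjson(bibjson_to_csl(r)) gives back r for every
record r in a test collection. Any key where that fails is a place where the
extension of CSL would need more work.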

> We would need to either use the keys specified in CSL and use their
> schema to validate, or use JSON-LD and choose the most appropriate namespace(s) to use as our default.

I think we should try to extend the CSL schema to suit our current requirements, and if possible embed that in JSON-LD.
Staying close to a widely adopted standard which already has built-in validation capabilities seems like the way to go.


> https://github.com/citation-style-language/schema/blob/master/csl-data.json
> http://json-ld.org/spec/latest/
> On Wed, Feb 15, 2012 at 7:09 PM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
> >
> >
> > On Wed, Feb 15, 2012 at 6:40 PM, Tom Morris <tfmorris at gmail.com> wrote:
> >>
> >> On Wed, Feb 15, 2012 at 10:57 AM, Edmund Chamberlain <emc59 at cam.ac.uk>
> >> wrote:
> >> > I've a barebones Perl based parser up as a gist:
> >> >
> >> > https://gist.github.com/1836836
> >> >
> >> > Should accept stdin. JSON seems valid but does
> >> > not upload to bibsoup. Getting a 'unicode' object has no attribute
> >> > 'get'.
> >>
> >> Is there a BibJSON validator available (or planned)?  I'm thinking of
> >> something along the lines of the W3C validators for various types of
> >> markup: http://validator.w3.org/
> >>
> >> Conversely, is there a suite of BibJSON test documents that all
> >> BibJSON parsers should be able to process in a conformant manner a la
> >> https://github.com/json-ld/json-ld.org/tree/master/test-suite
> >>
> >> Tom
> >>
> > I strongly support these ideas. I'm not deliberately offering to provide
> > solutions but I have done a  lot of this in chemistry and found them
> > essential. This includes:
> > * syntactic validation (presumably any JSON parser should do this)
> > * namespace validation (if used)
> > * semantic validation. This requires us to write semantic specifications. I
> > don't know how much we shall want to do. I can see this being valuable for
> > core vocabulary (e.g. "title" vs "titel", allowed siblings, what elements
> > can have lists, objects, etc.). This may include enumerations and value
> > checking.
> > * roundtripping. Can we read in an entry, store it and re-publish it
> > * unit testing. Is entry A sameAs entry B. I imagine that in many cases
> > sibling order is irrelevant. There may also be problems with comparing
> > floats, dates, etc.
> > * Locale and encoding problems. Do different locales emit different lexical
> > representations? Thus 1.234 in UK may be rendered as 1,234 in some other
> > European countries. Reading a date in TimeZone A might change the date in
> > TimeZone B.
> >
> > I'm not saying these have to be done tomorrow, but at some stage we shall
> > have to address them.
> >
> > P.
> >>
> >> _______________________________________________
> >> openbiblio-dev mailing list
> >> openbiblio-dev at lists.okfn.org
> >> http://lists.okfn.org/mailman/listinfo/openbiblio-dev
> >
> >
> >
> >
> > --
> > Peter Murray-Rust
> > Reader in Molecular Informatics
> > Unilever Centre, Dep. Of Chemistry
> > University of Cambridge
> > CB2 1EW, UK
> > +44-1223-763069
> >
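
Peter's roundtripping and sameAs points quoted above can be sketched with the
standard json module alone. This is only an illustration under the assumption
that sibling (key) order is irrelevant while list order is significant. The
function names are made up, and it does nothing about the date and locale
issues he raises.

```python
import json


def canonical(value):
    """Serialize with sorted keys so that sibling order does not matter."""
    return json.dumps(value, sort_keys=True, ensure_ascii=False)


def same_entry(a, b):
    """Compare two parsed entries, ignoring key order but not list order."""
    return canonical(a) == canonical(b)


def roundtrips(text):
    """Parse, re-serialize, re-parse, and check that nothing changed."""
    first = json.loads(text)
    second = json.loads(json.dumps(first))
    return same_entry(first, second)
```

A shared suite of test documents could then be a directory of entries that
every parser must read and re-emit so that same_entry holds.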
