[Bibjson-dev] BibJSON Goals

Jim Pitman pitman at stat.Berkeley.EDU
Thu Jun 2 13:47:48 UTC 2011


Ben, Will, and Charlotte: if you are interested in this topic, please register at Bibjson-dev at lists.okfn.org (now managed by Mark)
so we can get this conversation out of personal email folders.  Mark, Richard, Peter: sorry for the duplication; please confirm if you are
reading the BibJSON list and want your names dropped from the recipients list on this thread.

Peter wrote about http://bibserver.okfn.org/bibjson 

> I think this page is really valuable. I like the level of abstraction
> combined with pragmatism. 

I agree

> I'm going to add a few more goals (and I think we should strive for about 10 goals).

My responses indicated below.

> * *bibjson should support a wide variety of applications*. (e.g. not be confined to STM, etc.)
> * *bibjson should be compatible with BibTeX and Dublin Core*. No new language comes from nowhere. We show the historical progression and also set
> the maximum entry needed for a newcomer.
> * *it should be easy to write tools that read, write and edit bibjson*.

Yes.

> Ideally it should be possible for someone to edit bibjson in a text editor.
> (things like unique ids, numbering/counts, etc. require programs)

I don't think this is a realistic or very useful goal. My experience is that in limited contexts it is
much easier to work in a plain text editor with ad hoc formats, such as one I have developed with Matthew Watkins,
which we call bibtxt and which looks like this:

@@article 1234
@author David Aldous ; Jim Pitman
@title  the title

and so on with no fussy quotes or backslashes. You just have to kill one double quote by accident in a text editor and
you have screwed up an entire BibJSON file. BibTeX suffers from the same problem.  These ad hoc formats will come and go.
We need something more rigorous for reliable machine processing, and that is what BibJSON is for. So programs can
read and write BibJSON.
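
To illustrate the edit-in-plain-text, convert-by-script workflow, here is a minimal sketch in Python. It assumes a bibtxt syntax inferred only from the three sample lines above ("@@type id" starts a record, "@field value" adds a field); the function name parse_bibtxt and the output keys "type" and "id" are hypothetical choices, not part of any BibJSON specification.

```python
import json


def parse_bibtxt(text):
    """Parse bibtxt-style lines into a list of dict records.

    Assumed syntax (inferred from the three-line sample only):
    '@@<type> <id>' starts a record, '@<field> <value>' adds a field.
    """
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@@"):
            # New record: split "article 1234" into type and id.
            entry_type, _, entry_id = line[2:].partition(" ")
            records.append({"type": entry_type, "id": entry_id.strip()})
        elif line.startswith("@") and records:
            # Field line: everything after the field name is the value.
            field, _, value = line[1:].partition(" ")
            records[-1][field] = value.strip()
    return records


sample = """@@article 1234
@author David Aldous ; Jim Pitman
@title  the title
"""

records = parse_bibtxt(sample)
# json.dumps takes care of all quoting and escaping, so the person
# editing the plain-text file never has to type a quote or backslash.
bibjson_text = json.dumps(records, indent=2)
print(bibjson_text)
```

The point of the design is that the quoting is done by the JSON serializer rather than by hand, so a stray keystroke in the source file damages at most one field.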


> * *The number of optional features in bibjson (syntax) is to be kept to the
> minimum*. Ideally, there would be zero optional features. Optional features
> cause problems because they are not guaranteed to be in any given situation.
> The more optional features there are in a system the more combinations there are for the system and so the more difficult the programming becomes.

I think I agree for syntax. But there should be no limit on optional fields. And we should strongly encourage if not enforce
conventions for simplifying flattening of lists, like consistent use of "; " as a default separator for author lists and the like.
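
As a rough sketch of what such a flattening convention could look like in practice, the Python below joins and splits an author list on "; ". The function names flatten_names and split_names are hypothetical, and the sketch assumes that no individual name contains a semicolon.

```python
DEFAULT_SEPARATOR = "; "


def flatten_names(names, separator=DEFAULT_SEPARATOR):
    """Join a list of names into one flat string."""
    return separator.join(name.strip() for name in names)


def split_names(flat, separator=DEFAULT_SEPARATOR):
    """Split a flat string back into a list, tolerating stray spaces."""
    parts = flat.split(separator.strip())
    return [part.strip() for part in parts if part.strip()]


authors = ["David Aldous", "Jim Pitman"]
flat = flatten_names(authors)
print(flat)
print(split_names("David Aldous ; Jim Pitman"))
```

With one agreed separator the two directions are inverse to each other, which is what makes the flat form safe to use as an intermediate.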

> * *bibjson documents should be human-legible and reasonably clear*.  Someone
> reading your bibjson should be able to make an educated guess about what the data is that's being tagged.
> * *The bibjson design should be prepared quickly*.
> * *The design of bibjson shall be formal and concise*. Only include as many
> (syntax) elements as you need to be clear, not more and not less.
> * *bibjson documents shall be easy to create*.

All yes

> bibjson is intended to not require a special editor or tool to create. And in fact, most bibjson
> documents can be edited in a text editor like Notepad or TextEdit.

Very bad idea to directly edit BibJSON in a text editor like Notepad or TextEdit, as explained above.
Rather, edit in any environment where you can easily maintain the structure, and then map by script to BibJSON.
There are lots of existing tools for editing BibTeX, and the map from BibTeX to BibJSON is already very stable.
The remaining issues are with character encodings, in particular support for mapping BibTeX accents to UTF-8, which
should be required for standard BibJSON.  Here we encounter the issue of specifying format in BibJSON, e.g.

title
title_tex
title_html

are all variants I have used and I think essential to support.  The main thing is you should be able to use BibJSON to
immediately encode bibliographic data wherever it may have come from, and then perform BibJSON -> BibJSON processing to
clean it up.
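
A minimal sketch of one such BibJSON -> BibJSON cleanup step, in Python: derive a UTF-8 "title" from "title_tex" when only the TeX variant is present. The field names come from the list above; the rule that "title" is derived from "title_tex", the function names, and the small accent table are assumptions for illustration. Only five TeX accents on single ASCII letters are handled.

```python
import re
import unicodedata

# TeX accent command -> Unicode combining character (deliberately incomplete).
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    '"': "\u0308",  # diaeresis
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}

_ACC = "(['`\"^~])"
PATTERNS = [
    re.compile(r"\{\\" + _ACC + r"([A-Za-z])\}"),  # {\'e}
    re.compile(r"\\" + _ACC + r"\{([A-Za-z])\}"),  # \'{e}
    re.compile(r"\\" + _ACC + r"([A-Za-z])"),      # \'e
]


def _compose(match):
    """Turn one matched accent command into a precomposed character."""
    accent, letter = match.groups()
    return unicodedata.normalize("NFC", letter + COMBINING[accent])


def tex_to_unicode(text):
    """Replace the supported TeX accent forms with Unicode characters."""
    for pattern in PATTERNS:
        text = pattern.sub(_compose, text)
    return text


def add_plain_title(record):
    """Return a copy of the record with 'title' filled in from 'title_tex'."""
    cleaned = dict(record)
    if "title" not in cleaned and "title_tex" in cleaned:
        cleaned["title"] = tex_to_unicode(cleaned["title_tex"])
    return cleaned


record = {"title_tex": r"Le th\'eor\`eme de P\'{o}lya"}
print(add_plain_title(record)["title"])
```

The original "title_tex" value is kept alongside the derived "title", so the step loses nothing and can be rerun with a better accent table later.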

> * *Terseness in bibjson markup/vocabulary is of minimal importance*. When
> you're creating bibjson vocabulary, *first_name* is better than
> *fname* because it's clearer and more human readable.  While you do want
> to keep element names short, the shortness should not be at the sacrifice of
> human-readability.

Right. 
>
> I have taken these verbatim from the XML design goals, substituting bibjson
> for XML. These goals have served the development community very well and
> almost all of them have shown they were good decisions. A few related
> specifically to SGML, and in this spirit I think we should refer to BibTeX
> and DC (and HTML) as our antecedents. By doing this we become inclusive for
> those communities.

Yes.

--Jim

> [1]http://www.oreillynet.com/xml/blog/2008/01/the_design_goals_of_xml_1.html




