[openbiblio-dev] query to bibliographica.org

Thu Apr 7 22:02:22 UTC 2011

* [2011-04-05 12:12:36 -0700] Jim Pitman <pitman at stat.Berkeley.EDU> wrote:

] We managed eventually to solve these problems, using solr for the search,
] and BibJSON http://www.bibkn.org/bibjson/index.html to help bridge from
] RDF and various ontologies (BIBO, FOAF ... ) to JSON. 

So the question here is whether we use BibJSON directly or use its
mapping to RDF. Since the storage is in terms of RDF, I think we use
its mapping. But in that case, as consensus builds on how to represent
RDF in JSON generally in a usable way (there is an active effort at
the W3C on exactly this, which I am involved with), why don't we just
use that directly? In principle we gain compatibility and
extensibility that way. Since Bibliographica is not really about books
but about higher-order relationships between books and people,
relationships that are not well defined and will almost always be open
to several interpretations, I think extensibility is crucial.
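
As a rough illustration of what a generic RDF-in-JSON layout can look
like, here is a minimal Python sketch that groups triples by subject
and predicate, in the style of the existing RDF/JSON variants. It is
not the W3C format under discussion (which is not settled), and the
book URI and title are invented for the example.

```python
from collections import defaultdict


def triples_to_rdf_json(triples):
    """Group (subject, predicate, (kind, value)) triples into a nested dict.

    The result is keyed by subject, then by predicate; each predicate maps
    to a list of {"type": ..., "value": ...} objects.
    """
    doc = defaultdict(lambda: defaultdict(list))
    for subject, predicate, (kind, value) in triples:
        doc[subject][predicate].append({"type": kind, "value": value})
    # Convert to plain dicts so the result serialises cleanly with json.dumps.
    return {s: dict(p) for s, p in doc.items()}


# Hypothetical example data: the book URI and title are invented.
BOOK = "http://example.org/book/1"
AUTHOR = "http://people.bibkn.org/wsf/datasets/mathscinet_mrauth/363517"
triples = [
    (BOOK, "http://purl.org/dc/terms/title", ("literal", "An Example Book")),
    (BOOK, "http://purl.org/dc/terms/creator", ("uri", AUTHOR)),
]
doc = triples_to_rdf_json(triples)
```

The point for extensibility is that a new kind of relationship is just
another predicate key; nothing in the layout has to change.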

] I hope Bibliographica will be flexible enough to
] include datasets like those in http://people.bibkn.org/ and to provide
] comparable or better performance for data retrieval by webservice.

Now this is pretty easy, in principle. One thing missing is content type
negotiation:

    curl -H 'Accept: application/rdf+xml' http://people.bibkn.org/wsf/datasets/mathscinet_mrauth/363517

should give me back RDF (there is an RDF/XML export widget on the web
page) but gives me back an HTML page. Once that is added, individuals
can trivially be added, and they can already be referenced from
Bibliographica. That is the piecemeal approach, a sort of cross
between authorclaim and what we have now: "add so and so that I found
on people.bibkn.org as an author of this book" or "add the
people.bibkn.org URI as an equivalent identifier for this author".
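
For anyone who wants to check negotiation from code rather than curl,
a minimal Python sketch along these lines should do. The helper names
are made up for illustration, and it assumes the server, once fixed,
answers with an application/rdf+xml Content-Type.

```python
import urllib.request

RDF_XML = "application/rdf+xml"


def rdf_request(uri):
    """Build a GET request that asks the server for RDF/XML."""
    return urllib.request.Request(uri, headers={"Accept": RDF_XML})


def is_rdf_response(content_type):
    """True if a Content-Type header value names RDF/XML, ignoring parameters."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == RDF_XML


def fetch_rdf(uri):
    """Fetch uri; raise ValueError if the server ignored the Accept header."""
    with urllib.request.urlopen(rdf_request(uri)) as response:
        content_type = response.headers.get("Content-Type", "")
        if not is_rdf_response(content_type):
            raise ValueError(f"expected {RDF_XML}, got {content_type!r}")
        return response.read()
```

Calling fetch_rdf() on the URI above would, as things stand, raise
because an HTML page comes back.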

Doing a bulk import would be easy enough as well, though I note that
while our presentation of books is in reasonable shape, the presentation
of authors still uses the old Fresnel technique, which is not so good...

I also notice that the JSON output of people.bibkn.org looks a lot
like the SPARQL results format, or something closely related to some
of the RDF-JSON variants, and so has some of the same problems as the
regular JSON output from SPARQL endpoints.
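
Which problems are meant is not spelled out here. One commonly noted
inconvenience of the SPARQL JSON results layout, taken as an
assumption for this sketch, is that every value arrives wrapped in a
type/value object, so clients end up unwrapping rows by hand. The
sample data below is invented.

```python
def flatten_bindings(results):
    """Turn SPARQL JSON results into a list of {variable: value} dicts.

    Type, datatype and language information is dropped, which is exactly
    the trade-off a plainer JSON format would have to address.
    """
    return [
        {var: term["value"] for var, term in row.items()}
        for row in results["results"]["bindings"]
    ]


# Invented sample in the SPARQL query results JSON layout.
sample = {
    "head": {"vars": ["person", "name"]},
    "results": {
        "bindings": [
            {
                "person": {"type": "uri", "value": "http://example.org/person/1"},
                "name": {"type": "literal", "value": "A. N. Author"},
            }
        ]
    },
}
rows = flatten_bindings(sample)
```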

] How to get from
] here to there? All I can suggest is that the Bibliographica developers might 
] take a good look at the functionality achieved by people.bibkn.org,
] consider how that functionality was achieved, and think about how to adapt/embed 
] that functionality in bibliographica.

I would not like to bulk copy the data from people.bibkn.org because I
think it is important that we think in terms of a distributed system
where it is quite reasonable for data to live in multiple places. We
can copy small amounts of data, e.g. people's names, as an
optimisation, but really we should not mint separate identifiers for
people; we should let their records live elsewhere. Ideally the people
themselves might manage their own profile somewhere and we would use
that as their identifier. So I'd say that one thing we can do is work
on a query protocol where we can search for people, e.g. by name, and
get their identifiers back, not unlike the search mechanism I just
wrote about in another mail in this thread. I would imagine this
mechanism being used by the web UI JavaScript and not particularly by
the server-side software, and they would be knitted together much the
way Wikipedia gadgets work. I'm not sure how understandable that was,
but there it is.
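
To make the shape of such a query protocol a little more concrete,
here is a minimal sketch of the client side. Everything in it is
hypothetical: the endpoint, the "q" and "limit" parameters and the
{"results": [{"id": ..., "name": ...}]} response shape are invented
for illustration, and it is written in Python only for brevity; the
real consumer would be JavaScript in the web UI.

```python
import json
from urllib.parse import urlencode


def person_search_url(base, name, limit=10):
    """Build a search URL for a hypothetical name-lookup endpoint."""
    return base + "?" + urlencode({"q": name, "limit": limit})


def identifiers(response_text):
    """Extract person identifiers (URIs) from a hypothetical JSON response."""
    payload = json.loads(response_text)
    return [hit["id"] for hit in payload.get("results", [])]


# Invented response body, as a remote service might return it.
example_response = json.dumps(
    {"results": [{"id": "http://example.org/person/1", "name": "A. N. Author"}]}
)
found = identifiers(example_response)
```

The identifiers that come back stay owned by whichever service holds
the person's record; the client stores only the URI, plus perhaps the
name as an optimisation.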

Cheers,
-w
-- 
William Waites                <mailto:ww at styx.org>
http://river.styx.org/ww/        <sip:ww at styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45