[openbiblio-dev] query to bibliographica.org

Jim Pitman pitman at stat.Berkeley.EDU
Tue Apr 5 19:12:36 UTC 2011


I see some close parallels with our experience in developing
http://people.bibkn.org/

1) Virtuoso/SPARQL/RDF did not perform well enough to provide adequate JSON
exports in response to typical queries
2) inadequate search performance
3) challenges with incompatible data schemas from different sources

We managed eventually to solve these problems, using solr for the search,
and BibJSON http://www.bibkn.org/bibjson/index.html to help bridge from
RDF and various ontologies (BIBO, FOAF ... ) to JSON. Unfortunately,
I don't understand the solution well enough to advise how to adapt it for
Bibliographica. The solution rests to a large extent on a Web Service Framework,
http://openstructs.org/structwsf, which may or may not be adaptable/portable to
the needs of Bibliographica. I hope something may have been learned from the
people.bibkn.org effort which may be transferred to Bibliographica. The
goals are very much the same, to have a number of datasets contributed by
different owners, and to provide a common RDF framework for these datasets with 
adequate JSON export. Anyway, the code and datastructures for people.bibkn.org are 
available for review, and you should be able to get some advice from Fred Giasson 
about these matters. I hope Bibliographica will be flexible enough to
include datasets like those in http://people.bibkn.org/ and to provide
comparable or better performance for data retrieval by webservice. How to get from
here to there? All I can suggest is that the Bibliographica developers might 
take a good look at the functionality achieved by people.bibkn.org,
consider how that functionality was achieved, and think about how to adapt/embed
that functionality in Bibliographica.

--Jim


William Waites <ww at styx.org> wrote:

> * [2011-04-05 18:13:06 +0200] Primavera De Filippi <primavera.defilippi at okfn.org> wrote:
>
> ] (1) Is it possible to directly query the database to only retrieve metadata
> ] concerning those works which fulfill the search criteria, i.e. author /
> ] title / date / country / etc.  If so, how can we do that?
> ] 
> ] (2) Is it possible to get access to the SPARQL wrapper at
> ] bibliographica.org/sparql ?   If so, how can we retrieve JSON output from
> ] there?
>
> The answer to these two is, yes and no. The public SPARQL endpoint is
> disabled due to problems with the Virtuoso server (you can read more
> about that at the page that /sparql redirects you to). It is intended
> to be publicly queriable.
>
> Even if it were, I suspect you wouldn't be satisfied with the format
> of the JSON output it gives, and rightly so. JSON output of RDF things
> is generally a bit hard to work with.
>
> ] (3) do you think you would benefit from a python application that would
> ] convert Bibliographica metadata into JSON format?
>
> Bibliographica does this directly, e.g.,
>
>     http://bnb.bibliographica.org/entry/GB8339761.json
>
> You can inspect the data more easily from
>
>     http://bnb.bibliographica.org/entry/GB8339761.n3
>
> (documentation by way of example)
>
> There are some examples of using this here:
>
>     http://openbiblio.net/2011/02/16/getting-bibliographica-content-via-jquery/
>     http://openbiblio.net/2011/02/16/open-bibliographic-data-and-dev8d/
>     http://openbiblio.net/2010/11/22/querying-the-british-national-bibliography/
>
> (the second and third don't actually work now due to the reasons above)
>
> Some different representations for the same data, produced when we were
> designing the data model (we chose the BIBO one, more or less, and the
> actual "schema" we use is almost exactly the BL one with some extra
> BIBO bits):
>
>     http://openbiblio.net/2010/09/10/bibliographic-models-in-rdf/
>
> You are correct that, now that the SPARQL endpoint is out of action,
> there is *no way* to query the database other than to use the web form.
> This is a big problem. So what I'll do is make a variant of
> http://bibliographica.org/search that will respond to the Accept header
> and give you back json. You will do the moral equivalent of
>
>     curl -H "Accept: text/javascript" "http://bibliographica.org/search?q=primavera"
>
> and it will give you back a simple list of URIs. For each URI you can
> then do,
>
>     curl -H "Accept: text/javascript" http://bnb.bibliographica.org/entry/GB8339761
>
> to get back the representation of the entity. The entity lookup works
> now; only the search part is absent.
>
> This is a coarse full-text search with no subtle notions of fields for
> author or title, but should do for a first cut. If we can get the 
> SPARQL endpoint opened up to the public again soon then for more
> sophisticated searches you could use that.
>
> In parallel with that I'll start some docs on the wiki - which will
> be linked from the main bibliographica.org site at the next update
> and will live here: https://bitbucket.org/okfn/bibliographica/wiki/Home
>
> Cheers,
> -w
> -- 
> William Waites                <mailto:ww at styx.org>
> http://river.styx.org/ww/        <sip:ww at styx.org>
> F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45
>
> _______________________________________________
> openbiblio-dev mailing list
> openbiblio-dev at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/openbiblio-dev
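
The two-step retrieval William describes above (search for a list of URIs,
then fetch each entity with an Accept header) could be sketched in Python
roughly as follows. This is only an illustration under assumptions: the
function names are hypothetical, the JSON search variant was planned but not
yet available when the message was written, and the search response is assumed
to be a plain JSON array of URI strings.

```python
import json
import urllib.parse
import urllib.request

# URL and Accept value are taken from the curl examples in the message.
SEARCH_URL = "http://bibliographica.org/search"
ACCEPT = "text/javascript"


def build_search_request(query):
    """Build the GET request for the (planned) JSON search variant."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": query})
    return urllib.request.Request(url, headers={"Accept": ACCEPT})


def build_entity_request(uri):
    """Build the GET request for one entity, asking for JSON."""
    return urllib.request.Request(uri, headers={"Accept": ACCEPT})


def parse_uri_list(body):
    """Parse a search response.

    Assumption: the body is a JSON array of URI strings ("a simple list
    of URIs"); anything that is not a string is skipped.
    """
    data = json.loads(body)
    return [item for item in data if isinstance(item, str)]


def fetch_text(request, timeout=10):
    """Perform the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def search_and_fetch(query):
    """Search, then fetch the JSON representation of every hit."""
    uris = parse_uri_list(fetch_text(build_search_request(query)))
    return {uri: json.loads(fetch_text(build_entity_request(uri)))
            for uri in uris}
```

Calling search_and_fetch("primavera") would need network access and a live
search service; the request-building and parsing helpers can be tried
without either.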



