[openbiblio-dev] BNB 'export' from bibliographica - how about a conversion instead?

William Waites ww at styx.org
Thu Feb 24 15:53:43 UTC 2011


Hi Ben, a slightly refactored version of your script is
here: https://bitbucket.org/okfn/jiscobib/src/3140ce2dc6b1/jiscobib/blstore.py
It includes my small changes to your output and also goes and shoves
it into the database, one graph per entry.

Doing it this way is perhaps not a bad idea. The only problem
is that, since it's really a text/XML munging script and not an RDF
thing, you won't get the graphs that contain the entries,
or at best you'll get one XML file per graph, which will be some 3
million or so small files... We could get the same effect by
just using the canned procedure from Virtuoso:
http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfdumpandreloadgraphs

An alternative approach is to slightly modify that canned
procedure so that it dumps more than one graph into a file. The
dump would lose graph information, which is why I've resisted this
approach, but given the delays and problems it might be better
to just do this...
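
To make the trade-off concrete, here is a minimal Python sketch of the
batching idea. It is not the Virtuoso procedure itself: the function
name batch_graphs, the max_triples parameter and the example.org sample
data are all hypothetical, and the graphs are assumed to be already in
memory as lists of N-Triples lines keyed by graph URI.

```python
from typing import Dict, Iterator, List


def batch_graphs(graphs: Dict[str, List[str]], max_triples: int) -> Iterator[List[str]]:
    """Yield lists of N-Triples lines, each list destined for one dump file.

    A graph is never split across files, but several small graphs
    may end up sharing one file.
    """
    batch: List[str] = []
    for graph_uri, triples in graphs.items():
        # Start a new file if adding this graph would overflow the current one.
        if batch and len(batch) + len(triples) > max_triples:
            yield batch
            batch = []
        # Only the triples are kept; graph_uri is dropped at this point,
        # which is exactly the loss of graph information described above.
        batch.extend(triples)
    if batch:
        yield batch


# Hypothetical sample data: three graphs with one entry each.
sample = {
    "http://example.org/graph/1": ['<http://example.org/e/1> <http://purl.org/dc/terms/title> "A" .'],
    "http://example.org/graph/2": ['<http://example.org/e/2> <http://purl.org/dc/terms/title> "B" .'],
    "http://example.org/graph/3": ['<http://example.org/e/3> <http://purl.org/dc/terms/title> "C" .'],
}
files = list(batch_graphs(sample, max_triples=2))
```

With a limit of two triples per file, the three sample graphs collapse
into two files instead of three, and nothing in the output says which
graph a triple came from.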

Cheers,
-w

* [2011-02-23 13:15:08 +0000] Ben O'Steen <bosteen at gmail.com> wrote:

] A while ago, I'd prepared a script to convert the BNB original data,
] into something usable, breaking up birth/death dates, giving hashed
] URIs to authors and so on. The output from this code does somewhat
] look like the RDF held in bnb.bibliographica, but the hashes are
] altered (changed into sameas uris with a less unique number to enable
] linkage across records based on just name and dates) and I recall
] something being done to the issue dates, ISBNs and also the removal of
] the legacy DC namespace, which was being used to house a more semantic
] date structure (as dcterms:issued had been unhelpfully constrained to
] only allow literals).
] 
] If I could understand what the changes were, we could adapt the script
] to do a conversion of the original data into a form as we might expect
] from an export of the live system in short order.
] 
] Ben
] 
] _______________________________________________
] openbiblio-dev mailing list
] openbiblio-dev at lists.okfn.org
] http://lists.okfn.org/mailman/listinfo/openbiblio-dev

-- 
William Waites                <mailto:ww at styx.org>
http://river.styx.org/ww/        <sip:ww at styx.org>
F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45



