[okfn-help] catalogue rdf redux...

Mon Jul 19 15:43:57 BST 2010

On 10-07-19 09:00, Rob Styles wrote:
> I don't understand quite what local data is being created and where. I
> have an rdflib directory now, from adding the ordf entries. I guess
> what I'm wondering is, if I loaded wordnet using the example commands,
> do I need to clear anything out before running for hmg.ini?
>   

The ordf.* and associated configuration lines in your .ini file
define a modular triplestore. The triplestore is accessed through
self.handler which is an instance of ordf.handler.Handler [0].
If you call its put(graph) method, it will save the graph to the
store. If you create a change context as in,

    ctx = self.handler.context("username", "i made a change")
    ctx.add(graph)
    ctx.add(graph2)
    ctx.commit()

it will save the graphs to the store along with a ChangeSet that
describes any changes relative to any previous versions stored
(similarly to what the Talis server does). I have just added a
command line switch, -n, to disable the changesets since you
probably don't want them for your current purposes.
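
To make the idea of a changeset concrete, here is a toy sketch in plain
Python. It is not how ordf implements ChangeSets; the ChangeSet tuple, the
diff_graphs helper and the ex:/dc: sample triples are all made up for
illustration. It only shows the general notion of recording what was
added and removed relative to a previous version.

```python
# Illustrative only: a toy model of a changeset, NOT ordf's implementation.
# ChangeSet, diff_graphs and the sample triples are hypothetical.
from typing import NamedTuple, Set, Tuple

Triple = Tuple[str, str, str]


class ChangeSet(NamedTuple):
    user: str                # who made the change
    reason: str              # free-text description of the change
    additions: Set[Triple]   # triples present only in the new version
    removals: Set[Triple]    # triples present only in the old version


def diff_graphs(old: Set[Triple], new: Set[Triple],
                user: str, reason: str) -> ChangeSet:
    """Record which triples were added and removed between two versions."""
    return ChangeSet(user, reason, additions=new - old, removals=old - new)


old = {("ex:book", "dc:title", "Old title"),
       ("ex:book", "dc:creator", "ex:alice")}
new = {("ex:book", "dc:title", "New title"),
       ("ex:book", "dc:creator", "ex:alice")}

cs = diff_graphs(old, new, "username", "i made a change")
```

Here cs.additions holds the triple with the new title and cs.removals the
one with the old title; the unchanged creator triple appears in neither.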

The data is saved in the directory called rdflib, which is a
Sleepycat database, because that is the back-end configured in
your config file.

Saving the lenses (is that what you mean by ordf entries?) is
done in the same way. The web interface uses the lenses for
displaying the data but you probably don't want to send them
off to the HMG Talis store.

> The second is about getting ntriples and nquads out. The command you
> noted to dump gives us n3 and there is a config line in myconfig.ini
> for that. Now that the output is being done by ORDF, is there a config
> setting for ntriples and nquads?
>   

As I've said, the best thing to do would be to make a back-end
class to talk directly to the Talis store rather than saving the
data locally. This would also help anyone that wanted to use
the Talis community storage with the ordf library and should
be relatively easy. Patches welcome!

The -o option is rather dumb and just serialises the entire
local store, and only works with the rdflib back-end (this is
why you might want to disable the changesets and not load
the lenses).

The serialisation is done by rdflib (ordf is built on top of that)
and you can give any serialisation that it supports either on
the command line (-f) or in the config file. The supported
values are, off the top of my head,

    xml, n3, nt, turtle, trix, trig

I could write an nquad serialiser for you if you wanted; it
would be pretty easy, but the HMG convention for graph
names means it mightn't help you terribly much. The best
thing to do would be to serialise it as nt and then run the
dump through a sed script:

    < dump.nt sed -e 's@ \.$@ <http://upload.data.gov.uk/....> .@'
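
If sed is not to hand, the same substitution can be sketched in Python
using only the standard library. This is a rough equivalent rather than a
real nquad serialiser: like the sed script, it assumes one statement per
line ending in " ." and leaves any other line alone. GRAPH_URI is a
made-up placeholder, since the real graph name is elided above, and
nt_to_nquads is a name chosen here for illustration.

```python
import re

# Hypothetical graph name; substitute the real one for your dataset.
GRAPH_URI = "http://example.org/graph"

# Matches the " ." that terminates an N-Triples statement.
_TERMINATOR = re.compile(r" \.$")


def nt_to_nquads(lines, graph_uri=GRAPH_URI):
    """Yield each line with the graph name inserted before the final dot.

    Lines that do not end in " ." (blank lines, comments) pass through
    unchanged, as they would with the sed script.
    """
    replacement = " <%s> ." % graph_uri
    for line in lines:
        line = line.rstrip("\n")
        # A function replacement avoids backslash handling in the URI.
        yield _TERMINATOR.sub(lambda match: replacement, line)


sample = [
    '<http://example.org/s> <http://example.org/p> "o" .\n',
    "# a comment line\n",
]
quads = list(nt_to_nquads(sample))
```

Any iterable of lines works as input, so an open file object for the nt
dump can be passed in directly and the results written out line by line.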

HTH,
-w

[0] http://packages.python.org/ordf/message_handling.html

-- 
William Waites           <william.waites at okfn.org>
Mob: +44 789 798 9965    Open Knowledge Foundation
Fax: +44 131 464 4948                Edinburgh, UK

RDF Indexing, Clustering and Inferencing in Python
		http://ordf.org/