[openbiblio-dev] Virtuoso versus 4store

Rufus Pollock rufus.pollock at okfn.org
Thu May 12 15:20:56 UTC 2011


There has been recent discussion with Will Waites about what we use as
our backend for openbiblio.

Rough summary (Will knows more so I am sure he can add):

1. We have used 4store and Virtuoso. Both have been quite painful to install.
2. We switched to Virtuoso as the default ~6 months ago.
3. We encountered a show-stopper bug in the Virtuoso Python bindings
3 months ago. This is still not resolved AFAIK.
4. This necessitated rewriting code to use the Virtuoso SPARQL interface.
The problem with this is that there is no way in Virtuoso to distinguish
GET (read) from UPDATE/DELETE ops in SPARQL. We therefore had to shut
down the SPARQL API.
5. Will experimented with a migration back to 4store a few weeks ago.
We started a production deployment 2 weeks ago but this was halted
because resource usage seemed very high (2x16GB store machines plus an
API machine plus a web app machine, compared to the previous 8GB
Virtuoso store + 1 machine for the web app). 4store does appear to
require a more complex production environment and to be more demanding
of resources.

Question: what do we do?

Secondary question: can we abstract the code so it doesn't care which
backend it is using?
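
One way this could look, as a rough sketch only: the application talks
to a small interface and each store gets an adapter behind it. All the
names here (TripleStore, InMemoryStore, BACKENDS, get_store, titles) are
made up for illustration and are not existing openbiblio code. The
in-memory class is just a stand-in so the example runs without a store.

```python
from abc import ABC, abstractmethod


class TripleStore(ABC):
    """Hypothetical minimal interface the application would code against."""

    @abstractmethod
    def add(self, s, p, o):
        """Store one (subject, predicate, object) triple."""

    @abstractmethod
    def triples(self, s=None, p=None, o=None):
        """Yield stored triples matching the pattern; None is a wildcard."""


class InMemoryStore(TripleStore):
    """Stand-in backend for tests. A Virtuoso or 4store adapter would
    implement the same two methods over its own client or HTTP API."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def triples(self, s=None, p=None, o=None):
        # Sorted only to make the iteration order deterministic.
        for t in sorted(self._triples):
            if all(want is None or want == got
                   for want, got in zip((s, p, o), t)):
                yield t


# Registry keyed by a config value, so switching stores is a settings change.
BACKENDS = {"memory": InMemoryStore}


def get_store(name):
    """Instantiate the backend registered under `name`."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError("unknown backend: %r" % (name,))


def titles(store):
    """Application code: depends only on the TripleStore interface."""
    return [o for _, _, o in store.triples(p="dc:title")]
```

With something like this, swapping stores would mean writing one more
adapter and registering it in BACKENDS. How well a thin interface like
this copes with store-specific behaviour is an open question.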

In my opinion we should be cautious about switching back to 4store:

 * Virtuoso is working
 * We now have extensive (and tested) documentation on installation
and deployment
 * We have experience of Virtuoso working ok.

That said, we don't currently have a SPARQL endpoint (if we could
somehow restrict write ops via SPARQL we'd be ok again ...). IMO this
isn't a huge deal *if* we have a working Solr instance and the APIs are
operational, but it would be interesting to know what others think
here.
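
For what it's worth, one possible way to restrict write ops would be to
filter queries in front of the store rather than inside it. A minimal
sketch, assuming a proxy of our own sits before the endpoint: skip the
PREFIX/BASE prologue and comments, then only let through requests whose
first keyword is one of the four standard read forms. The function name
is_read_only is hypothetical. This is a conservative whitelist heuristic,
not a SPARQL parser, and it knows nothing about store-specific
extensions (anything it does not recognise is rejected).

```python
import re

# Whitespace and '#' comments that may appear between prologue items.
_SKIP = re.compile(r'(?:\s+|#[^\n]*)*')
# Prologue declarations: PREFIX name: <iri> and BASE <iri>.
_PREFIX = re.compile(r'PREFIX\s+[^\s:]*:\s*<[^>]*>', re.IGNORECASE)
_BASE = re.compile(r'BASE\s+<[^>]*>', re.IGNORECASE)
_WORD = re.compile(r'[A-Za-z]+')

# The four standard SPARQL query forms; everything else is treated as a write.
READ_FORMS = {"SELECT", "ASK", "CONSTRUCT", "DESCRIBE"}


def is_read_only(query):
    """Return True only if the first keyword after the prologue is a read form."""
    pos = 0
    while True:
        pos = _SKIP.match(query, pos).end()
        decl = _PREFIX.match(query, pos) or _BASE.match(query, pos)
        if decl is None:
            break
        pos = decl.end()
    word = _WORD.match(query, pos)
    return word is not None and word.group().upper() in READ_FORMS
```

A proxy would call is_read_only() on the incoming query text and return
an error instead of forwarding when it is False. It should be checked
against whatever query dialect the store actually accepts before being
relied on.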

Rufus



