[okfn-help] Adding a new Index to the existing OKFN solr server for bibliographica - queries.

Rufus Pollock rufus.pollock at okfn.org
Tue Apr 26 14:25:22 BST 2011


On 26 April 2011 11:13, Ben O'Steen <bosteen at gmail.com> wrote:
>
> Hi,
>
> I understand that there is already a Solr instance somewhere with some
> capacity, and from various conversations I believe it has indexes for
> CKAN and some 5 or 6 other services.

Yes, that's right; it runs on eu4. Its (stub) info page is here:

<http://trac.okfn.org/wiki/SolrService>

We'll add more info as a result of this email :-)

> Rufus (cc'd) has asked me to add an index to this instance to support
> bibliographica, to hold its index of records - I have a few questions
> about this instance in this case!
>
> Index to add:
>
>  - 25 fields, some copies for faceting and combined search
>  - approx 3 million 'docs', with 200+ chars per doc in total
>  - needs to be accessible and updatable from the bibliographica.org
> pylons frontend IP address.

Fine.
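
For a rough sense of scale, the figures quoted above can be turned into a
lower bound on raw text volume. This is only a sketch: the one-byte-per-char
figure is an assumption (UTF-8 text with non-ASCII characters takes more), and
the on-disk index size will differ depending on which fields are stored,
indexed, or copied.

```python
# Back-of-envelope estimate of raw text volume for the proposed index.
DOCS = 3_000_000        # "approx 3 million 'docs'" from the request
CHARS_PER_DOC = 200     # "200+ chars per doc", so this is a lower bound
BYTES_PER_CHAR = 1      # assumption: mostly ASCII text


def raw_text_bytes(docs=DOCS, chars=CHARS_PER_DOC, bytes_per_char=BYTES_PER_CHAR):
    """Return the approximate number of bytes of raw text to be indexed."""
    return docs * chars * bytes_per_char


if __name__ == "__main__":
    total = raw_text_bytes()
    print(f"{total / 1e6:.0f} MB of raw text (lower bound)")
```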

> Questions: (VM -> *Java* VM from this point on)
>
>  - Is there capacity (RAM/Heap/etc) for this sort of index on the
> existing solr VM?

It is currently 8GB. We've been running substantial indexes including
for wdmmg/openspending for a while so this *should* be fine.

>  - Single VM (multicore), multiple VM (tomcat-adminned), or combination?
>  - If multicore:
>        - How is the instance set up for multicore? using SolrCore admin
> servlet (ie HTTP API for admin)? or is it using a file directory layout
> system?

@Friedrich: can you answer this?
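
For reference, if the instance does turn out to expose the CoreAdmin HTTP API,
requests to it are plain URLs with an `action` parameter. The sketch below only
builds such URLs; the base URL and core name are hypothetical placeholders, and
the `/admin/cores` path is the usual default rather than anything confirmed
for eu4.

```python
from urllib.parse import urlencode


def core_admin_url(base_url, action, **params):
    """Build a Solr CoreAdmin URL for the given action (e.g. STATUS, CREATE)."""
    query = urlencode({"action": action, **params})
    return f"{base_url.rstrip('/')}/admin/cores?{query}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # hypothetical; not the real eu4 address
    # List the cores that are currently loaded.
    print(core_admin_url(base, "STATUS"))
    # Register a new core; the name and instanceDir are placeholders.
    print(core_admin_url(base, "CREATE",
                         name="bibliographica",
                         instanceDir="bibliographica"))
```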

>  - Which public facing services are supported by this instance and more
> importantly, who would be the people to let know that a large index will
> be added to this?

Nils and Friedrich (in cc). You can use sysadmin at okfn.org as your
direct line to sysadminners if you need it.

>         - ie who should be awake and aware while the index is being made, in
> case it hits Heap/Stack/other OOM errors and an index corruption happens
> elsewhere in the solr instance VM?
>  - which server is running this solr instance and what standard backup
> and maintenance routines are in place for it? eg critical files, /etc
> and indexes rsync'd to remote machine, etc.

At the moment we don't back up the indexes because they can all get
recreated from the source data (albeit a little slowly). If you think
rsync'ing the indexes would be useful we could introduce that.
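
If we did introduce it, the rsync step might look roughly like the sketch
below. Both paths are hypothetical placeholders, and copying an index while it
is being written to may give an inconsistent snapshot, so this would probably
need to run while no updates are in progress.

```python
import shlex


def rsync_command(index_dir, remote_target):
    """Build an rsync command that mirrors an index directory to a remote host."""
    # A trailing slash makes rsync copy the directory's contents, not the
    # directory itself.
    src = index_dir.rstrip("/") + "/"
    # -a preserves permissions and timestamps; --delete removes files on the
    # target that no longer exist in the source.
    return ["rsync", "-a", "--delete", src, remote_target]


if __name__ == "__main__":
    # Both paths are placeholders, not the real layout on eu4.
    cmd = rsync_command("/path/to/solr/data", "backup-host:/path/to/backups/solr/")
    print(shlex.join(cmd))
```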

Rufus


