[ckan-dev] Elastic search datastore

Ross Jones ross at servercode.co.uk
Tue Oct 27 14:11:49 UTC 2015


Hi,

> On 27 Oct 2015, at 11:23, Ben Scott <ben at benscott.co.uk> wrote:
> Thanks very much for your reply! Yep, that’s it, and I also found this old datastore client repo which uses ES: https://github.com/okfn/datastore-client. That does explain why it was removed - I completely agree using ES/Solr as a database isn’t good. We don’t use the datastore's write/delete capabilities on our portal, it’s read-only with data held in another file or database. So essentially we use the datastore as an index of other resources, but that’s probably not such a common use case.
> It was the “well-indexed” bit which we struggled with :) As we wanted to allow per-field filtering, the UX was slow unless we added an index to every field - and when we did that on a table with 150 columns x 3m rows the DB wasn’t happy!  Using Solr for filtering was much more performant - but yep, replacing the datastore altogether isn’t the way to go, we’ll need to think a bit more about our approach here.

I think it depends on how you want to allow people to access it. Perhaps you could break the 150 columns into several tables, and make it easy to join? Can you enforce LIMITs (a quick EXPLAIN on the query will tell you if there is already a limit or not)? Is it *really* too big to all just go in RAM?
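
To make the split-and-join idea a bit more concrete, here is a rough sketch rather than anything the datastore does itself. It assumes SQLite (from the Python standard library) as a stand-in for the real PostgreSQL database, and the table names, column names and the MAX_ROWS cap are all made up for illustration. Only the column that gets filtered on is indexed, and every query is capped.

```python
import sqlite3

MAX_ROWS = 100  # hypothetical hard cap on rows returned per request


def build_db():
    """Create two narrow tables sharing a record id, instead of one wide table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE records_core (
            id INTEGER PRIMARY KEY,
            category TEXT,
            country TEXT
        );
        CREATE TABLE records_extra (
            id INTEGER PRIMARY KEY REFERENCES records_core(id),
            note TEXT
        );
        -- Index only the column people actually filter on.
        CREATE INDEX idx_core_category ON records_core (category);
        """
    )
    core = [(i, "cat%d" % (i % 5), "GB") for i in range(1, 1001)]
    extra = [(i, "note%d" % i) for i in range(1, 1001)]
    conn.executemany("INSERT INTO records_core VALUES (?, ?, ?)", core)
    conn.executemany("INSERT INTO records_extra VALUES (?, ?)", extra)
    return conn


def filtered_rows(conn, category, limit=MAX_ROWS):
    """Filter on the indexed column, join the second table, and cap the result."""
    # Clamp whatever the caller asks for into the range 1..MAX_ROWS.
    limit = max(1, min(int(limit), MAX_ROWS))
    sql = (
        "SELECT c.id, c.category, c.country, e.note "
        "FROM records_core AS c "
        "JOIN records_extra AS e ON e.id = c.id "
        "WHERE c.category = ? "
        "ORDER BY c.id "
        "LIMIT ?"
    )
    return conn.execute(sql, (category, limit)).fetchall()
```

Calling filtered_rows(build_db(), "cat1", limit=5000) returns at most MAX_ROWS rows however large the requested limit is. On PostgreSQL, running EXPLAIN on the equivalent statement should show a Limit node in the plan when a limit is being applied.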

Do you have an example file that you're looking to use that I could have a peek at, just out of general interest/nosiness?  

Cheers

Ross





