[ckan-discuss] Is CKAN suitable for textual search in a 10Gb dataset?

Hanssens Bart Bart.Hanssens at fedict.be
Tue Mar 25 16:12:20 UTC 2014


Perhaps The DataTank is what you are looking for? http://docs.thedatatank.com/
Pieter knows much more about this open source tool :-)

If not, 10 GB is pretty small by today's standards. Depending on what your requirements are,
even a simple command-line tool like grep could do the trick.
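
To give a rough idea of the grep-style approach, here is a minimal Python sketch that streams a large text file line by line and yields the matching lines. The function name `search_lines` and the file name in the usage note are made up for illustration, and it assumes the data is stored as UTF-8 text with one record per line.

```python
import re
from typing import Iterator, Tuple


def search_lines(path: str, pattern: str,
                 ignore_case: bool = True) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, line) for every line matching the regex."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(pattern, flags)
    # Iterating over the file object reads one line at a time, so memory
    # use stays small even when the file is several gigabytes.
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                yield number, line.rstrip("\n")
```

Something like `for n, line in search_lines("dump.txt", "some text"): print(n, line)` would then behave roughly like `grep -n -i "some text" dump.txt`. Every query scans the whole file, so this only fits if searches are occasional; for an interactive search form an indexed store is probably the better choice.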

Or you could use Jena + PostgreSQL (http://www.w3.org/wiki/LargeTripleStores#Jena_with_PostgreSQL_.28200M.29)

Or MongoDB (https://github.com/talis/tripod-php)

Best regards


From: ckan-discuss [ckan-discuss-bounces at lists.okfn.org] On Behalf Of Andrés Martano [andres at inventati.org]
Sent: Tuesday, March 25, 2014 3:18 PM
To: Pieter Colpaert; ckan-discuss at lists.okfn.org
Subject: Re: [ckan-discuss] Is CKAN suitable for textual search in a 10Gb dataset?

Hi Pieter,

Thanks for the answer.

I don't necessarily need SPARQL. The search can be a simple one, like
you can see in many sites: a form to fill with the text you search and a
few other fields to help filter the results (by date and some predefined

I only need to be able to export the data in RDF, but the database
itself can be in any format (as long as it's good for textual search).

You say that CKAN only takes care of the meta-data of datasets? It's not
suitable for searching the data inside them, nor for visualizing/exporting
parts of the data itself?
Then maybe it's not the right tool for this case... That's sad, I was
hoping to use it.
Do you know another open source project that would fit better?