[ckan-dev] road map for horizontal scaling?

Fawcett, David (MNIT) David.Fawcett at state.mn.us
Thu Oct 6 01:07:29 UTC 2016


RMX,

Our US state is running CKAN on Postgres.  We currently have about 600 datasets, and we are not anywhere close to being limited by the database.

data.gov has about 190,000 datasets and performs fine.

David.
________________________________
From: ckan-dev [ckan-dev-bounces at lists.okfn.org] on behalf of Ruima E. [ruimaximo at gmail.com]
Sent: Wednesday, October 05, 2016 2:40 PM
To: CKAN Development Discussions
Subject: Re: [ckan-dev] road map for horizontal scaling?

Thank you Tim!
I am asking these questions because I am considering installing CKAN as a data hub for a city. It seems a very promising idea, but I am concerned that if the number of datasets grows and we need to distribute it across several machines, PostgreSQL might become a bottleneck and a headache.
When I think about scale I have in mind the example of Hadoop. If tomorrow the datasets cannot fit on one machine, you just add one more node, edit a few text files, and it works seamlessly. I am afraid that with PostgreSQL that is not the case, or am I wrong?

Best regards,
RMX


On Wed, Oct 5, 2016 at 8:52 PM, Timothy Giles <timothy.giles at slu.se> wrote:

Hi RMX.


I wonder if you can give a concrete example of what you mean by scale? Since this is a dev forum/mailing list, I think it would be helpful to quantify your issue(s) / concern(s). There are instances of CKAN with hundreds of thousands and millions of datasets, as well as individual datasets that are extremely large (100s of GB).


MvH Tim




________________________________
From: ckan-dev <ckan-dev-bounces at lists.okfn.org> on behalf of Ruima E. <ruimaximo at gmail.com>
Sent: 05 October 2016 02:40 PM
To: ckan-dev at lists.okfn.org
Subject: [ckan-dev] road map for horizontal scaling?

Hi,

At the moment CKAN relies on PostgreSQL as a data store. I was shocked when I found that such a nice project relies on a data store that is not suitable to scale. Open data in smart cities is expected to be Big Data and is expected to scale, so this could jeopardize the success of the whole initiative in the near future.

Is scaling by using open source technologies part of the road map for CKAN?

Thank you,
RMX


