[ckan-discuss] question on datastore, filestore and data/metadata storage in CKAN

Elena Camossi elena.camossi at ext.jrc.ec.europa.eu
Wed Nov 13 08:38:43 UTC 2013


Thanks a lot Stéphane for the very detailed explanation!

Kind regards,

-Elena 

 

 

From: Stéphane Guidoin [mailto:stephane at opennorth.ca] 
Sent: Tuesday, 12 November 2013 23:22
To: Elena Camossi
Cc: CKAN discuss
Subject: Re: [ckan-discuss] question on datastore, filestore and
data/metadata storage in CKAN

 

Hi Elena,

 

Overall structure:

- DB (postgres) is mainly used to store metadata (dataset, resources) and
the overall structure of CKAN (organizations, groups, users, history of
modifications, etc.)

 

- The "default" behaviour of CKAN is to point to the resource wherever it is
hosted (thus, a catalog).

 

- Most implementations use the "Filestore" to store some data locally. When
you create a dataset, if the filestore is enabled, you can upload a file.
You can also push files via the CKAN API.

 

- Harvesters use the default behaviour: collect the metadata and leave the
data wherever it is. As someone said recently, the archiver extension could
be used to download the data, but I have not tried it yet.

 

- If you activate the datastore, the DB (postgres) can be used to store
atomized data (usually it will be in a different DB than the main
database, and it could even be on a different server).

 

- SOLR lives its own life.
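
As a rough illustration of the points above, here is a minimal Python sketch
of how the "catalog" behaviour and the datastore differ at the API level. It
only builds the HTTP requests and does not send them. The instance URL, the
API key, the dataset and resource identifiers, and the helper function names
are all hypothetical; the action names (resource_create, datastore_create)
are the ones exposed by the CKAN 2.x action API, but check the documentation
of your CKAN version before relying on the exact parameters.

```python
import json
import urllib.request

CKAN_URL = "http://ckan.example.org"  # hypothetical CKAN instance
API_KEY = "my-api-key"                # hypothetical API key


def action_request(action, payload, ckan_url=CKAN_URL, api_key=API_KEY):
    """Build (but do not send) a POST request for a CKAN action API call."""
    return urllib.request.Request(
        url=f"{ckan_url}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


def link_remote_resource(dataset_id, remote_url, name):
    """Catalog behaviour: CKAN stores only metadata, the data stays remote."""
    payload = {"package_id": dataset_id, "url": remote_url, "name": name}
    return action_request("resource_create", payload)


def push_rows_to_datastore(resource_id, fields, records):
    """Datastore behaviour: the rows themselves are stored in postgres."""
    payload = {"resource_id": resource_id, "fields": fields, "records": records}
    return action_request("datastore_create", payload)


# Metadata only: the CSV file remains on the publisher's server.
link_req = link_remote_resource(
    "my-dataset", "http://data.example.org/stops.csv", "Bus stops"
)

# Atomized data: each record becomes a row in the datastore database.
store_req = push_rows_to_datastore(
    "my-resource-id",
    fields=[{"id": "stop", "type": "text"}, {"id": "riders", "type": "int"}],
    records=[{"stop": "Main St", "riders": 120}],
)
```

To actually perform a call you would pass one of these requests to
urllib.request.urlopen(). Uploading a file to the filestore goes through a
multipart form upload instead of a JSON body, so it is left out of this
sketch.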

 

Hope it answers your questions.

 

 

On Tue, Nov 12, 2013 at 11:14 AM, Elena Camossi
<elena.camossi at ext.jrc.ec.europa.eu> wrote:

Hi everyone,

I have a basic question that will probably sound silly, but I'm getting more
and more confused about how CKAN physically organizes datasets...

The question is: what exactly is the data/metadata storage model CKAN uses?

To be more clear, are data from datasets always stored with metadata, or
are just the metadata stored locally in the CKAN instance, while the dataset
can remain stored somewhere else? (I'm thinking of the case of a harvested
dataset, not of a dataset which is inserted from scratch.)
What is actually stored in the backend postgres database? Just metadata,
with the data eventually going to the file system or remaining remote? Does
SOLR, in this architecture, index both data and metadata?

Finally, what is the exact function of the filestore? Is it used to store
the data locally? Or just the metadata?
Does the datastore duplicate data/metadata already included in the
filestore?

Thanks a lot for shedding some light on this...

Cheers,
-Elena






_______________________________________________
ckan-discuss mailing list
ckan-discuss at lists.okfn.org
http://lists.okfn.org/mailman/listinfo/ckan-discuss
Unsubscribe: http://lists.okfn.org/mailman/options/ckan-discuss





 

-- 

Stéphane Guidoin

Director, Transportation
Open North

514-862-0084

http://opennorth.ca

Twitter: @opennorth / @hoedic
