[ECODP-dev] release 00.09.00 improvements

Darwin Peltan darwin.peltan at okfn.org
Fri Jul 12 08:08:33 UTC 2013

Hi Bert,

Thanks for confirming that the ingestion performance problem wasn't due to
CKAN or any of the code we provided for the last release. I'm glad we were
able to help you identify the error in your script.

Just to clarify: this bug meant that up to 20k resources were being created
each time a new dataset was created. This obviously slowed the import
significantly, as each dataset create call also asked for that many
resources to be created (by the end of the import the database contained 20
million resources).

Whilst this bug caused the import to get progressively slower (as the
number of resources to be added to each new dataset increased), CKAN coped
very well with this unusual situation.



---------- Forwarded message ----------
From: *Bert Van Nuffelen*
Date: Thursday, July 11, 2013
Subject: [ECODP-dev] release 00.09.00 improvements
To: "ZAJAC Agnieszka (OP)" <Agnieszka.ZAJAC at publications.europa.eu>, "HOHN
Norbert (OP)" <Norbert.HOHN at publications.europa.eu>, "MEYER André (OP-EXT)"
<Andre.MEYER at ext.publications.europa.eu>, "ISOARD Olivier (OP)" <
Olivier.ISOARD at publications.europa.eu>
Cc: Project list for EC ODP CKAN project <ecodp-dev at lists.okfn.org>


In response to the problems reported on Monday 8 July 2013 and earlier for
release 00.09.00, we have uploaded the following improved software packages
to the svn trunk:

* ecodp-cubeviz-2-1.noarch.rpm
    - remove dependency on ecodp-virtuoso-odbc-driver package
* ecodp-httpd-template-2-1.x86_64.rpm
    - add dependency on the mod_wsgi apache module
* ecodp-ckan-support-2-1.noarch.rpm
    - updated the download_*rdf scripts to put the log & temp files in
    - indicate backup_restore.conf as a config file for the rpm
* ecodp-rdf2ckan-2-1.noarch.rpm
    - fix a bug where the resources of dataset N were added to dataset N+1
      on commit to CKAN.
      This turns the behavior of the portal to the situation of release

      The performance problem experienced in the data tab has to do with
      the number of associated resources per dataset. If this number becomes
      large, noticeable performance degradation is experienced. In this
      sense, the bug in RDF2CKAN acted as an unintended stress test for this
      case.
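
The RDF2CKAN source is not shown in this thread, so the following is only a
hypothetical Python sketch of this class of bug. It assumes the carry-over
came from a buffer of pending resources that was not cleared between
datasets; the function names and the sample data are made up for
illustration and are not taken from the real code.

```python
def import_buggy(datasets):
    """Carry-over bug: the pending buffer is never cleared between datasets."""
    pending = []      # hypothetical buffer of resources awaiting commit
    committed = []
    for name, resources in datasets:
        pending.extend(resources)
        # The commit for this dataset also carries every earlier resource.
        committed.append((name, list(pending)))
    return committed


def import_fixed(datasets):
    """Fixed variant: each dataset starts with a fresh buffer."""
    committed = []
    for name, resources in datasets:
        pending = list(resources)
        committed.append((name, pending))
    return committed


# Made-up sample input: (dataset name, its own resources).
datasets = [("ds1", ["a"]), ("ds2", ["b", "c"]), ("ds3", ["d"])]

buggy_counts = [len(res) for _, res in import_buggy(datasets)]
fixed_counts = [len(res) for _, res in import_fixed(datasets)]

print(buggy_counts)  # resources per dataset grow with each commit
print(fixed_counts)  # each dataset only gets its own resources
```

Under this assumption the number of resources sent with each create call
keeps growing over the course of the import, which would match an import
that gets progressively slower.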

The packages ecodp-ckan-support and ecodp-rdf2ckan can be deployed over an
already configured installation by installing the new rpm.

best regards,


Bert Van Nuffelen

Semantic Technologies Software Architect at TenForce

Bert.Van.Nuffelen at tenforce.com
Office: +32 (0)16 31 48 60
Mobile:+32 479 06 24 26
skype: bert.van.nuffelen