[ECODP-dev] Production database size

John Glover john.glover at okfn.org
Mon Sep 30 09:38:02 UTC 2013


Hi Bert,

a) No, ckanext-archiver should not be anywhere on the system; it is not
needed. Just having it installed shouldn't cause any problems, but if it is
in the ckan.plugins list then it will continue to generate data.

b) In that exact folder you have a datastorer folder? That sounds like
there may be a problem with the build process. That directory should look
like this: https://github.com/okfn/ckan/tree/release-v1.8.1-ecportal

c) This list seems correct for release 01.00.00 (but ecportal_homepage was
not available in release 09).

No, archiver and datastore are not active in release 01.00.00.
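
To double-check this on a deployed instance, the ckan.plugins value can be
compared against the names of the three old extensions. The following is only
a minimal sketch: it assumes the setting sits in an [app:main] section of the
ini file (the usual layout for a CKAN config), and the function name and the
sample text are illustrative, not part of ckanext-ecportal.

```python
import configparser

# Plugin names of the old extensions that should no longer be loaded.
OLD_PLUGINS = {"archiver", "qa", "datastorer"}


def active_old_plugins(ini_text, section="app:main"):
    """Return the old plugin names still listed in ckan.plugins, sorted."""
    # interpolation=None: CKAN ini files may contain %(here)s placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    plugins = parser.get(section, "ckan.plugins", fallback="").split()
    return sorted(OLD_PLUGINS.intersection(plugins))


# Sample config (assumed layout) using the plugin list quoted below.
SAMPLE = """
[app:main]
ckan.plugins = synchronous_search ecportal ecportal_form
    ecportal_publisher_form ecportal_controller ecportal_multilingual_dataset
    ecportal_homepage multilingual_group multilingual_tag
"""

print(active_old_plugins(SAMPLE))  # an empty list means none are active
```

If the returned list is not empty, the named plugins are still being loaded
and the task tables will keep growing.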

Regards,
John


On 30 September 2013 11:21, Bert Van Nuffelen <
bert.van.nuffelen at tenforce.com> wrote:

> Hi John,
>
> some further clarification is required I believe.
>
> *) The dump could be from a release 00.08.0x, in which these plugins are
> active.
> *) If it is from a release 00.09.0x version, then we need some clarification
> about CKAN deployment/management.
>
>    Let's look at our release 01.00.00 (whose setup is close to 00.09.00):
>      a) it contains the $USERHOME/ckan/ecportal/src/ckanext-archiver
> directory
>      b) inside the $USERHOME/ckan/ecportal/src/ckan directory there is a
> datastorer directory
>      c) the ecportal.ini does not contain a reference to datastorer or
> archiver in the following parameter:
>             ckan.plugins = synchronous_search ecportal ecportal_form
> ecportal_publisher_form ecportal_controller ecportal_multilingual_dataset
> ecportal_homepage multilingual_group multilingual_tag
>
>
>    So
>        * is archiver active in release 01.00.00?
>        * is datastore active in release 01.00.00?
>
> kind regards,
>
> Bert
>
>
>
> 2013/9/26 John Glover <john.glover at okfn.org>
>
>> Hi Bert,
>>
>> No, purge_revision_history and purge_package_extra_revision (spelled with
>> underscores when accessed through the API) are two separate calls. The
>> former purges old data from the resource_revision database table, and the
>> latter removes old data from package_extra_revision. Both tables store old
>> versions of data relating to datasets, and so both should be run regularly
>> due to the large number of daily updates.
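
Since the two calls are independent, a scheduled job would need to issue
both. The following is a rough sketch of how the two requests could be built,
using the base URL from the api.purge setting quoted below. That the actions
take an empty JSON body and an API key in the Authorization header is an
assumption based on how the CKAN action API is normally called; the function
name and the "<api-key>" placeholder are illustrative.

```python
import json
import urllib.request

# Base URL taken from the api.purge setting quoted below in the thread.
ACTION_BASE = "http://localhost:8008/data/api/action/"

PURGE_ACTIONS = ("purge_revision_history", "purge_package_extra_revision")


def build_purge_request(action, api_key):
    """Build (but do not send) a POST request for one purge action."""
    if action not in PURGE_ACTIONS:
        raise ValueError("unknown purge action: %s" % action)
    return urllib.request.Request(
        ACTION_BASE + action,
        data=json.dumps({}).encode("utf-8"),  # assumed: no parameters needed
        headers={
            "Content-Type": "application/json",
            "Authorization": api_key,  # assumed: a sysadmin API key
        },
        method="POST",
    )


# Both calls are needed; one does not imply the other.
requests_to_send = [build_purge_request(a, "<api-key>") for a in PURGE_ACTIONS]
for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```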
>>
>> Regards,
>> John
>>
>>
>> On 26 September 2013 17:06, Bert Van Nuffelen <
>> bert.van.nuffelen at tenforce.com> wrote:
>>
>>> Hi John,
>>>
>>> thanks for the investigation.
>>>
>>> Is the api call
>>>
>>> api.purge=http://localhost:8008/data/api/action/purge_revision_history
>>>
>>> equal to the purge-package-extra-revision paster command?
>>>
>>> kind regards,
>>>
>>> Bert
>>>
>>>
>>> 2013/9/26 John Glover <john.glover at okfn.org>
>>>
>>>> Hi Bert,
>>>>
>>>> I have had a look at the PO database dump. The sizes of the largest
>>>> tables are as follows:
>>>>
>>>> public.kombu_message                           | 15 GB
>>>> pg_toast.pg_toast_78586                        | 3662 MB
>>>> public.task_status                             | 1548 MB
>>>> public.package_extra_revision                  | 811 MB
>>>> public.task_status_entity_id_task_type_key_key | 670 MB
>>>> public.task_status_pkey                        | 458 MB
>>>>
>>>> Apart from package_extra_revision, the other large tables all refer to
>>>> tables used by the old CKAN extensions (such as ckanext-archiver,
>>>> ckanext-datastorer and ckanext-qa). These should not be installed any more.
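
To put those numbers in proportion, here is a small worked calculation over
the six entries listed above. It assumes the sizes use 1024-based units (as
PostgreSQL's size formatting does) and, since the reported values are
rounded, the result is approximate.

```python
# Sizes exactly as listed above; values are rounded, so totals are approximate.
SIZES = {
    "public.kombu_message": "15 GB",
    "pg_toast.pg_toast_78586": "3662 MB",
    "public.task_status": "1548 MB",
    "public.package_extra_revision": "811 MB",
    "public.task_status_entity_id_task_type_key_key": "670 MB",
    "public.task_status_pkey": "458 MB",
}

UNITS_IN_MB = {"MB": 1, "GB": 1024}  # assumed: 1024-based units


def to_mb(size):
    """Convert a string such as '15 GB' to a whole number of megabytes."""
    value, unit = size.split()
    return int(value) * UNITS_IN_MB[unit]


total = sum(to_mb(s) for s in SIZES.values())
kept = to_mb(SIZES["public.package_extra_revision"])
old_extensions = total - kept  # everything except package_extra_revision

print(old_extensions, total, round(100.0 * old_extensions / total, 1))
# prints: 21698 22509 96.4
```

So roughly 96% of the space taken by these six entries belongs to data from
the old extensions.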
>>>>
>>>> In the previous release, we supplied a paster command to purge the
>>>> task_status and kombu_message tables:
>>>>
>>>> paster --plugin=ckanext-ecportal ecportal purge-task-data -c <config>
>>>>
>>>> Running this should get rid of most of this unnecessary data (although
>>>> you will probably need to do a VACUUM in postgres afterwards to reclaim space).
>>>> This should only have to be run once if the extensions have been correctly
>>>> uninstalled. If these tables continue to grow, then at least one of the
>>>> extensions is still installed. So, make sure that none of the following
>>>> appear in the ckan.plugins config list: archiver, qa, datastorer.
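
The one-off cleanup could be scripted roughly as follows. This is a sketch
that only assembles the commands: the paster line is the one given above,
while running VACUUM through psql, the config path and the database name
"ecodp" are assumptions for illustration. Note that a plain VACUUM makes the
freed space reusable inside the table but generally does not shrink the files
on disk; VACUUM FULL does return space to the operating system, at the cost
of an exclusive lock on the table while it runs.

```python
import shlex


def cleanup_commands(config_path, database):
    """Return the shell commands for the one-off cleanup, in order."""
    paster = [
        "paster", "--plugin=ckanext-ecportal", "ecportal",
        "purge-task-data", "-c", config_path,
    ]
    # Vacuum the two tables that the purge command empties.
    vacuums = [
        ["psql", "-d", database, "-c", "VACUUM ANALYZE %s;" % table]
        for table in ("task_status", "kombu_message")
    ]
    return [paster] + vacuums


# Hypothetical config path and database name.
for cmd in cleanup_commands("/path/to/ecportal.ini", "ecodp"):
    print(" ".join(shlex.quote(part) for part in cmd))
```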
>>>>
>>>> From looking at the timestamps of the task_status entries, it would
>>>> seem at least 1 of these extensions is still installed, as the last write
>>>> was by the datastorer on the 4th September.
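
The check described above amounts to asking for the most recent write per
task type. The sketch below shows the shape of that query against an
in-memory SQLite table so that it runs anywhere; the column names task_type
and last_updated are assumed from CKAN's task_status model, and the rows are
made up for illustration (only the 4th September datastorer write echoes the
real dump). On the real system the same SELECT would be run in PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_status (task_type TEXT, last_updated TEXT)")
conn.executemany(
    "INSERT INTO task_status VALUES (?, ?)",
    [  # made-up rows for illustration only
        ("archiver", "2013-06-11 02:10:00"),
        ("qa", "2013-06-11 02:15:00"),
        ("datastorer", "2013-08-30 23:40:00"),
        ("datastorer", "2013-09-04 01:05:00"),
    ],
)

# Latest write per task type, most recent first.
LAST_WRITE_SQL = """
    SELECT task_type, MAX(last_updated) AS last_write
    FROM task_status
    GROUP BY task_type
    ORDER BY last_write DESC
"""
rows = conn.execute(LAST_WRITE_SQL).fetchall()
for task_type, last_write in rows:
    print(task_type, last_write)
```

A task type whose latest write is newer than the date the extensions were
supposedly removed points to a plugin that is still active.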
>>>>
>>>> The package_extra_revision table is also quite large, but this is expected
>>>> due to the daily updating of all Eurostat packages. This should be
>>>> regularly cleared with the purge-package-extra-revision paster command as
>>>> in previous releases.
>>>>
>>>> Regards,
>>>> John
>>>>
>>>> _______________________________________________
>>>> Ecodp-dev mailing list
>>>> Ecodp-dev at lists.okfn.org
>>>> http://lists.okfn.org/mailman/listinfo/ecodp-dev
>>>>
>>>>
>>>
>>>
>>> --
>>> Bert Van Nuffelen
>>>
>>> Semantic Technologies Software Architect at TenForce
>>> www.tenforce.be
>>>
>>> Bert.Van.Nuffelen at tenforce.com
>>> Office: +32 (0)16 31 48 60
>>> Mobile:+32 479 06 24 26
>>> skype: bert.van.nuffelen
>>>
>
>