[ECODP-dev] Production database size

John Glover john.glover at okfn.org
Mon Sep 30 10:04:15 UTC 2013


Hi Bert,

ckanext-archiver should not be there.

datastore is part of CKAN core so that is fine (it is disabled in the
config), and it is distinct from ckanext-datastorer (bad naming on our
part).
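
Because the names differ by a single letter, an exact-name check on the ckan.plugins value may be safer than searching for the substring "datastore". What follows is only a minimal sketch, assuming the ini file keeps ckan.plugins under an [app:main] section; the function names and the sample text are hypothetical and not part of CKAN.

```python
import configparser

# Plugin names that, per this thread, should no longer appear in ckan.plugins.
UNWANTED_PLUGINS = {"archiver", "qa", "datastorer"}


def enabled_plugins(ini_text, section="app:main"):
    """Return the plugin names listed under ckan.plugins in the ini text.

    The section name is an assumption; adjust it to match the real file.
    """
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    # The value may wrap over several indented lines; split() handles that.
    return parser.get(section, "ckan.plugins", fallback="").split()


def unwanted_enabled(ini_text):
    """Return the unwanted plugins that are enabled, matched by exact name."""
    return sorted(UNWANTED_PLUGINS.intersection(enabled_plugins(ini_text)))


# Hypothetical sample: "datastore" (CKAN core) must not be reported,
# only the separate "datastorer" extension would be.
SAMPLE = """
[app:main]
ckan.plugins = synchronous_search ecportal ecportal_form
    ecportal_publisher_form datastore
"""

print(unwanted_enabled(SAMPLE))
```

Matching whole names rather than substrings is the point here: "datastore" passes, while "datastorer" would be flagged.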

Regards,
John


On 30 September 2013 11:59, Bert Van Nuffelen <
bert.van.nuffelen at tenforce.com> wrote:

> Hi John,
>
> for b)
>
> -bash-4.1$ ls
> ckan  ckanext-archiver  ckanext-ecportal  ckanext-ecportal-release-v1.8.1
> INSTALLED  vdm
> -bash-4.1$ find . -name "datastore"
> ./ckan/ckanext/datastore
> -bash-4.1$ pwd
> /applications/ecodp/users/ecodp/ckan/ecportal/src
>
>
> Bert
>
>
> 2013/9/30 John Glover <john.glover at okfn.org>
>
>> Hi Bert,
>>
>> a) No, ckanext-archiver should not be anywhere on the system; it is not
>> needed. Just having it installed shouldn't cause any problems, but if it is
>> in the ckan.plugins list then it will continue to generate data.
>>
>> b) In that exact folder you have a datastorer folder? That sounds like
>> there may be a problem with the build process; that directory should look
>> like this: https://github.com/okfn/ckan/tree/release-v1.8.1-ecportal
>>
>> c) This list seems correct for release 01.00.00 (but ecportal_homepage
>> was not available in release 09).
>>
>> No, archiver and datastore are not active in release 01.00.00.
>>
>> Regards,
>> John
>>
>>
>> On 30 September 2013 11:21, Bert Van Nuffelen <
>> bert.van.nuffelen at tenforce.com> wrote:
>>
>>> Hi John,
>>>
>>> some further clarification is required I believe.
>>>
>>> *) the dump could be from a release 00.08.0x, in which these plugins are
>>> active.
>>> *) If it is from a release 00.09.0x version, then we need some
>>> clarification on CKAN deployment/management.
>>>
>>>    Let's look at our release 01.00.00 (whose setup is close to 00.09.00)
>>>      a) it contains the $USERHOME/ckan/ecportal/src/ckanext-archiver
>>> directory
>>>      b) inside the $USERHOME/ckan/ecportal/src/ckan directory there is
>>> the datastorer directory
>>>      c) the ecportal.ini does not contain a reference to datastorer or
>>> archiver in the following parameter:
>>>             ckan.plugins = synchronous_search ecportal ecportal_form
>>> ecportal_publisher_form ecportal_controller ecportal_multilingual_dataset
>>> ecportal_homepage multilingual_group multilingual_tag
>>>
>>>
>>>    So
>>>        * is archiver active in release 01.00.00?
>>>        * is datastore active in release 01.00.00?
>>>
>>> kind regards,
>>>
>>> Bert
>>>
>>>
>>>
>>> 2013/9/26 John Glover <john.glover at okfn.org>
>>>
>>>> Hi Bert,
>>>>
>>>> No, purge_revision_history and purge_package_extra_revision (spelled
>>>> with underscores when accessed through the API) are two separate calls. The
>>>> former purges old data from the resource_revision database table, and the
>>>> latter removes old data from package_extra_revision. Both tables store old
>>>> versions of data relating to datasets, and so both should be run regularly
>>>> due to the large number of daily updates.
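
Since the two actions are separate calls, a scheduled job would have to issue both. The following is a rough sketch only: the base URL is taken from the api.purge setting quoted further down in this thread, while the empty JSON body, the helper name and the placeholder API key are assumptions made for illustration. The sketch builds the requests without sending them.

```python
import json
import urllib.request

# Base URL as it appears in the api.purge setting quoted in this thread.
ACTION_BASE = "http://localhost:8008/data/api/action/"

# The two separate purge actions named above.
PURGE_ACTIONS = ("purge_revision_history", "purge_package_extra_revision")


def build_purge_request(action, api_key, base=ACTION_BASE):
    """Build (but do not send) a POST request for one purge action.

    Assumption: the action accepts an empty JSON object as its parameters.
    """
    if action not in PURGE_ACTIONS:
        raise ValueError("unknown purge action: %s" % action)
    return urllib.request.Request(
        base + action,
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# "my-api-key" is a placeholder, not a real credential.
requests_to_send = [build_purge_request(a, "my-api-key") for a in PURGE_ACTIONS]
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Each request could then be passed to urllib.request.urlopen by whatever job runs the regular cleanup.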
>>>>
>>>> Regards,
>>>> John
>>>>
>>>>
>>>> On 26 September 2013 17:06, Bert Van Nuffelen <
>>>> bert.van.nuffelen at tenforce.com> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> thanks for the investigation.
>>>>>
>>>>> Is the api call
>>>>>
>>>>> api.purge=http://localhost:8008/data/api/action/purge_revision_history
>>>>>
>>>>> equal to the purge-package-extra-revision paster command?
>>>>>
>>>>> kind regards,
>>>>>
>>>>> Bert
>>>>>
>>>>>
>>>>> 2013/9/26 John Glover <john.glover at okfn.org>
>>>>>
>>>>>> Hi Bert,
>>>>>>
>>>>>> I have had a look at the PO database dump. The sizes of the largest
>>>>>> tables are as follows:
>>>>>>
>>>>>> public.kombu_message                           | 15 GB
>>>>>> pg_toast.pg_toast_78586                        | 3662 MB
>>>>>> public.task_status                             | 1548 MB
>>>>>> public.package_extra_revision                  | 811 MB
>>>>>> public.task_status_entity_id_task_type_key_key | 670 MB
>>>>>> public.task_status_pkey                        | 458 MB
>>>>>>
>>>>>> Apart from package_extra_revision, the other large tables all refer
>>>>>> to tables used by the old CKAN extensions (such as ckanext-archiver,
>>>>>> ckanext-datastorer and ckanext-qa). These should not be installed any more.
>>>>>>
>>>>>> In the previous release, we supplied a paster command to purge the
>>>>>> task_status and kombu_message tables:
>>>>>>
>>>>>> paster --plugin=ckanext-ecportal ecportal purge-task-data -c <config>
>>>>>>
>>>>>> Running this should get rid of most of this unnecessary data
>>>>>> (although you will probably need to do a VACUUM in postgres afterwards to
>>>>>> reclaim space). This should only have to be run once if the extensions have
>>>>>> been correctly uninstalled. If these tables continue to grow, then at least
>>>>>> one of the extensions is still installed. So, make sure that none of the
>>>>>> following appear in the ckan.plugins config list: archiver, qa, datastorer.
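
The one-off cleanup described above (purge, then VACUUM) could be written down as an ordered list of commands. This is a sketch under assumptions, not a tested procedure: the config path and the database name "ecodp" are placeholders, and the helper names are invented for illustration. Note that a plain VACUUM mainly marks space as reusable inside PostgreSQL, whereas VACUUM FULL returns it to the operating system at the cost of an exclusive lock on the table, so which one to use is a judgment call.

```python
import shlex

# Tables that the purge-task-data command is said to clear.
TASK_TABLES = ("task_status", "kombu_message")


def purge_task_data_command(config_path):
    """The paster command quoted in this thread, with the config path filled in."""
    return ["paster", "--plugin=ckanext-ecportal", "ecportal",
            "purge-task-data", "-c", config_path]


def vacuum_commands(database, tables=TASK_TABLES):
    """One psql invocation per table, so each VACUUM is a standalone statement."""
    return [["psql", "-d", database, "-c", "VACUUM VERBOSE public.%s" % table]
            for table in tables]


def cleanup_plan(config_path, database):
    """Purge first, then vacuum the purged tables."""
    return [purge_task_data_command(config_path)] + vacuum_commands(database)


# Placeholder path and database name; print the commands rather than run them.
for command in cleanup_plan("/path/to/ecportal.ini", "ecodp"):
    print(" ".join(shlex.quote(part) for part in command))
```

Printing the plan first makes it easy to review before handing the same lists to subprocess.run.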
>>>>>>
>>>>>> From looking at the timestamps of the task_status entries, it would
>>>>>> seem at least one of these extensions is still installed, as the last write
>>>>>> was by the datastorer on 4th September.
>>>>>>
>>>>>> The package_extra_revision table is also quite large, but this is
>>>>>> expected due to the daily updating of all Eurostat packages. It should be
>>>>>> regularly cleared with the purge-package-extra-revision paster command, as
>>>>>> in previous releases.
>>>>>>
>>>>>> Regards,
>>>>>> John
>>>>>>
>>>>>> _______________________________________________
>>>>>> Ecodp-dev mailing list
>>>>>> Ecodp-dev at lists.okfn.org
>>>>>> http://lists.okfn.org/mailman/listinfo/ecodp-dev
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Bert Van Nuffelen
>>>>>
>>>>> Semantic Technologies Software Architect at TenForce
>>>>> www.tenforce.be
>>>>>
>>>>> Bert.Van.Nuffelen at tenforce.com
>>>>> Office: +32 (0)16 31 48 60
>>>>> Mobile:+32 479 06 24 26
>>>>> skype: bert.van.nuffelen
>>>>>
>>>>
>>>
>>>
>>
>
>