[ECODP-dev] Production database size

Bert Van Nuffelen bert.van.nuffelen at tenforce.com
Mon Sep 30 09:59:02 UTC 2013


Hi John,

for b)

-bash-4.1$ ls
ckan  ckanext-archiver  ckanext-ecportal  ckanext-ecportal-release-v1.8.1
INSTALLED  vdm
-bash-4.1$ find . -name "datastore"
./ckan/ckanext/datastore
-bash-4.1$ pwd
/applications/ecodp/users/ecodp/ckan/ecportal/src


Bert


2013/9/30 John Glover <john.glover at okfn.org>

> Hi Bert,
>
> a) No, ckanext-archiver should not be anywhere on the system; it is not
> needed. Just having it installed shouldn't cause any problems, but if it is
> in the ckan plugins list then it will continue to generate data.
>
> b) In that exact folder you have a datastorer folder? That sounds like
> there may be a problem with the build process; that directory should look
> like this: https://github.com/okfn/ckan/tree/release-v1.8.1-ecportal
>
> c) This list seems correct for release 01.00.00 (but ecportal_homepage was
> not available in release 09).
>
> No, archiver and datastore are not active in release 01.00.00.
>
> Regards,
> John
>
>
> On 30 September 2013 11:21, Bert Van Nuffelen <
> bert.van.nuffelen at tenforce.com> wrote:
>
>> Hi John,
>>
>> some further clarification is required I believe.
>>
>> *) the dump could be one of release 00.08.0x in which these plugins are
>> active.
>> *) If it is a release 00.09.0x version, then we need some CKAN
>> deployment/management clarification.
>>
>>    Let's look at our release 01.00.00 (whose setup is close to 00.09.00):
>>      a) it contains the $USERHOME/ckan/ecportal/src/ckanext-archiver
>> directory
>>      b) inside the $USERHOME/ckan/ecportal/src/ckan directory there is
>> the datastorer directory
>>      c) the ecportal.ini does not contain a reference to datastorer or
>> archiver in the following parameter:
>>             ckan.plugins = synchronous_search ecportal ecportal_form
>> ecportal_publisher_form ecportal_controller ecportal_multilingual_dataset
>> ecportal_homepage multilingual_group multilingual_tag
>>
>>
>>    So
>>        * is archiver active in release 01.00.00?
>>        * is datastore active in release 01.00.00?
>>
>> kind regards,
>>
>> Bert
>>
>>
>>
>> 2013/9/26 John Glover <john.glover at okfn.org>
>>
>>> Hi Bert,
>>>
>>> No, purge_revision_history and purge_package_extra_revision (spelled
>>> with underscores when accessed through the API) are two separate calls. The
>>> former purges old data from the resource_revision database table, and the
>>> latter removes old data from package_extra_revision. Both tables store old
>>> versions of data relating to datasets, and so both should be run regularly
>>> due to the large number of daily updates.
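To illustrate the distinction between the two calls, here is a minimal Python sketch that derives both action URLs from the api.purge setting quoted further down in this thread. The helper name, the empty JSON body, and the API key passed in the Authorization header are assumptions for illustration; the requests are only constructed, not sent.

```python
import json
from urllib.request import Request

# Base URL taken from the api.purge setting quoted in this thread.
ACTION_BASE = "http://localhost:8008/data/api/action/"

# The two separate purge actions and the table each one cleans up.
PURGE_ACTIONS = {
    "purge_revision_history": "resource_revision",
    "purge_package_extra_revision": "package_extra_revision",
}


def build_purge_request(action, api_key):
    """Build (but do not send) a POST request for one purge action.

    Assumption: the action accepts an empty JSON body and an API key in
    the Authorization header; check this against the deployed release.
    """
    if action not in PURGE_ACTIONS:
        raise ValueError("unknown purge action: %s" % action)
    return Request(
        ACTION_BASE + action,
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": api_key},
        method="POST",
    )


# "<api-key>" is a placeholder, not a real credential.
requests_to_run = [build_purge_request(name, "<api-key>") for name in PURGE_ACTIONS]
for req in requests_to_run:
    print(req.get_method(), req.full_url)
```

Both requests would need to be scheduled, since each one cleans a different table.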
>>>
>>> Regards,
>>> John
>>>
>>>
>>> On 26 September 2013 17:06, Bert Van Nuffelen <
>>> bert.van.nuffelen at tenforce.com> wrote:
>>>
>>>> Hi John,
>>>>
>>>> thanks for the investigation.
>>>>
>>>> Is the api call
>>>>
>>>> api.purge=http://localhost:8008/data/api/action/purge_revision_history
>>>>
>>>> equal to the purge-package-extra-revision paster command?
>>>>
>>>> kind regards,
>>>>
>>>> Bert
>>>>
>>>>
>>>> 2013/9/26 John Glover <john.glover at okfn.org>
>>>>
>>>>> Hi Bert,
>>>>>
>>>>> I have had a look at the PO database dump. The sizes of the largest
>>>>> tables are as follows:
>>>>>
>>>>> public.kombu_message                           | 15 GB
>>>>> pg_toast.pg_toast_78586                        | 3662 MB
>>>>> public.task_status                             | 1548 MB
>>>>> public.package_extra_revision                  | 811 MB
>>>>> public.task_status_entity_id_task_type_key_key | 670 MB
>>>>> public.task_status_pkey                        | 458 MB
>>>>>
>>>>> Apart from package_extra_revision, the other large entries all relate to
>>>>> tables used by the old CKAN extensions (such as ckanext-archiver,
>>>>> ckanext-datastorer and ckanext-qa). These should no longer be installed.
>>>>>
>>>>> In the previous release, we supplied a paster command to purge the
>>>>> task_status and kombu_message tables:
>>>>>
>>>>> paster --plugin=ckanext-ecportal ecportal purge-task-data -c <config>
>>>>>
>>>>> Running this should get rid of most of this unnecessary data (although
>>>>> you will probably need to run a VACUUM in postgres afterwards to reclaim space).
>>>>> This should only have to be run once if the extensions have been correctly
>>>>> uninstalled. If these tables continue to grow, then at least one of the
>>>>> extensions is still installed. So, make sure that none of the following
>>>>> appear in the ckan.plugins config list: archiver, qa, datastorer.
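The check described above can be sketched in Python as follows. This is only an illustration: the function name and the sample text are hypothetical, and it assumes the ckan.plugins value may wrap onto indented continuation lines, as is usual in .ini files.

```python
# Plugin names that should not appear in ckan.plugins, per the message above.
UNWANTED = {"archiver", "qa", "datastorer"}


def unwanted_plugins(ini_text):
    """Return the unwanted plugin names found in the ckan.plugins setting."""
    names = []
    collecting = False
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ckan.plugins":
            # Start of the setting: take everything after the "=".
            collecting = True
            names.extend(value.split())
        elif collecting and line[:1].isspace() and line.strip():
            # Indented continuation line of the same setting.
            names.extend(line.split())
        else:
            collecting = False
    return sorted(UNWANTED.intersection(names))


# Hypothetical sample in the style of an ecportal.ini fragment.
sample = (
    "ckan.plugins = synchronous_search ecportal ecportal_form\n"
    "    archiver multilingual_group multilingual_tag\n"
    "ckan.site_title = example\n"
)
print(unwanted_plugins(sample))
```

An empty result means none of the three extensions is enabled in that configuration; it says nothing about whether their code is still present on disk.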
>>>>>
>>>>> From looking at the timestamps of the task_status entries, it would
>>>>> seem at least 1 of these extensions is still installed, as the last write
>>>>> was by the datastorer on the 4th September.
>>>>>
>>>>> The package_extra_revision table is also quite large, but this is expected
>>>>> due to the daily updating of all Eurostat packages. This should be
>>>>> regularly cleared with the purge-package-extra-revision paster command as
>>>>> in previous releases.
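The maintenance steps in this message can be summarised in a short Python sketch that only assembles the commands and prints them, without running anything. The config path is a placeholder, and invoking purge-package-extra-revision with the same "--plugin=ckanext-ecportal ecportal" prefix as purge-task-data is an assumption, since the message gives the full invocation only for the latter.

```python
import shlex


def paster_argv(subcommand, config_path):
    """Argument list for an ecportal paster command (not executed here)."""
    return ["paster", "--plugin=ckanext-ecportal", "ecportal",
            subcommand, "-c", config_path]


CONFIG = "/path/to/ecportal.ini"  # hypothetical path, replace with the real config

# One-off cleanup of the task_status and kombu_message tables.
one_off = paster_argv("purge-task-data", CONFIG)

# Regular cleanup; assumed to use the same invocation form as above.
regular = paster_argv("purge-package-extra-revision", CONFIG)

# Statements to run in postgres after the one-off purge.
vacuum_sql = ["VACUUM public.%s;" % table
              for table in ("task_status", "kombu_message")]

for argv in (one_off, regular):
    print(" ".join(shlex.quote(arg) for arg in argv))
for statement in vacuum_sql:
    print(statement)
```

Note that a plain VACUUM makes the freed space reusable by postgres; returning it to the operating system generally requires VACUUM FULL, which takes an exclusive lock on the table.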
>>>>>
>>>>> Regards,
>>>>> John
>>>>>
>>>>> _______________________________________________
>>>>> Ecodp-dev mailing list
>>>>> Ecodp-dev at lists.okfn.org
>>>>> http://lists.okfn.org/mailman/listinfo/ecodp-dev
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Bert Van Nuffelen
>>>>
>>>> Semantic Technologies Software Architect at TenForce
>>>> www.tenforce.be
>>>>
>>>> Bert.Van.Nuffelen at tenforce.com
>>>> Office: +32 (0)16 31 48 60
>>>> Mobile:+32 479 06 24 26
>>>> skype: bert.van.nuffelen
>>>>
>>>
>>
>>
>


-- 
Bert Van Nuffelen

Semantic Technologies Software Architect at TenForce
www.tenforce.be

Bert.Van.Nuffelen at tenforce.com
Office: +32 (0)16 31 48 60
Mobile:+32 479 06 24 26
skype: bert.van.nuffelen