[ckan-dev] How to reduce the size of the database "activity" table?

Adrià Mercader adria.mercader at okfn.org
Wed Nov 14 09:44:34 UTC 2018


Hi all,

@Brian
> Is there a way to safely clear the "activity" table?

As Ian says, it should be safe to clear rows from "activity", though you'll
have to delete the related rows from "activity_detail" first. Do a backup
first!
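
To make the ordering concrete, here is a minimal sketch of that cleanup. It
is not a tested CKAN tool: it runs against an in-memory SQLite database with
cut-down stand-ins for the two tables (only the columns needed here), and the
function name purge_old_activities and the cutoff date are made up for
illustration. A real CKAN database is PostgreSQL, so you would connect with
a PostgreSQL driver and use its placeholder style instead of "?".

```python
import sqlite3


def purge_old_activities(conn, cutoff):
    """Delete activities older than `cutoff` (ISO 8601 string), children first."""
    cur = conn.cursor()
    # activity_detail rows point at activity.id, so they have to go first.
    cur.execute(
        "DELETE FROM activity_detail WHERE activity_id IN "
        "(SELECT id FROM activity WHERE timestamp < ?)",
        (cutoff,),
    )
    details_removed = cur.rowcount
    # Now the parent rows can be removed without breaking the foreign key.
    cur.execute("DELETE FROM activity WHERE timestamp < ?", (cutoff,))
    activities_removed = cur.rowcount
    conn.commit()
    return activities_removed, details_removed


def demo():
    # Simplified stand-in schema (assumption): the real tables have more columns.
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE activity (id TEXT PRIMARY KEY, timestamp TEXT);
        CREATE TABLE activity_detail (
            id TEXT PRIMARY KEY,
            activity_id TEXT REFERENCES activity(id)
        );
    """)
    conn.executemany(
        "INSERT INTO activity VALUES (?, ?)",
        [("a1", "2017-06-01T00:00:00"), ("a2", "2018-11-01T00:00:00")],
    )
    conn.executemany(
        "INSERT INTO activity_detail VALUES (?, ?)",
        [("d1", "a1"), ("d2", "a1"), ("d3", "a2")],
    )
    conn.commit()
    removed = purge_old_activities(conn, "2018-01-01T00:00:00")
    remaining = [row[0] for row in conn.execute("SELECT id FROM activity")]
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

Deleting by a cutoff date rather than emptying the table keeps recent
activity visible in the UI; drop the WHERE clauses if you really want to
clear everything.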

> Or maybe keep it from growing in the first place, since the changes being
> recorded are not that interesting in the first place?

There is an option to hide the activities of certain users in the UI (
https://docs.ckan.org/en/latest/maintaining/configuration.html#ckan-hide-activity-from-users).
By default it hides those of the site user, which is the one the harvester
uses. A second option that avoids creating activity records at all for
certain users would make a lot of sense, and it should be relatively easy
to implement.

@Ian
> An option to disable activities entirely would be great for some sites
> too.

That exists:
https://docs.ckan.org/en/latest/maintaining/configuration.html#ckan-activity-streams-enabled
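
For reference, this is roughly how the two settings linked above might look
in the CKAN configuration file. The values are only illustrative (I am
assuming the usual [app:main] section, and the second line just spells out
what I understand to be the default), so check the linked docs for your
CKAN version.

```ini
[app:main]
# Turn activity streams off entirely
ckan.activity_streams_enabled = false

# Hide activities from these users in the UI (the site user here)
ckan.hide_activity_from_users = %(ckan.site_id)s
```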

@Dan
> Also I think revisions slow down and could have the same size of rows.
> Any recommendations?

In principle revisions are also safe to delete, as long as you keep the
latest one (marked with current=true). They are a bit trickier because they
have relationships with other tables, but if you start from the bottom (the
tables that reference revisions) and only then delete the records in the
main "revision" table, it should be fine.
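
A rough sketch of that bottom-up order, again on an in-memory SQLite
database. Everything here is simplified and partly assumed: I only model one
dependent table ("package_revision") with a "current" flag, while a real
CKAN database has several such tables that would all need the same
treatment, and in PostgreSQL the flag is a boolean rather than 0/1. The
function name purge_old_revisions is hypothetical.

```python
import sqlite3


def purge_old_revisions(conn):
    """Drop non-current dependent rows, then revisions nothing references."""
    cur = conn.cursor()
    # Bottom first: remove every dependent row that is not the current one.
    cur.execute('DELETE FROM package_revision WHERE "current" = 0')
    # Then remove rows of the main table that are no longer referenced.
    cur.execute(
        "DELETE FROM revision WHERE NOT EXISTS "
        "(SELECT 1 FROM package_revision pr WHERE pr.revision_id = revision.id)"
    )
    revisions_removed = cur.rowcount
    conn.commit()
    return revisions_removed


def demo():
    # Simplified stand-in schema (assumption), one package with three revisions.
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE revision (id TEXT PRIMARY KEY);
        CREATE TABLE package_revision (
            id TEXT,
            revision_id TEXT REFERENCES revision(id),
            "current" INTEGER,
            PRIMARY KEY (id, revision_id)
        );
    """)
    conn.executemany("INSERT INTO revision VALUES (?)", [("r1",), ("r2",), ("r3",)])
    conn.executemany(
        "INSERT INTO package_revision VALUES (?, ?, ?)",
        [("p1", "r1", 0), ("p1", "r2", 0), ("p1", "r3", 1)],
    )
    conn.commit()
    removed = purge_old_revisions(conn)
    remaining = [row[0] for row in conn.execute("SELECT id FROM revision")]
    return removed, remaining


if __name__ == "__main__":
    print(demo())
```

On a real database, a revision must not be deleted while any other table
still references it, so the second statement would need one NOT EXISTS
check per referencing table.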


Hope this helps,

Adrià

> Regards,
>
> *Dan Mihaila*, IT Consultant
> (M) +40 722 502 304 • (GTalk) dan.mihaila at gmail.com • (Skype) carcotelul
> • (Twitter) dan_mihaila
>
>
> On Tue, Nov 13, 2018 at 10:50 PM Ian Ward <ian at excess.org> wrote:
>
>> We should have a CLI command and background job for removing old activity
>> entries, or collapsing them to 1/day/week/month per dataset.
>>
>> An option to disable activities entirely would be great for some sites
>> too.
>>
>> Until we have features like that it should be safe to remove old activity
>> records directly from the database. I recommend backing up your DB before
>> any direct modifications, of course.
>>
>> On Tue, Nov 13, 2018 at 2:39 PM Brian Bonnlander <bonnland at ucar.edu>
>> wrote:
>>
>>>
>>> Hi CKAN developers,
>>>
>>> We have a CKAN instance with datasets whose metadata are continually
>>> being updated using the CKAN harvester.
>>>
>>> Only one field is typically changed:  the time range for the data (more
>>> data gets added every few days).
>>>
>>>
>>> After running for more than a year, we now have an "activity" table with
>>> over 200,000 rows, and the database has grown to around 10 GB in size.
>>>
>>> Is there a way to safely clear the "activity" table?   Or maybe keep it
>>> from growing in the first place, since the changes being recorded are
>>> not that interesting in the first place?
>>>
>>>
>>> Thank you for your time,
>>>
>>> --Brian
>>>
>>>
>>> -------
>>>
>>> Brian Bonnlander
>>>
>>> National Center for Atmospheric Research
>>>
>>> Boulder, Colorado  USA
>>>
>>> _______________________________________________
>>> ckan-dev mailing list
>>> ckan-dev at lists.okfn.org
>>> https://lists.okfn.org/mailman/listinfo/ckan-dev
>>> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>>>