[ckan-dev] Revisioning problems

David Read david.read at hackneyworkshop.com
Mon Mar 10 15:15:25 UTC 2014


Hi all,

We're seeing some of our revision tables getting messed up and wanted
to know if anyone else is seeing this?

It seems that edits to a package occasionally don't complete
properly. The current flag doesn't move to the new resource_revision,
or the new resource_revision doesn't have an expired_timestamp of
9999-12-31 like it should. The upshot is that the resource becomes
'invisible' (not returned by package_show) and on the next write it
goes to state deleted. I've included an example below.

It started occurring after we upgraded from CKAN 2.0 to 2.2, but we
have plenty of our own extensions that could be involved too -
ckanext-archiver and ckanext-qa both trigger when a dataset is
written, and write to the resource itself too. It happens both for
writes via the form and when we run background tasks at the weekend to
archive every file, using 3 processes in parallel. About 2% of writes
are problematic, and although we have a tool to fix them each time, we
are still looking for the root cause.

So do say if you've seen this sort of behaviour as well. It seems like
a race condition, perhaps because the revision writing turns out to be
non-thread-safe, or maybe we've caused problems with our custom
session extension.

David

An example of the resource_revision having gone wrong:

select revision_timestamp, expired_timestamp, current from
resource_revision where id='b2972b35-b6ae-4096-b8cc-40dab3927a71'
order by revision_timestamp;
    revision_timestamp     |     expired_timestamp      | current
---------------------------+----------------------------+---------
2013-04-13 01:47:30.18897  | 2013-06-18 19:01:45.910899 | f
2013-06-18 19:01:45.910899 | 2014-01-18 08:55:41.443349 | t
2014-01-18 08:55:41.443349 | 2014-01-18 08:55:41.566383 | f
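Note that no row above has an expired_timestamp of 9999-12-31, and the
current flag sits on the middle revision rather than the newest one. A
rough sketch of how affected resources might be detected follows. It is
not the repair tool mentioned earlier and not part of CKAN. It assumes
only the four columns used in the query above, uses an in-memory SQLite
table as a stand-in for the real PostgreSQL one, and the names
find_broken_resources, demo and 'healthy-resource' are made up for
illustration. It flags any resource id that does not have exactly one
current row, or does not have exactly one open-ended row.

```python
import sqlite3

# Stand-in for the real table: only the columns shown in the query above.
# "current" is quoted because it is a keyword in some SQL dialects.
SCHEMA = """
CREATE TABLE resource_revision (
    id TEXT,
    revision_timestamp TEXT,
    expired_timestamp TEXT,
    "current" INTEGER
)
"""

# Assumption: a healthy resource has exactly one row flagged current and
# exactly one row whose expired_timestamp is the 9999-12-31 sentinel.
FIND_BROKEN = """
SELECT id
FROM resource_revision
GROUP BY id
HAVING SUM(CASE WHEN "current" THEN 1 ELSE 0 END) <> 1
    OR SUM(CASE WHEN expired_timestamp >= '9999-12-31' THEN 1 ELSE 0 END) <> 1
ORDER BY id
"""


def find_broken_resources(conn):
    """Return the ids whose revision rows violate the assumption above."""
    return [row[0] for row in conn.execute(FIND_BROKEN)]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    broken_id = "b2972b35-b6ae-4096-b8cc-40dab3927a71"
    rows = [
        # Hypothetical healthy resource: newest row is current and open-ended.
        ("healthy-resource", "2013-04-13 01:47:30", "2013-06-18 19:01:45", 0),
        ("healthy-resource", "2013-06-18 19:01:45", "9999-12-31 00:00:00", 1),
        # The three rows from the example in this email.
        (broken_id, "2013-04-13 01:47:30.18897", "2013-06-18 19:01:45.910899", 0),
        (broken_id, "2013-06-18 19:01:45.910899", "2014-01-18 08:55:41.443349", 1),
        (broken_id, "2014-01-18 08:55:41.443349", "2014-01-18 08:55:41.566383", 0),
    ]
    conn.executemany("INSERT INTO resource_revision VALUES (?, ?, ?, ?)", rows)
    return find_broken_resources(conn)


if __name__ == "__main__":
    print(demo())
```

The same SELECT should run largely unchanged against PostgreSQL, but
that has not been checked here, and whether the two conditions cover
every way the revisions can go wrong is also an assumption.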


