[ckan-dev] ckanext-inventory

Alex Palcuie alex.palcuie at gmail.com
Mon Feb 22 20:38:13 UTC 2016


Hello,

My name is Alex Palcuie and I have been maintaining the open data
portal[0][1] of Romania since April last year. I would like to share
with you an extension that I have been building over the last few
months and how developing on CKAN felt for me.

One of the use cases that appeared after a law was adopted in my
country was that every governmental institution has to publish its
inventory of datasets. This means every organization (and we have over
8k) has to identify a list of its possible datasets and publish it on
data.gov.ro.

My background experience with Python has been doing backend work for a
startup with Django for about 6 months, before moving to do frontend.
I've also done some Rails work.

One piece of advice would be to make the default docs[2] page for CKAN
point to the latest production version (as it does on Django[3]) and
not the latest branch. I remember this being a little confusing at
first.

The first task I wanted to do was to add a unique ID to each
organization, require new users registering in CKAN to enter this
unique ID, and have them confirmed in a panel by an admin. After they
are confirmed, a password reset email has to be sent to the user and
they are automatically added to the organization[8].

After I made a list of the best CKAN extensions[4], I found out that a
common pattern is to copy and paste the original code and modify it
slightly[5]. It didn't feel good to me, and I realized afterward that
hooks (or signals in Django's terms) for the user controller were
missing. When I start refactoring that part, I'll commit the needed
modifications upstream. The JavaScript part of making the API request
was well documented and it worked seamlessly.

The next flow I implemented was that before users can add a new
dataset, they must add an inventory entry for it and a recurring
interval[9][10]. For example, an organization could have the inventory
entry Budget with a recurring interval of 365 days. Now, their
datasets will be Budget 2015, Budget 2016 and so on, and specifying
that recurring interval means that we can monitor them to see if they
have updated the dataset within the interval they committed to, and
ping them with automatic emails if they haven't.
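
To make the monitoring idea concrete, here is a minimal sketch of the
overdue check in plain Python. It is not the code from the extension:
the entry fields (`name`, `last_updated`, `interval_days`) and the
function names are assumptions made up for this illustration.

```python
from datetime import date, timedelta


def is_overdue(last_updated, interval_days, today):
    """True if more than interval_days have passed since last_updated."""
    return today > last_updated + timedelta(days=interval_days)


def entries_to_ping(entries, today):
    """Return the names of inventory entries whose update is overdue.

    Each entry is a dict with hypothetical keys: name, last_updated
    (a date) and interval_days (an int).
    """
    return [
        entry["name"]
        for entry in entries
        if is_overdue(entry["last_updated"], entry["interval_days"], today)
    ]


# Hypothetical sample data, only to show the call.
sample_entries = [
    {"name": "Budget", "last_updated": date(2015, 1, 15), "interval_days": 365},
    {"name": "Staff list", "last_updated": date(2016, 1, 10), "interval_days": 90},
]
overdue = entries_to_ping(sample_entries, today=date(2016, 2, 22))
```

A periodic job could run a check like this once a day and send the
reminder emails for whatever names it returns.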

Fiddling with making a migration was a real pain, but the new
IMigration[6] interface looks promising.

One thing that I did not manage was to use the IGroupForm and
IDatasetForm interfaces in one single extension. I had to make a
hack[7] with 2 classes, because some methods, like `is_fallback`,
overlapped and the plugin setup would not know which method to use.
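
The clash can be illustrated without CKAN at all. The sketch below
uses two stand-in base classes, not the real IDatasetForm and
IGroupForm interfaces, and the plugin class names are hypothetical. It
only shows why a single class cannot give different `is_fallback`
answers for datasets and for groups, and how two classes avoid that.

```python
class DatasetFormLike:
    """Stand-in for a dataset form interface (not the CKAN class)."""

    def is_fallback(self):
        raise NotImplementedError


class GroupFormLike:
    """Stand-in for a group form interface (not the CKAN class)."""

    def is_fallback(self):
        raise NotImplementedError


class MonolithPlugin(DatasetFormLike, GroupFormLike):
    # A class can define is_fallback only once, so datasets and groups
    # necessarily get the same answer here.
    def is_fallback(self):
        return True


class InventoryDatasetPlugin(DatasetFormLike):
    # Hypothetical choice: act as the fallback for datasets.
    def is_fallback(self):
        return True


class InventoryGroupPlugin(GroupFormLike):
    # Hypothetical choice: do not act as the fallback for groups.
    def is_fallback(self):
        return False
```

With the split, each class answers `is_fallback` for exactly one kind
of object, at the cost of registering two plugins instead of one.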

However, I think it's mostly my fault because I've developed the
extension as a big monolith, without much planning. After I release
this into production this month, my plan is to split the features I
can into smaller extensions, write tests for them and release them on
PyPI, with a follow-up email on this list.

All in all, I had lots of fun developing this extension that I think
will become useful for us. I want to thank you all for making CKAN a
nice platform to work on and making a great community.

Alex

[0]: http://data.gov.ro
[1]: https://github.com/govro
[2]: http://docs.ckan.org
[3]: https://docs.djangoproject.com/en/1.9/
[4]: https://github.com/govro/ckanext-inventory/blob/master/docs/best_extensions.md
[5]: https://github.com/govro/ckanext-inventory/blob/master/ckanext/inventory/controllers/user.py
[6]: https://github.com/ckan/ckan/search?q=table&type=Issues&utf8=✓
[7]: https://github.com/govro/ckanext-inventory/blob/master/ckanext/inventory/plugin.py
[8]: https://i.imgur.com/j2GKMbq.png
[9]: https://i.imgur.com/sSce4Nm.png
[10]: https://i.imgur.com/34NAd9B.png


