[ckan-dev] Thoughts on CKAN development

Haq, Salman Salman.Haq at neustar.biz
Wed Sep 26 15:33:45 UTC 2012


Hi all,

After developing a non-trivial extension for CKAN, I want to share some thoughts with the development team. Some of these may not be news to the dev team, but I want to share them anyway. Other issues may already have been fixed in later versions. All of my comments relate to the state of the v1.7.1 core code base. Lastly, I think CKAN is a good project and I want to make it better, so please don’t take this as criticism.


 *   Installation and Setup

Installing CKAN on the supported version of Ubuntu is pretty easy (kudos). Installing it on another distro, however, can be quite cumbersome. There are contributed instructions for installing on CentOS. These would benefit from being better organized, and from being included and maintained in the official docs. The ckanext-datastore plugin seems to be quite popular but is also a cause of many troubleshooting inquiries on the mailing list (although not as many of late). It could use better documentation. Perhaps a Fabric or Salt script could be made available for source installations?

 *   Interface classes lack documentation (other than function doc strings).

The IDatasetForm and IPackageController interfaces in particular are important examples. Almost every non-trivial extension will need to implement these interfaces but there is insufficient documentation about them. Further, these interfaces have a large number of methods and they seem very complex.

 *   Request data flow is not clear from documentation.

Often, I found myself reading the core CKAN code to figure out when a certain interface method gets called (e.g. IDatasetForm.setup_template_variables). Manually tracing the call graph quickly becomes aggravating.
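
As a rough illustration of how the tracing could be made less manual, here is a minimal sketch of a decorator that records the call chain whenever a wrapped method runs. The names log_caller, CALL_CHAINS and ExampleDatasetForm are made up for this example and are not part of CKAN; a real plugin would apply the decorator to its own implementation of the interface method.

```python
import functools
import traceback

# Hypothetical store: one list of "file:line in function" strings per call.
CALL_CHAINS = []


def log_caller(func):
    """Record and print the last few callers each time func is invoked."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # extract_stack() ends with this wrapper's own frame, so drop it.
        frames = traceback.extract_stack()[:-1]
        chain = ['%s:%s in %s' % (f[0], f[1], f[2]) for f in frames[-5:]]
        CALL_CHAINS.append(chain)
        print('%s called via:' % func.__name__)
        for entry in chain:
            print('  ' + entry)
        return func(*args, **kwargs)
    return wrapper


class ExampleDatasetForm(object):
    # Stand-in for a plugin class; not a real CKAN plugin.
    @log_caller
    def setup_template_variables(self, context, data_dict):
        return None
```

Calling the decorated method from any function then lists the enclosing frames, innermost last, which shows where in the request the method was reached.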

 *   Interaction between forms and models is complex.

There are several dictize and dict save methods in the core that could use better documentation. The VDM stuff is quite cool and useful but it feels like magic when inspecting a model file and gives me shivers. Modifying the package form even slightly requires almost a full re-implementation of several package templates - case in point, the Organization plugin. A combination of better documentation and simpler logic would help.

 *   'context' dictionary is used in several places but it is not documented.

Several controller methods construct an arbitrary 'context' dictionary and pass it around. There is no documentation about which attributes are expected or optional. It also smells of code repetition because almost every controller action method creates one of these.
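
To make the point concrete, here is a minimal sketch of what a single, documented construction point could look like. The helper make_context is hypothetical and does not exist in CKAN; the three keys shown are the ones I recall seeing most often in controller actions, so treat them as an assumption rather than a definitive list.

```python
def make_context(model, user, **extra):
    """Build the 'context' dict in one place and document its keys.

    Assumed keys (not an authoritative list):
      model   -- the model module used for queries
      session -- the database session, taken from the model
      user    -- name of the acting user
    Any further keyword arguments are optional and passed through unchanged.
    """
    context = {'model': model, 'session': model.Session, 'user': user}
    context.update(extra)
    return context
```

A controller action would then call make_context(model, user) instead of spelling out the dictionary each time, and the docstring becomes the one place where expected and optional keys are written down.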

 *   Too many levels of indirection.

This impeded me the most when I tried to figure out which template was being used to render a specific page. I would look up the appropriate action method of the appropriate controller, only to find that the final 'return' statement invoked another function that performed a lookup to determine the template path. I understand that one of the goals of CKAN is to be a modular, extensible framework and that some level of indirection is unavoidable, but sometimes it feels like there is too much of it going on.
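
The shape of the problem, and one possible aid, can be sketched as follows. Everything here is a simplified stand-in: the classes, the FORMS registry and the explain_template helper are hypothetical and only mimic the kind of lookup described above, not the actual core code.

```python
class DefaultForm(object):
    """Stand-in for the fallback form plugin (hypothetical)."""
    def read_template(self):
        return 'package/read.html'


class OrganizationForm(DefaultForm):
    """Stand-in for an extension that overrides the template (hypothetical)."""
    def read_template(self):
        return 'organization/read.html'


# Hypothetical registry mapping a package type to the plugin that handles it.
FORMS = {'organization': OrganizationForm()}
FALLBACK = DefaultForm()


def lookup_form(package_type):
    """First hop: package type -> plugin object."""
    return FORMS.get(package_type, FALLBACK)


def explain_template(package_type):
    """Return each hop between a package type and its template path."""
    form = lookup_form(package_type)
    return [package_type, type(form).__name__, form.read_template()]
```

A small helper along these lines, exposed in a debug mode, would let a developer ask which plugin and template serve a given page without reading through the controller.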

 *   Cruft.

The package_formalchemy and auth vs new_authz modules can be really confusing. I'm sure there are other examples of such things that could be removed from the core once and for all.

 *   Relational Modeling

CKAN makes dataset information available as rdf+xml in addition to HTML. Quite often, people on the mailing list ask about the SPARQL interface. However, the actual data is stored in a relational PostgreSQL instance and we have to jump through some hoops to provide the semantic interfaces. This makes me wonder: why not just use an RDF database like Sesame for storage and do away with the relational model altogether?

These are just my experiences as a first-time CKAN developer. With the exception of the point about relational modeling, I would summarize them as the numerous factors that make CKAN's learning curve steeper than necessary and slow developers down.

Thanks,
Salman

