[ckan-dev] ANN: ckanapi-3.0

Ian Ward ian at excess.org
Mon Feb 10 14:31:01 UTC 2014


ckanapi has grown a command line interface!

  https://github.com/ckan/ckanapi

The ckanapi command line interface (CLI) lets you access local and
remote CKAN instances for bulk operations and simple API actions.


Simple actions with string parameters may be called directly. The
response is pretty-printed to STDOUT. e.g.:

  $ ckanapi action group_list -r http://demo.ckan.org
  [
    "data-explorer",
    "example-group",
    "geo-examples",
    ...
  ]

Local CKAN actions may be run by specifying the config file with -c.
If no remote server or config file is specified, the CLI will look for
a development.ini file in the current directory, much like paster
commands. When connecting to a local CKAN instance the site user
(sysadmin) is used by default.


Bulk operations

Datasets, groups and organizations may be dumped to JSON lines text
files and created or updated from JSON lines text files. These bulk
dumping and loading jobs can be run in parallel with multiple worker
processes. The jobs in progress, the rate of job completion and any
individual errors are shown on STDERR while the jobs run.

E.g. load datasets from a dataset dump file with 3 processes in parallel:

  $ ckanapi load datasets -I datasets.jsonl.gz -z -p 3 \
     -c /etc/ckan/production.ini
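
For readers who have not met the JSON lines format: each line of the
file is one complete JSON object. The Python sketch below writes and
reads a gzip-compressed JSON lines file. It is only an illustration of
the file format, not ckanapi's own code; the helper names and the sample
records are made up, and real dataset dumps carry many more fields.

```python
import gzip
import json
import os
import tempfile


def write_jsonl_gz(path, records):
    """Write one JSON object per line to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, sort_keys=True) + "\n")


def read_jsonl_gz(path):
    """Yield one decoded record per non-empty line."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


# Hypothetical records, for illustration only.
SAMPLE = [
    {"name": "my-dataset", "title": "My dataset"},
    {"name": "other-dataset", "title": "Other dataset"},
]

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "datasets.jsonl.gz")
    write_jsonl_gz(path, SAMPLE)
    for record in read_jsonl_gz(path):
        print(record["name"])
```

Because every record sits on its own line, a file like this can be
counted, split or resumed by line number without parsing the whole file.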

Bulk loading jobs may be resumed from the last completed record or
split across multiple servers by specifying record start and max
values.
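
As a rough sketch of the arithmetic involved, the Python below works out
start and max values for splitting a file across several servers, and
the start value for resuming after an interruption. The function names
are hypothetical, and it assumes records are numbered from 1; check the
CLI help for the actual options and numbering before relying on it.

```python
def split_records(total, parts, first=1):
    """Return (start, max) pairs that together cover `total` records.

    Assumes records are numbered from `first` (1 by default).
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base, extra = divmod(total, parts)
    ranges = []
    start = first
    for i in range(parts):
        # The first `extra` parts take one additional record each.
        count = base + (1 if i < extra else 0)
        if count:
            ranges.append((start, count))
        start += count
    return ranges


def resume_start(last_completed, first=1):
    """Return the record number to start from after an interrupted job."""
    return max(first, last_completed + 1)


if __name__ == "__main__":
    print(split_records(10, 3))  # [(1, 4), (5, 3), (8, 3)]
    print(resume_start(1250))    # 1251
```

Each (start, max) pair would then be handed to one server, so that no
record is loaded twice and none is skipped.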


Shell pipelines

Simple shell pipelines are possible with the CLI. E.g. update the
title of a dataset with the help of the 'jq' command-line JSON tool
(once we add a package_update_partial action this example won't
require a pipeline):

  $ ckanapi action package_show id=my-dataset \
     | jq '.+{"title":"New title"}' \
     | ckanapi action package_update -i
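
The jq filter in the middle of that pipeline adds two objects: keys from
the right-hand object replace keys of the same name on the left, and
everything else is kept. A minimal Python equivalent of that one step
might look like the following; the function names and the sample dataset
are hypothetical, and the calls to CKAN itself are left out.

```python
import json


def merge_fields(dataset, changes):
    """Shallow merge: values in `changes` replace same-named keys."""
    merged = dict(dataset)  # copy so the input dict is left untouched
    merged.update(changes)
    return merged


def update_title(dataset_json, new_title):
    """Take package_show-style JSON text and return it with a new title."""
    dataset = json.loads(dataset_json)
    return json.dumps(merge_fields(dataset, {"title": new_title}),
                      sort_keys=True)


if __name__ == "__main__":
    # Hypothetical, heavily trimmed dataset record.
    shown = '{"name": "my-dataset", "title": "Old title"}'
    print(update_title(shown, "New title"))
```

The merge is shallow, so nested values such as a resource list are
replaced whole rather than merged element by element.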

E.g. copy all datasets from one CKAN instance to another:

  $ ckanapi dump datasets --all -q -r http://sourceckan.example.com \
     | ckanapi load datasets


Documentation

ckanapi documentation is a little thin at the moment: just a README
and docstrings. You may need to refer to the source if you're looking
for more information. For this release I've concentrated on building
lots of unit tests to make sure everything runs properly with all
supported Python versions. Contributions are encouraged!


This CLI work is based on commands that were built for the Government
of Canada's open data portal at http://data.gc.ca/

Ian