[ckan-dev] ckanext-spatial and pycsw synchronization workflow

Tom Kralidis tomkralidis at gmail.com
Fri Nov 22 05:38:17 UTC 2013


On Wed, Nov 20, 2013 at 7:32 PM, Ryan Clark <ryan.clark at azgs.az.gov> wrote:
> Tom -
>
> My approach does require a sync step, where a CKAN object is read from the database, passed through the Jinja template and the result is fed into a pycsw table. That pycsw table is in the same CKAN database, fwiw, but the table that pycsw reads from is not core-CKAN.
>

Ah, ok.
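
The sync step described above might look roughly like the sketch
below. It is only an illustration under stated assumptions: the
csw_records table, its columns, and the record template are
hypothetical, string.Template stands in for the Jinja template, and
sqlite3 stands in for the CKAN database.

```python
import sqlite3
from string import Template
from xml.sax.saxutils import escape

# Hypothetical stand-in for the Jinja template that renders a record.
RECORD_TEMPLATE = Template(
    "<record><identifier>$id</identifier><title>$title</title></record>"
)


def create_table(conn):
    # Hypothetical table that the CSW server would read from; it lives
    # next to the core tables but is not part of the core schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS csw_records ("
        "identifier TEXT PRIMARY KEY, title TEXT, xml TEXT)"
    )


def sync_package(conn, pkg):
    # Render the package dict through the template, then upsert the
    # result so that a repeated sync overwrites the earlier row.
    xml = RECORD_TEMPLATE.substitute(
        id=escape(pkg["id"]), title=escape(pkg["title"])
    )
    conn.execute(
        "INSERT OR REPLACE INTO csw_records (identifier, title, xml) "
        "VALUES (?, ?, ?)",
        (pkg["id"], pkg["title"], xml),
    )
    conn.commit()
    return xml
```

Calling sync_package() from whatever hook fires on dataset update
would keep the table current without a separate batch job.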

> An extension like ckanext-spatial can be configured to run that synchronization every time a dataset is updated, which makes the integration pretty seamless, even though it's not quite as ideal as I think you're imagining.
>

I think a logical next step might be to support local datasets via
synchronization, which at least gets us closer to the ultimate goal?
The question then becomes: how do we sync local records? Convert the
JSON to XML, or use Ryan's approach of reading from the CKAN database?
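
As a rough illustration of the JSON-to-XML option, the sketch below
maps a CKAN package dict to a minimal Dublin Core csw:Record using
only the standard library. The function name and the choice of
fields are assumptions for the example, not existing pycsw or
ckanext-spatial code, and a real mapping would need many more
elements.

```python
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"
DC_NS = "http://purl.org/dc/elements/1.1/"
DCT_NS = "http://purl.org/dc/terms/"

ET.register_namespace("csw", CSW_NS)
ET.register_namespace("dc", DC_NS)
ET.register_namespace("dct", DCT_NS)


def package_to_csw_record(pkg):
    # Hypothetical helper: build a csw:Record from a CKAN package dict.
    record = ET.Element("{%s}Record" % CSW_NS)
    ET.SubElement(record, "{%s}identifier" % DC_NS).text = pkg["id"]
    # Fall back to the URL name when no title was entered.
    title = pkg.get("title") or pkg["name"]
    ET.SubElement(record, "{%s}title" % DC_NS).text = title
    if pkg.get("notes"):
        ET.SubElement(record, "{%s}abstract" % DCT_NS).text = pkg["notes"]
    for tag in pkg.get("tags", []):
        ET.SubElement(record, "{%s}subject" % DC_NS).text = tag["name"]
    # Serialize to a unicode string rather than bytes.
    return ET.tostring(record, encoding="unicode")
```

Fed the JSON of a package with only a title and a URL, this yields a
record with just an identifier and a title.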

Does CKAN (or ckanext-spatial, for that matter) allow for extra fields
in the dataset model? If pycsw were to read straight off the CKAN db,
it would need a few more physical columns.

> One case to keep in mind: When records are harvested, the ideal CSW implementation will turn those harvested XML records around and provide them as GetRecordById responses completely unedited. If the XML record is converted to a CKAN package during harvest, then that CKAN package is re-converted to XML on CSW request, there is pretty much no chance that the document will be exactly the same as what was harvested.

For this, pycsw has a text column ('xml') that holds the full record
as XML. It is returned as-is for GetRecords or GetRecordById responses
when elementsetname=full (as an early out).
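
A minimal sketch of that early-out idea follows. It is not pycsw's
actual code: the records table layout (apart from the 'xml' column),
the function name, and the brief-record fallback are assumptions for
illustration, with sqlite3 standing in for the real database.

```python
import sqlite3
from xml.sax.saxutils import escape


def get_record_by_id(conn, identifier, elementsetname="full"):
    # Hypothetical lookup against an assumed 'records' table.
    row = conn.execute(
        "SELECT identifier, title, xml FROM records WHERE identifier = ?",
        (identifier,),
    ).fetchone()
    if row is None:
        return None
    ident, title, raw_xml = row
    if elementsetname == "full" and raw_xml:
        # Early out: hand back the stored document untouched, so a
        # harvested record comes back exactly as it came in.
        return raw_xml
    # Otherwise build a reduced record from the indexed columns.
    return (
        "<BriefRecord><identifier>%s</identifier>"
        "<title>%s</title></BriefRecord>" % (escape(ident), escape(title))
    )
```

Only the non-full element sets depend on the parsed columns, so the
original document is never re-serialized for a full request.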

> In that case, the approach CKAN currently takes is actually ideal. That leads to the question: what do you do if a harvested record is edited in the CKAN interface?
>

Does CKAN allow harvested records to be edited? What happens when they
are reharvested from upstream?

> The default criteria for a CKAN package is not sufficient to generate a valid ISO metadata record. I just generated this package: http://demo.ckan.org/dataset/minimum-content. I was required to enter two pieces of information: a title and a url. Here's the JSON serialization of that content: https://gist.github.com/rclark/7573839.
>
> Thanks for keying me in here -- sorry that this conversation somehow slipped under my radar.
>
> Ryan


