[ckan-dev] ckanext-spatial and pycsw synchronization workflow

Adrià Mercader adria.mercader at okfn.org
Fri Nov 22 12:23:03 UTC 2013


On 22 November 2013 05:48, Tom Kralidis <tomkralidis at gmail.com> wrote:
> On Wed, Nov 20, 2013 at 7:45 PM, Ryan Clark <ryan.clark at azgs.az.gov> wrote:
>> To be clear, is the real goal here ISO-compliant metadata, or is it access to CKAN sites through the CSW API?
>>

Ideally both, although they can happen in different stages. The
feedback we've got from users is that they would like to be able to
create ISO metadata records from CKAN (lots of portals have some
relation to INSPIRE), so I'd like to see efforts on the CSW front be
based on ISO if possible. Also, all support for spatial harvesting in
CKAN is based around ISO.


> Very good question.  Thinking about this more, in GeoNode we took ISO
> as our base model for metadata. In OpenDataCatalog, we used DC. In
> both cases pycsw binds directly against the underlying database,
> stores a full XML document in a column, and the downstream's
> application columns are mapped to pycsw's model by way of a Python
> dict.
One big difference from these catalogs, and the thing that makes it
difficult to implement a nice solution, is that CKAN was not designed
from the start as a catalog for geospatial data with a specific model
in mind. Its model has a set of minimal fields plus arbitrary fields
(extras), with the idea that developers can extend it or adapt it to
their own needs (and people use all kinds of models). Binding
directly to CKAN's tables would be difficult as the necessary fields
are spread across different tables. I think working at a higher level
with the results returned by the logic layer (basically what you see
on an API call) will make things easier.
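
A minimal sketch of what working at that level could look like, assuming
a dataset dict shaped like the output of CKAN's package_show action
(top-level title/notes/tags plus a list of key/value extras) and the
ckanext-spatial convention of a "spatial" extra holding a GeoJSON
geometry. The function names and the output field names are
hypothetical, chosen only for illustration; they are not pycsw's or
CKAN's own model.

```python
import json


def _positions(coords):
    """Yield every [x, y] position from nested GeoJSON coordinates."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for item in coords:
            yield from _positions(item)


def bbox_from_geojson(geom):
    """Return (minx, miny, maxx, maxy) for a GeoJSON geometry, or None."""
    points = list(_positions(geom.get("coordinates", [])))
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def flatten_dataset(pkg):
    """Flatten a package_show-style dict into one single-level record.

    The keys of the returned record are illustrative placeholders.
    """
    # Extras arrive as a list of {"key": ..., "value": ...} dicts.
    extras = {e["key"]: e["value"] for e in pkg.get("extras", [])}
    record = {
        "identifier": pkg.get("id"),
        "title": pkg.get("title") or pkg.get("name"),
        "abstract": pkg.get("notes"),
        "keywords": [t["name"] for t in pkg.get("tags", [])],
        "date_modified": pkg.get("metadata_modified"),
        "bbox": None,
    }
    # Assumption: the "spatial" extra is a GeoJSON geometry stored as a string.
    spatial = extras.get("spatial")
    if spatial:
        record["bbox"] = bbox_from_geojson(json.loads(spatial))
    return record
```

Everything the record needs comes from one dict, so nothing has to know
which database tables the core fields, tags and extras live in.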


> Having said this, Ryan's point clarifies things (a bit, for me at
> least).  I think we should provide DC as the goal of CKAN CSW support
> of local records, the advantage being pycsw runs read-only atop the
> CKAN database.  Harvested records, being in ISO already, can still
> exist and pycsw converts them on the fly to DC if needed.
I understand that this could simplify things, but see above for my
preference for ISO. Again, it would be really useful to have a list of
what we would need to generate a valid ISO doc (or DC), regardless of
where in CKAN we would get it from. If ISO turned out to be a huge pain
to generate, we could use DC and switch to generating ISO later on,
but I'd rather avoid the double work.
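
Once such a list exists, checking a local record against it is simple.
The sketch below assumes a flat dict of already-extracted values; the
field names in ASSUMED_REQUIRED are placeholders loosely modelled on
common core metadata elements, not the agreed list this thread is
asking for, and missing_fields is a hypothetical helper.

```python
# Placeholder checklist: an assumption for illustration, not an
# authoritative list of what a valid ISO (or DC) record requires.
ASSUMED_REQUIRED = (
    "identifier",
    "title",
    "abstract",
    "date_modified",
    "language",
    "topic_category",
    "contact",
)


def missing_fields(record, required=ASSUMED_REQUIRED):
    """Return the required field names that are absent or empty in record."""
    return [name for name in required if not record.get(name)]


if __name__ == "__main__":
    sample = {"identifier": "abc", "title": "Rivers", "abstract": ""}
    print(missing_fields(sample))
```

Running a check like this over existing local datasets would show how
much of the list CKAN can already fill in and where the gaps are.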


> If this is an iteration or two away or needs more thought, we could
> move ahead with the current sync approach and extend it to support
> local records, as a near term (easier) quick win.
>
> Comments?
Sounds great, let's focus on what we would need to generate ISO
records from local CKAN records.

Great discussion, thanks

Adrià


