[ckan4rdm] Short introduction of project EDaWaX

Wed Apr 24 08:31:54 UTC 2013

--On 2013-04-23 10:50, Ross Jones wrote:
> The CKAN default metadata schema is deliberately simple as CKAN
> has support for customising the metadata schema and it is
> expected that people would use that to get exactly what they
> want. The schema we use for data.gov.uk is different than that
> used by the default CKAN install, and is different again from
> say, the EC Open Data Portal ( http://open-data.europa.eu/ ).
> Apologies if you already know this, it wasn't clear.

Yes, I know this; it is what I am working on for our metadata
schema. And I really appreciate the approach of keeping the base
schema as simple as possible. On the other hand, if we want CKAN
to be used by research institutions and journals, we must provide
more metadata capabilities. Just compare the metadata features of
CKAN with those of Dataverse, for example. Dataverse even lets
users create their own metadata templates, based on the DDI
schema.

Enhanced metadata features would include not only fields for
input but also views for output (e.g. RDF). We cannot expect
every library interested in using CKAN to have the developer
resources to implement metadata schemas or other extensions. So
what we need to do, in my view, is develop extensions that
provide fields for the common metadata schemas (DataCite, DDI,
etc.) and share them via GitHub or similar (or, even better, make
them part of the CKAN distribution, like ckanext-stats). In
addition, CKAN should implement at least two or three more basic
metadata fields that are used by any schema, e.g. the persistent
identifier (DOI, Handle) or the year of publication. From a
research data perspective those metadata fields are essential.
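
To make the idea of a few shared basic fields more concrete, here
is a rough sketch in plain Python. It does not use any CKAN API;
the field names (persistent_identifier, publication_year), the
helper functions and the DOI pattern are assumptions made up for
illustration, and Handles are left out for brevity.

```python
import re
from datetime import date

# Assumption: a loose pattern for modern DOIs ("10.<registrant>/<suffix>").
# It is a plausibility check, not a full DOI syntax validator.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def validate_doi(value):
    """Return the stripped DOI if it looks well-formed, else raise ValueError."""
    value = value.strip()
    if not DOI_PATTERN.match(value):
        raise ValueError("not a valid DOI: %r" % value)
    return value


def validate_publication_year(value):
    """Convert the year to int and reject implausible values."""
    year = int(value)  # raises ValueError for non-numeric strings
    if not 1000 <= year <= date.today().year + 1:
        raise ValueError("implausible publication year: %d" % year)
    return year


# Hypothetical field names; not part of CKAN's default schema.
EXTRA_FIELDS = {
    "persistent_identifier": validate_doi,
    "publication_year": validate_publication_year,
}


def validate_extras(data):
    """Validate the extra fields present in data; return (clean, errors)."""
    clean, errors = {}, {}
    for name, validator in EXTRA_FIELDS.items():
        if name not in data:
            continue  # the extra fields are optional in this sketch
        try:
            clean[name] = validator(data[name])
        except ValueError as exc:
            errors[name] = str(exc)
    return clean, errors
```

Calling validate_extras() with a dataset dict returns the cleaned
values and a dict of per-field error messages, so a form could
show the errors next to the fields they belong to.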

> This is normally done via the CKAN extension mechanisms
> (IDatasetForm to be exact), and is even easier in CKAN 2 than
> it was in CKAN 1. It is also possible to 'type' datasets as
> well, each having a custom schema for the metadata if required.

Confirmed. We will use the IDatasetForm interface for our
metadata schema, and it is really quite easy to implement. I
started working on this last week and already have a working
system with the basic da|ra schema running (although I have some
issues with validation, but that's another story).
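
For readers who have not seen the pattern, the general idea of
extending a base schema for one dataset type can be sketched like
this. It is plain Python that only mimics the approach: nothing
here imports CKAN, and BASE_SCHEMA, DaraSchemaPlugin,
apply_schema and the field names are hypothetical stand-ins, not
our actual code and not the real IDatasetForm API.

```python
# Stand-in for the default dataset schema (assumption: heavily
# simplified). Each field maps to a chain of functions that are
# applied to the submitted value in order.
BASE_SCHEMA = {
    "name": [str.strip],
    "title": [str.strip],
}


def not_empty(value):
    """Reject empty values; pass everything else through unchanged."""
    if not value:
        raise ValueError("missing value")
    return value


class DaraSchemaPlugin:
    """Hypothetical plugin that adds fields for one dataset type."""

    dataset_type = "dara"  # assumed type name, for illustration only

    def extra_fields(self):
        # Hypothetical field names, not taken from the da|ra schema.
        return {
            "doi": [str.strip, not_empty],
            "publication_year": [str.strip, not_empty, int],
        }

    def create_schema(self):
        # Copy the base schema so the default stays untouched.
        schema = {field: list(chain) for field, chain in BASE_SCHEMA.items()}
        schema.update(self.extra_fields())
        return schema


def apply_schema(schema, data):
    """Run each field's chain over data; return (clean, errors)."""
    clean, errors = {}, {}
    for field, chain in schema.items():
        value = data.get(field, "")
        try:
            for step in chain:
                value = step(value)
            clean[field] = value
        except ValueError as exc:
            errors[field] = str(exc)
    return clean, errors
```

The point of the design is that the default schema stays small
and each dataset type layers its own fields on top of a copy.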

> Again, probably depends on exactly how everyone sees curation
> working, but curated datasets by trusted parties is something
> that will be appearing in data.gov.uk this year, and is likely
> to be released on github with our other extensions
> (https://github.com/datagovuk/).  It probably won't do exactly
> what is required, but I'm sure will make a good starting point.
> As it is possible that it will be me doing the work, I'm open
> to suggestions on what would be useful to the RDM community as
> well.

Could you outline the basic idea of this and the general approach
or architecture? That is, what is your idea of curation in the
context of data.gov.uk?

thanks and regards
hendrik

-- 
Dr. Hendrik Bunke
ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften
--Innovative Informations- und Publikationstechnologien--
Tel.: +49 40 42834 454 (Hamburg) OR +49 421 7940430 (homeoffice)
http://zbw.eu