[ckan4rdm] Short introduction of project EDaWaX

Wed Apr 24 08:58:50 UTC 2013

On 24 Apr 2013, at 09:31, Hendrik Bunke wrote:
> Enhanced metadata features would not only include fields for
> input but also views for output (like e.g. for RDF).  We cannot

Just to save you a *little* time on the RDF serialisation, I added 
a basic implementation of DCAT a while back (when I was allowed to 
push to core ;) ) that should be available via content negotiation
(conneg) and linked in the HTML, but a different serialisation per
schema might be a really nice feature to implement. Providing VoID
info is on my rather long 'todo' list.
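
As a rough illustration of what fetching the RDF via content
negotiation could look like from a client, here is a minimal Python
sketch. The host, the dataset name and the choice of
application/rdf+xml as the media type are assumptions for
illustration only, not confirmed details of the CKAN implementation.

```python
from urllib.request import Request, urlopen

# Hypothetical dataset URL; substitute a real CKAN instance and dataset name.
DATASET_URL = "http://ckan.example.org/dataset/example-dataset"


def build_rdf_request(url, media_type="application/rdf+xml"):
    """Build a GET request asking the server for an RDF representation."""
    # The Accept header is what drives content negotiation on the server.
    return Request(url, headers={"Accept": media_type})


def fetch_rdf(url):
    """Fetch the dataset description as RDF (performs a network request)."""
    with urlopen(build_rdf_request(url)) as response:
        return response.read().decode("utf-8")
```

Calling fetch_rdf(DATASET_URL) against a real instance would return
the serialised description as text, provided the server honours the
Accept header.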

> So, what we need to do is, in my view, develop extensions that
> provide fields for the common metadata schemas (Datacite, DDI
> etc.) and share them via github or whatever (or even better make
> them part of the CKAN distribution, like ckanext-stats). And, in
> addition to that, CKAN should implement at least two or three
> more basic metadata fields, that are used by any schema. That
> could be e.g. the Persistent Identifier (DOI, Handle) or the year
> of publication.  From a research data perspective those metadata
> fields are essential.

I am 100% in agreement; I think starting a ckanext-rdm extension to 
hold these things is a great idea.  The DOI could be added to a new 
base schema in the extension until we can persuade the CKAN team to 
add it as a core field.  This would help us bridge the gap between
now and when a version of CKAN is released with DOI built in. I'm not
sure what the current position is on shipping plugins with core. As you
point out, there are a few that are already shipped that way, but I
suspect it would only happen after ckanext-rdm had seen some real-world
use.
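
To make the idea of extension-level fields a bit more concrete, here
is a minimal sketch in plain Python of validating a DOI and a year of
publication. It deliberately does not use the CKAN plugin API; the
field names "doi" and "publication_year", the function name and the
DOI pattern are all assumptions for illustration.

```python
import re

# Common DOI shape: "10.", a registrant code of 4-9 digits, "/", a suffix.
# This is a pragmatic pattern, not a full statement of the DOI syntax.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def validate_rdm_fields(data):
    """Return a dict mapping field name to error message for bad values.

    Both fields are treated as optional; only values that are present
    are checked.
    """
    errors = {}
    doi = data.get("doi", "")
    if doi and not DOI_PATTERN.match(doi):
        errors["doi"] = "Not a well-formed DOI"
    year = data.get("publication_year")
    if year is not None and not (isinstance(year, int) and 1000 <= year <= 9999):
        errors["publication_year"] = "Expected a four-digit year"
    return errors
```

In a real extension, checks like these would be wired into whatever
schema mechanism CKAN offers to plugins rather than called directly.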

> Could you outline the basic idea of this and the general approach
> or architecture? That is, what is your idea of curation in the
> context of data.gov.uk?

Currently DGU (data.gov.uk) provides authorisation using groups,
although we call them publishers. This is now the default model in
CKAN 2.0, but we still run an older version. Membership of a publisher
is what gives you access to manage the datasets within it.  Although
most datasets are currently grouped within those publishers, I believe
the plan is for controlled sets of users who are not necessarily
publishers to be able to curate datasets into themed collections.
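
The membership rule described above can be sketched in a few lines of
Python. This is only an illustration of the idea, not DGU's or CKAN's
actual code; the data layout, the "publisher" key and all names are
hypothetical.

```python
# Hypothetical in-memory data: publisher name -> set of member user names.
MEMBERSHIPS = {
    "publisher-a": {"alice", "bob"},
    "publisher-b": {"carol"},
}


def can_manage(user, dataset, memberships):
    """A user may manage a dataset only if they belong to its publisher."""
    publisher = dataset.get("publisher")
    # Unknown or missing publishers have no members, so access is denied.
    return user in memberships.get(publisher, set())
```

Curation into themed collections by users who are not members of a
publisher would need a second, separate rule alongside this one.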

Beyond that I don't yet have much detail on how it should be
implemented or exactly what is required, so I think this could be
the perfect time to get an idea of how everybody thinks curation
should, or could, work.  Perhaps a new mail thread just to discuss
what people would like from data curation?

I should point out that I'm not a researcher; I'm just extremely
interested in the problems that researchers encounter in managing
data, and I like solving problems.

Ross