[open-geodata] Geodata discovery/reusability

Jonathan Gray jonathan.gray at okfn.org
Tue Mar 2 13:35:07 UTC 2010


Hi all,

Quick email to kickstart discussion around geodata discovery and
packaging. In particular focused on how we can build on CKAN to
support this!

Gobe, Mike, Suchith: I seem to recall that we discussed the
possibility of CGS putting together some brief notes on CKAN user
experience and ideas for improvements to support geodata. What do you
think? Also would be interested to hear if you have any Python
developers at CGS that might be willing to look at the code:

  http://knowledgeforge.net/ckan/trac

Jo Walsh at EDINA has started writing about this in an email (copied
below) and at:

  http://wiki.osgeo.org/wiki/Location_in_CKAN
  http://docs.google.com/Doc?docid=0ATJnv_t9ROmXZGN0Yjk3ampfMjJkemtxaGdjNQ&hl=en&pli=1

Any thoughts and suggestions most welcome!

All the best,

Jonathan

---------- Forwarded message ----------
From: Jo Walsh <jo at frot.org>
Date: Mon, Mar 1, 2010 at 4:37 PM
Subject: Re: [ckan-discuss] Fwd: JISC Grant Funding 14/09: Managing
Research Data Programme
To: ckan-discuss at lists.okfn.org


dear all,

On 01/03/2010 14:21, Jonathan Gray wrote:
>
> I wonder if CKAN is a good fit for "Strand A: Citing, Linking and
> Integrating Research Data"?
>
> Does anyone have any thoughts? In particular about using CKAN for
> working with data in different domains?

I've been thinking about the geographic application of CKAN in this
context: the benefits of annotating locations for packages in general,
the value of adding GIS-specific metadata using plugins, and the
creation of an OpenSearch interface to do location-based queries.
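
To make the location-based query idea concrete, here is a minimal
sketch in Python. It is an illustration under stated assumptions, not
CKAN's actual API: the "bbox" field on a package, the sample records
and the search_by_box helper are all hypothetical. The query string
follows the west,south,east,north ordering used by the "box" parameter
of the OpenSearch Geo extension.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in decimal degrees (WGS84 assumed)."""
    west: float
    south: float
    east: float
    north: float


def parse_box(param: str) -> BBox:
    """Parse a 'west,south,east,north' query string into a BBox."""
    west, south, east, north = (float(v) for v in param.split(","))
    return BBox(west, south, east, north)


def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes overlap or touch.

    Boxes crossing the antimeridian are not handled in this sketch.
    """
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


# Hypothetical package records; "bbox" is an assumed extra field,
# not something CKAN is known to provide.
PACKAGES = [
    {"name": "example-uk-dataset", "bbox": BBox(-8.6, 49.9, 1.8, 60.9)},
    {"name": "example-australia-dataset",
     "bbox": BBox(113.0, -44.0, 154.0, -10.0)},
    {"name": "example-no-location"},  # packages without a location are skipped
]


def search_by_box(packages, param):
    """Return names of packages whose bbox intersects the query box."""
    query = parse_box(param)
    return [p["name"] for p in packages
            if "bbox" in p and intersects(p["bbox"], query)]
```

For example, search_by_box(PACKAGES, "-1.5,52.8,-1.0,53.1") queries a
small box around Nottingham. A real implementation would more likely
push the test into a spatial index than scan every package.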

Jonny and I discussed with several people at the Centre for Geospatial
Science in Nottingham the possibility of cooperating on a proposal.
The CFP requires an English or Welsh academic institution to lead the
proposal - perhaps EDINA isn't eligible as a lead partner - though
colleagues here are happy with the idea of participating too.

A consortium is welcome but involves more upfront planning to ensure
its viability as a partnership. Thus I would like to put something
together that creates shared interfaces across several
repository/registry systems.

http://docs.google.com/Doc?docid=0ATJnv_t9ROmXZGN0Yjk3ampfMjJkemtxaGdjNQ&hl=en
- very rough notes here. At the top is a short excerpt covering the
purpose of the call and key criteria for Strand A as defined by JISC,
also included here:

     JISC wishes to fund projects ... to demonstrate the innovative
potential for research and scholarly communications of improving
methods for citing, linking and integrating research data. It is
intended that such projects should contribute to the realisation of an
integrated and interoperable data environment, encourage innovative
techniques which promote the reuse of research data and demonstrate
the benefits which may follow from such an environment.

     JISC is intending to release another call for proposals in March
2010, focusing on 'Innovative ways to deposit and expose digital
resources'. This will cover some of the same ground as Strand A of the
present Call. However the differences are as follows: Strand A of the
present Call is specifically concerned with exploring methods for
citing, integrating and linking research data. The forthcoming
‘Deposit and Expose’ Call will cover research outputs, including data,
and other types of digital content such as images, timebased media
etc.

...

Some disciplines have been revolutionised by the community’s adoption
of open data principles. The innovative and transforming potential of
data reuse, recombinations or ‘mashups’ – for example those combining
data with geospatial location components – is a growing source of
academic interest and is generating palpable excitement both within
and beyond the academy.

Within [Strand A], projects are sought which will:

        1. facilitate publication of open access research datasets
accompanied by appropriate ontology-based metadata and licenses.
        2. examine optimal conventions for research data citation,
exploring and demonstrating the benefits of particular forms of
citation.
        3. demonstrate approaches to, and explore the benefits of,
integrating heterogeneous data across distributed sources.
        4. examine methods of recording provenance information, not
only for the datasets but also for their citations and links.
        5. explore and scope specific challenges and demonstrate the
benefits to research which may accrue from the bidirectional linking
of research data to other research data, to publications, or to
people. Much of the scope here is laid out in the Coles & Frey
position paper ‘The Relevance of Linking’:
http://ie-repository.jisc.ac.uk/419/

Issues related to citing, integrating and linking research data that
may be addressed include, but are not limited to:

        1. Development of annotation services that ease the task of
metadata assignment to research datasets, thus facilitating their
publication.
        2. Understanding the services (e.g. discovery, citation, link
management) that are needed and those that can be expected to grow
around linking research data.
        3. The required conventions for data citation (exploring and
demonstrating the benefits of particular forms of citation).
        4. The challenges associated with persistent identifiers for
data, especially those associated with versioning and granularity:
i.e. what works for different types of research data?
        5. The role of ontologies, schemas, representation
information, contextual and calibration metadata and/or other
documentation in providing sufficient information for the proper
re-use of research data.
        6. The role of the Semantic Web / ‘Linked Data’ approach, the
use of RDF and related standards such as RDFa, SKOS, FOAF and SPARQL,
to enhance reuse and repurposing of research data.
        7. Exploration and application of the recommendations made in
the Cabinet Office’s Report on ‘Designing URI Sets for the UK Public
Sector’ for scholarly research data.
        8. Integration of heterogeneous data using a cluster of
technologies including RDF, relational to RDF mappings, SPARQL
services etc.
        9. The requirement for tools to process and analyse links and
the interaction with linked resources.
       10. Exploration of methods for assigning provenance information
to citations and links.
       11. Protocols for automated linking, and the potential roles of
OAI-ORE, ATOM Publishing Protocol, RSS, aggregators and syndication.
       12. The effect of database architecture, storage mechanisms
etc. on the linking process.

[Footnote to item 1 above] E.g. see the Danno annotation service and
associated tools being created by eResearch, University of Queensland,
http://www.itee.uq.edu.au/~eresearch/projects/diasb/index.php

If the bid is from a consortium:

i) have the partners provided evidence of their commitment in the form
of supporting letters?
ii) have the partners demonstrated how the work aligns with their
objectives and priorities?
iii) is it clear what the role of each partner is and how the actual
or planned management structure, governance, decision-making and
funding arrangements will function?

cheers,


jo



-- 
Jonathan Gray

Community Coordinator
The Open Knowledge Foundation
http://blog.okfn.org

http://twitter.com/jwyg
http://identi.ca/jwyg



