[ckan-discuss] CKAN - Google Refine integration

Brand Niemann bniemann at cox.net
Tue Apr 26 13:58:41 BST 2011


I would like to see these Excel files converted to RDF -
http://semanticommunity.net/StatAbs2011/

They are the best US data and have metadata embedded in the Excel files
as well.

Brand

-----Original Message-----
From: ckan-discuss-bounces at lists.okfn.org
[mailto:ckan-discuss-bounces at lists.okfn.org] On Behalf Of Monika Solanki
Sent: Tuesday, April 26, 2011 8:25 AM
To: ckan-discuss at lists.okfn.org
Subject: Re: [ckan-discuss] CKAN - Google Refine integration

Hi Fadi,

An important part of the last step IMHO is the provision of voID files for
the new RDF datasets. I assume the datasets will become part of the LOD
cloud at some point, so there is mileage in having their voID descriptions
alongside the package metadata provided by CKAN at http://semantic.ckan.net.

Would the dataset providers be notified of their RDF datasets once they
become available? It would be worth pointing them to the voID editor. I
assume it may be difficult to fill in values for many of the voID attributes
automatically from their CSV/Excel representations.
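
To make the point concrete, here is a rough sketch of what a minimal voID
description could look like for one converted dataset. It only uses a few
core terms (void:Dataset, void:dataDump, void:triples, dcterms:title); the
function name, dataset URI, title, dump URL and triple count are all
hypothetical placeholders, not anything CKAN or the voID editor provides.

```python
# Minimal sketch: emit a voID description in Turtle for one RDF dataset.
# All concrete values passed in below are hypothetical placeholders.

def void_description(dataset_uri, title, dump_url, triples):
    """Return a Turtle string describing a dataset with a few core voID terms."""
    # Escape backslashes and double quotes so the title is a valid Turtle literal.
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{safe_title}" ;',
        f"    void:dataDump <{dump_url}> ;",
        f"    void:triples {int(triples)} .",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(void_description(
        "http://example.org/dataset/sample",   # hypothetical dataset URI
        "Sample converted dataset",            # hypothetical title
        "http://example.org/dumps/sample.rdf", # hypothetical dump location
        1000,                                  # hypothetical triple count
    ))
```

The title could plausibly be taken from the CKAN package metadata, but
attributes such as link sets or example resources would most likely still
need input from the dataset provider.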

Monika

On 26/04/11 11:50, Maali, Fadi wrote:
> Hi all,
>
> This has been discussed here before and was also discussed in the last 
> CKAN online community meetup.
>
> I will describe here a scenario Richard Cyganiak and I are working on.
> Our goal is to help publish datasets currently available in CSV or
> Excel format as Linked Data.
>
> 1. Navigate the packages available in a CKAN catalogue from within
> Google Refine. This is currently implemented as an extension to Google
> Refine and uses the RDF representation of CKAN catalogues (such as the
> ones available at http://semantic.ckan.net). Any package that has a
> resource understandable by Google Refine (e.g. CSV, Excel, TSV...)
> can be opened as a Google Refine project.
> 2. Google Refine is used to conduct any data cleaning and
> transformation required.
> 3. Using the "RDF Extension for Google Refine" (available at:
> http://lablab.linkeddata.deri.ie/2010/grefine-rdf-extension/ ) the
> data can be exported as RDF.
> 4. The resulting RDF data is saved back to CKAN and linked to the
> respective package.
>
> It is actually the last step that is still missing some details and
> requires discussion. Our tentative ideas about it:
> - the resulting data is saved to storage.ckan.net (we need help from the
> CKAN guys here)
> - the resulting data is considered a new resource of the existing package.
> This is automatically registered through the CKAN API.
> - along with the RDF data we save the JSON representation of all
> Google Refine operations that have been applied to the original data,
> i.e. anyone starting with the CSV file on CKAN can re-apply the
> operations using the JSON representation in Google Refine to get an
> exact copy of the RDF data
>
> Does that look reasonable? Any feedback?
>
> Regards,
> Fadi
>
> _______________________________________________
> ckan-discuss mailing list
> ckan-discuss at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/ckan-discuss
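
For the last step in Fadi's message (registering the RDF export and the
Google Refine operation history as new resources of the existing package),
the metadata to be sent might look roughly like the sketch below. This is
only an illustration: the field names (package_id, url, format,
description) are modelled on CKAN's later resource_create action and are an
assumption here, the CKAN API available at the time of this thread may have
differed, and the package name and URLs are hypothetical. The sketch only
builds the payloads; it does not call any API.

```python
import json

# Minimal sketch: describe the two new resources to attach to a CKAN package.
# Field names are an assumption modelled on CKAN's resource_create action.

def build_resources(package_id, rdf_url, operations_url):
    """Return resource descriptions for the RDF export and its Refine history."""
    rdf_resource = {
        "package_id": package_id,
        "url": rdf_url,
        "format": "RDF",
        "description": "RDF export produced with Google Refine",
    }
    operations_resource = {
        "package_id": package_id,
        "url": operations_url,
        "format": "JSON",
        "description": ("Google Refine operation history that reproduces "
                        "the RDF export from the original file"),
    }
    return [rdf_resource, operations_resource]


if __name__ == "__main__":
    resources = build_resources(
        "example-package",                              # hypothetical package name
        "http://storage.example.org/example.rdf",       # hypothetical RDF location
        "http://storage.example.org/example-ops.json",  # hypothetical history file
    )
    print(json.dumps(resources, indent=2))
```

Keeping the operation history as a separate resource next to the RDF file
means anyone starting from the original CSV can find both in one place.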






