[ckan-discuss] CKAN - Google Refine integration

Monika Solanki monika.solanki at gmail.com
Tue Apr 26 13:24:45 BST 2011


Hi Fadi,

An important part of the last step, IMHO, is the provision of voID files 
for the new RDF datasets. I assume the datasets will become part of the 
LOD cloud at some point, so there is mileage in having their voID 
descriptions alongside the package metadata provided by CKAN at 
http://semantic.ckan.net.

Would the dataset providers be notified of their RDF datasets once they 
become available? It would be worth pointing them to the voID editor. I 
assume it may be difficult to fill in values for many of the voID 
attributes automatically from their CSV/Excel representations.
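
To illustrate the point, here is a minimal sketch of the part of a voID
description that could plausibly be filled in automatically. It assumes
the title, the source CSV URL and the RDF dump URL are already known
from the CKAN package; the function name and the example.org URIs are
made up for illustration. Attributes such as example resources,
vocabularies used or linksets would still need input from the provider.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def void_description(dataset_uri, title, dump_url, source_url, triples=None):
    """Return a minimal voID description of one dataset, in Turtle.

    Hypothetical helper: only fields that are already known from the
    package metadata are emitted.
    """
    lines = [
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset_uri}> a void:Dataset ;",
        f'    dcterms:title "{escape_literal(title)}" ;',
        f"    dcterms:source <{source_url}> ;",
    ]
    # The triple count is only known after the RDF export, so it is optional.
    if triples is not None:
        lines.append(f"    void:triples {int(triples)} ;")
    lines.append(f"    void:dataDump <{dump_url}> .")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(void_description(
        "http://example.org/dataset/demo",
        "Demo dataset",
        "http://example.org/dataset/demo.rdf",
        "http://example.org/dataset/demo.csv",
    ))
```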

Monika

On 26/04/11 11:50, Maali, Fadi wrote:
> Hi all,
>
> This has been discussed here before and was also discussed in the last
> CKAN online community meetup.
>
> I will describe here a scenario Richard Cyganiak and I are working on.
> Our goal is to help publish datasets currently available in CSV or
> Excel format as Linked Data.
>
> 1. navigate the packages available in a CKAN catalogue from within
> Google Refine. This is currently implemented as an extension to Google
> Refine and uses the RDF representation of CKAN catalogues (such as the
> ones available at http://semantic.ckan.net). Any package that has a
> resource understandable by Google Refine (e.g. CSV, Excel, TSV...) can
> be opened as a Google Refine project.
> 2. Google Refine is used to conduct any data cleaning and transformation
> required.
> 3. using the "RDF Extension for Google Refine" (available at:
> http://lablab.linkeddata.deri.ie/2010/grefine-rdf-extension/ ) the data
> can be exported as RDF
> 4. The resulting RDF data is saved back to CKAN and linked to the
> respective package.
>
> It is the last step actually that is still missing some details and
> requires discussion. Our tentative ideas about it:
> - the resulting data is saved to storage.ckan.net (we need help from
> CKAN guys here)
> - the resulting data is considered a new resource of the existing
> package. This is automatically registered through the CKAN API.
> - along with the RDF data we save the JSON representation of all Google
> Refine operations that have been applied to the original data, i.e.
> anyone starting with the CSV file on CKAN can re-apply the operations
> using the JSON representation in Google Refine to get an exact copy of
> the RDF data
>
> Does that look reasonable? Any feedback?
>
> Regards,
> Fadi
>
> _______________________________________________
> ckan-discuss mailing list
> ckan-discuss at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/ckan-discuss
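
The tentative ideas in the quoted message (the RDF as a new resource of
the existing package, stored together with the Refine operation history)
could be sketched roughly as follows. This is only an illustration under
assumptions: it builds a plain record and makes no CKAN API call, the
"derived_from" and "refine_operations" keys are hypothetical, and the
single rename operation and the example.org URLs are made-up samples.

```python
import json


def build_rdf_resource(rdf_url, source_csv_url, operations):
    """Describe an exported RDF file as a new resource of a package.

    Hypothetical helper: 'url', 'format' and 'description' are ordinary
    resource fields; the last two keys are assumptions for illustration.
    """
    # Serialise the operation history so it can be stored as one string
    # and later pasted back into Google Refine to replay the steps.
    history = json.dumps(operations, sort_keys=True)
    return {
        "url": rdf_url,
        "format": "RDF",
        "description": "RDF export derived from %s with Google Refine"
                       % source_csv_url,
        "derived_from": source_csv_url,   # assumed key
        "refine_operations": history,     # assumed key
    }


# Made-up operation history with a single column rename.
sample_operations = [
    {
        "op": "core/column-rename",
        "oldColumnName": "nm",
        "newColumnName": "name",
    },
]

record = build_rdf_resource(
    "http://example.org/storage/demo.rdf",
    "http://example.org/data/demo.csv",
    sample_operations,
)
```

Keeping the operation history next to the RDF means the export can be
reproduced from the original CSV without access to the Refine project.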
