[okfn-discuss] How to add metadata to data sets on the portal ?(Request for cooperation)

Masahiko SHOJI Shoji at glocom.ac.jp
Mon Aug 8 10:04:51 UTC 2016


Dear Bart and Christian,

Thank you for the useful information.  I will base my report on
it.

The Japanese government seems to be considering services that
extract semantic information from newly opened data and documents to
generate metadata, though I do not know the details.  It might be a kind
of advanced harvesting. In any case, they are very motivated.

Thank you again.

Masa SHOJI
Open Knowledge Japan
shoji at glocom.ac.jp


2016-08-03 16:42 GMT+09:00 Christian Ledermann <christian.ledermann at gmail.com>:
> On 1 August 2016 at 23:02, Bart Hanssens <Bart.Hanssens at fedict.be> wrote:
>> Hi,
>>
>> Well, the pan-EU data portals do use extra services, although technically they are not part of the crawler.
>> A machine translator, for instance, for translating metadata titles and short descriptions.
>> And maybe a geo-coding service for mapping regions to geo-coordinates.
>>
>> But they are based upon some metadata that is already available (title in another language, or name of region)
>>
>> If your data is stored in a CMS, the crawler can probably harvest some trivial metadata (update time, name...).
>> The file format can be derived from the file extension.
>>
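
To illustrate the point above about trivial metadata, here is a minimal Python sketch of what a harvester could derive from a file alone. It is an assumption-laden example, not how any particular portal does it: the names `FORMAT_BY_EXTENSION`, `guess_format` and `trivial_metadata` are hypothetical, and the extension table is a placeholder for a portal's own controlled vocabulary.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical mapping from file extension to a format label; a real
# portal would use its own controlled vocabulary instead.
FORMAT_BY_EXTENSION = {
    ".csv": "CSV",
    ".json": "JSON",
    ".xml": "XML",
    ".pdf": "PDF",
    ".xlsx": "XLSX",
}


def guess_format(filename):
    """Derive a format label from the file extension (case-insensitive)."""
    extension = Path(filename).suffix.lower()
    return FORMAT_BY_EXTENSION.get(extension, "UNKNOWN")


def trivial_metadata(path):
    """Collect name, format, media type and update time for a local file."""
    p = Path(path)
    media_type, _encoding = mimetypes.guess_type(p.name)
    modified = datetime.fromtimestamp(p.stat().st_mtime, tz=timezone.utc)
    return {
        "name": p.stem,                   # file name without extension
        "format": guess_format(p.name),
        "media_type": media_type,         # None if the extension is unknown
        "modified": modified.isoformat(), # last modification time, in UTC
    }
```

Anything beyond this (title, description, theme) still has to come from a person or from another service, which is the limitation noted above.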
>>
>> There are also services that try to extract semantic info from (meta)data, but I don't think they are used by open data portals.
>> They are targeted more at news items, documents, etc. For instance, http://www.opencalais.com/about-open-calais/
>>
>
> An OSS alternative is Apache Stanbol: https://stanbol.apache.org/
>
>> I assume that, if you have enough datasets that are already labeled correctly (category / theme from a limited list),
>> it would be possible for a crawler to automatically classify new datasets (using a machine learning library).
>>
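
As a rough sketch of the classification idea above, the following Python example trains a small multinomial naive Bayes classifier (with add-one smoothing) on dataset titles that already carry a theme, then assigns a theme to a new title. Everything here is an assumption for illustration: the `ThemeClassifier` class, the sample titles and the theme names are made up, and a real crawler would more likely rely on an established machine learning library and far more training data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class ThemeClassifier:
    """Multinomial naive Bayes with add-one smoothing over title words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # theme -> word -> count
        self.doc_counts = Counter()              # theme -> number of datasets
        self.vocabulary = set()

    def train(self, labeled):
        for text, theme in labeled:
            words = tokenize(text)
            self.word_counts[theme].update(words)
            self.doc_counts[theme] += 1
            self.vocabulary.update(words)

    def classify(self, text):
        """Return the theme with the highest log-probability, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_theme, best_score = None, -math.inf
        for theme, n_docs in self.doc_counts.items():
            counts = self.word_counts[theme]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # prior for this theme
            for word in tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_theme, best_score = theme, score
        return best_theme


# Made-up training examples: (title, theme from a limited list).
LABELED = [
    ("Bus and train timetables", "transport"),
    ("Road traffic counts by region", "transport"),
    ("Hospital bed capacity", "health"),
    ("Vaccination rates by age group", "health"),
]

classifier = ThemeClassifier()
classifier.train(LABELED)
print(classifier.classify("Regional train traffic"))  # prints: transport
```

With only a handful of labeled titles the result is fragile, so in practice the predicted theme would probably be offered to the official as a suggestion rather than written to the portal directly.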
>>
>> Best regards
>>
>> Bart
>>
>> -----Original Message-----
>> From: okfn-discuss [mailto:okfn-discuss-bounces at lists.okfn.org] On Behalf Of Masahiko SHOJI
>> Sent: Monday 1 August 2016 10:59
>> To: Open Knowledge Foundation discussion list <okfn-discuss at lists.okfn.org>
>> Subject: Re: [okfn-discuss] How to add metadata to data sets on the portal ?(Request for cooperation)
>>
>> Hi Bart,
>>
>> Thank you for your kind reply.  I should clarify my question, but your comment is very useful for me.
>>
>> The Japanese government wants to know efficient ways for officials of the various ministries to register new data sets as part of their daily work.  They seem to find adding metadata manually a burden.
>>
>> I have heard that some countries may be using a crawler which automatically adds metadata. I do not know what kind of metadata it adds.  Do you have any information about this?
>>
>>
>> Best regards,
>>
>> Masa SHOJI
>> Representative Director
>> Open Knowledge Japan
>>
>>
>>
>>
>> 2016-07-29 5:47 GMT+09:00 Bart Hanssens <Bart.Hanssens at fedict.be>:
>>> Hi,
>>>
>>> It probably depends on what you mean by adding metadata on the portal, and how the portal is maintained.
>>> Is this about adding extra metadata, after the datasets are published on the portal ?
>>> Are the datasets on the Japanese portal maintained manually, or pushed to the portal by an automated process ?
>>>
>>> E.g. data.gov.be, the national portal in Belgium, gets its
>>> (meta)data from various other (regional) portals, and from different websites, by scraping (HTML sites) or using an API (CKAN, OpenDataSoft or other software).
>>> The metadata from all portals is transformed to a DCAT-AP file, which is used to update the data.gov.be site.
>>>
>>> We don't add metadata on data.gov.be itself anymore; everything is
>>> done before it gets uploaded.
>>>
>>> Sometimes extra metadata is added (e.g. mapping of free text keywords
>>> to themes/categories, or "missing" language tags, or a default email
>>> contact...), or data is corrected, etc.
>>>
>>> Everything is command-line (basically a small Java program that runs
>>> various SPARQL files), and some mapping files have to be created
>>> manually (typically a SKOS file).
>>>
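
The enrichment steps described above (keywords mapped to themes, missing language tags, a default contact) can be sketched in a few lines of Python. This is not the dcattools approach, which uses a Java program running SPARQL files with SKOS mappings; it is a simplified stand-in using plain dictionaries. The names `KEYWORD_TO_THEME`, `DEFAULT_LANGUAGE`, `DEFAULT_CONTACT` and `enrich`, and all values in them, are hypothetical.

```python
# Hypothetical keyword-to-theme table; in the setup described above this
# role is played by a manually maintained SKOS mapping file.
KEYWORD_TO_THEME = {
    "bus": "transport",
    "railway": "transport",
    "hospital": "health",
    "budget": "government",
}
DEFAULT_LANGUAGE = "en"                   # assumed default language tag
DEFAULT_CONTACT = "opendata@example.org"  # placeholder contact address


def enrich(record):
    """Return a copy of a metadata record with missing fields filled in."""
    enriched = dict(record)
    keywords = [k.strip().lower() for k in record.get("keywords", [])]
    # Map free-text keywords to themes, ignoring keywords without a mapping.
    themes = sorted({KEYWORD_TO_THEME[k] for k in keywords if k in KEYWORD_TO_THEME})
    if themes and not enriched.get("themes"):
        enriched["themes"] = themes
    # Only fill in language and contact when the source did not provide them.
    enriched.setdefault("language", DEFAULT_LANGUAGE)
    enriched.setdefault("contact", DEFAULT_CONTACT)
    return enriched
```

Running this before upload, rather than editing records on the portal, matches the workflow described above: the portal only ever receives the already enriched metadata.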
>>> It's not "pretty", and not very advanced, but it works for our purposes...
>>>
>>> See https://github.com/Fedict/dcattools
>>>
>>>
>>> Best regards,
>>>
>>> Bart Hanssens
>>> Interoperability expert
>>> Federal Public Service ICT Belgium
>>>
>>> -----Original Message-----
>>> From: okfn-discuss [mailto:okfn-discuss-bounces at lists.okfn.org] On
>>> Behalf Of Masahiko SHOJI
>>> Sent: Thursday 28 July 2016 11:49
>>> To: Open Knowledge Foundation discussion list
>>> <okfn-discuss at lists.okfn.org>
>>> Subject: [okfn-discuss] How to add metadata to data sets on the portal
>>> ?(Request for cooperation)
>>>
>>> Hi all,
>>>
>>> The Japanese government is asking me about efficient ways to add metadata
>>> to the enormous number of data sets on the government open data portal site.   I
>>> would appreciate it if you could help with my questions.
>>>
>>> 1. How do government officials add metadata to each data set on a government data portal site like "data.gov"?
>>>
>>> 2. Does the government have any tools (crawlers, etc.) to add metadata,
>>> or any plans to develop them?
>>>
>>> 3. Who is knowledgeable about this issue?  Which department is in charge of it?
>>>
>>> 4. Do you have any related information or any outlook on this issue?
>>>
>>> Thank you.
>>>
>>> Masa Shoji
>>> Representative Director
>>> Open Knowledge Japan
>>> _______________________________________________
>>> okfn-discuss mailing list
>>> okfn-discuss at lists.okfn.org
>>> https://lists.okfn.org/mailman/listinfo/okfn-discuss
>>> Unsubscribe: https://lists.okfn.org/mailman/options/okfn-discuss
>
>
>
> --
> Best Regards,
>
> Christian Ledermann
>
> Newark-on-Trent - UK
> Mobile : +44 7474997517
>
> https://uk.linkedin.com/in/christianledermann
> https://github.com/cleder/
>
>
> <*)))>{
>
> If you save the living environment, the biodiversity that we have left,
> you will also automatically save the physical environment, too. But If
> you only save the physical environment, you will ultimately lose both.
>
> 1) Don’t drive species to extinction
>
> 2) Don’t destroy a habitat that species rely on.
>
> 3) Don’t change the climate in ways that will result in the above.
>
> }<(((*>


