[ckan-discuss] Topic list (vocabulary/valuelist) for classifying datasets

Plantinga, Edo Edo.Plantinga at koop.wmrijk.nl
Wed Jun 13 11:13:47 BST 2012


Hi Pascal,

Thank you for your swift response. 
Looking at the GEMET vocabulary, it seems to be limited to geographic
datasets. Naturally, many open datasets are geographic datasets.
However, there are also many datasets outside this domain. How would you
classify the datasets on the EU portal that currently fall under the
categories Finance and Budgeting, Education and Communication, Economy
and Industry, Social Questions, Population & Health? The GEMET
vocabulary does not seem suitable for this.
A subset of EUROVOC, while still not perfect for the datasets we have,
seems to be a better fit for datasets outside the geo domain, in my
opinion. Incidentally, I can see a fairly big overlap between the
current EU portal topics and the EUROVOC topics.

These are the EUROVOC categories I am referring to, to avoid confusion.
  04 POLITICS
  08 INTERNATIONAL RELATIONS
  10 EUROPEAN COMMUNITIES
  12 LAW
  16 ECONOMICS
  20 TRADE
  24 FINANCE
  28 SOCIAL QUESTIONS
  32 EDUCATION AND COMMUNICATIONS
  36 SCIENCE
  40 BUSINESS AND COMPETITION
  44 EMPLOYMENT AND WORKING CONDITIONS
  48 TRANSPORT
  52 ENVIRONMENT
  56 AGRICULTURE, FORESTRY AND FISHERIES
  60 AGRI-FOODSTUFFS
  64 PRODUCTION, TECHNOLOGY AND RESEARCH
  66 ENERGY
  68 INDUSTRY
  72 GEOGRAPHY
  76 INTERNATIONAL ORGANISATIONS

The advantage of EUROVOC is that there are subcategories that can be
used for classifying datasets that do not fall into an obvious category.
For example: under what category should a hospital quality dataset fall?
Searching for 'Hospital' on the EUROVOC site returns a result on
health policies under "MT 2841 HEALTH", so according to EUROVOC this
falls under "28 SOCIAL QUESTIONS". Still not perfect, but hey, no
categorization system will be.
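
To make the hospital example concrete, here is a minimal sketch in
Python. It assumes that the first two digits of a EUROVOC microthesaurus
(MT) code identify its top-level category, as in the 2841 -> 28 case
above. The dictionary and the function name are hypothetical and only
for illustration; they are not part of any EUROVOC or CKAN API.

```python
# Top-level EUROVOC categories as listed earlier in this message.
EUROVOC_DOMAINS = {
    "04": "POLITICS",
    "08": "INTERNATIONAL RELATIONS",
    "10": "EUROPEAN COMMUNITIES",
    "12": "LAW",
    "16": "ECONOMICS",
    "20": "TRADE",
    "24": "FINANCE",
    "28": "SOCIAL QUESTIONS",
    "32": "EDUCATION AND COMMUNICATIONS",
    "36": "SCIENCE",
    "40": "BUSINESS AND COMPETITION",
    "44": "EMPLOYMENT AND WORKING CONDITIONS",
    "48": "TRANSPORT",
    "52": "ENVIRONMENT",
    "56": "AGRICULTURE, FORESTRY AND FISHERIES",
    "60": "AGRI-FOODSTUFFS",
    "64": "PRODUCTION, TECHNOLOGY AND RESEARCH",
    "66": "ENERGY",
    "68": "INDUSTRY",
    "72": "GEOGRAPHY",
    "76": "INTERNATIONAL ORGANISATIONS",
}


def domain_for_microthesaurus(mt_code):
    """Return (category code, label) for a 4-digit MT code.

    Assumption: the first two digits of the MT code are the
    top-level category code (e.g. 2841 -> 28).
    """
    code = mt_code.strip()
    if len(code) != 4 or not code.isdigit():
        raise ValueError(f"expected a 4-digit MT code, got {mt_code!r}")
    domain = code[:2]
    if domain not in EUROVOC_DOMAINS:
        raise KeyError(f"unknown EUROVOC category {domain!r}")
    return domain, EUROVOC_DOMAINS[domain]


if __name__ == "__main__":
    # The hospital quality dataset was matched to MT 2841 HEALTH.
    print(domain_for_microthesaurus("2841"))
```

A portal could store the more specific MT code with each dataset and
derive the top-level topic from it, so that datasets without an obvious
category still end up in one of the 21 topics.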

I'd be interested to hear your take on this.

Best regards,
 
Edo Plantinga - Data.overheid.nl


