[ckan-discuss] Topic list (vocabulary/valuelist) for classifying datasets
p.romain at cg33.fr
Wed Jun 13 11:34:05 BST 2012
Hi Edo,
I totally agree with you. Eurovoc is a more suitable vocabulary, but what
matters is the alignment these vocabularies make possible.
For example, the term gezondheidszorg
http://www.eionet.europa.eu/gemet/concept?cp=3866&langcode=nl&ns=1 has
an exact match in Eurovoc and Agrovoc, and a close match in DBpedia.
No one today would use the Eurovoc 72 GEOGRAPHY category to describe an
open data dataset that includes geolocated data. The issue we are facing,
in my opinion, is the number of thematic entries we offer to the end user:
too many and we lose them, too few and we might confuse them.
As a response to this issue, we could implement a select dropdown that
offers the end user a subset of the vocabulary, while still letting them
describe their dataset with the top-level terms of this vocabulary.
That is the approach we are currently taking in our work on CKAN. We could
probably share some code on this, couldn't we?
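To make the dropdown idea concrete, here is a minimal sketch, not our actual CKAN code. It assumes a flat list of terms where each narrower term points to its top-level parent; the names Term and dropdown_options are hypothetical, and the choice of subset is purely illustrative.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class Term:
    code: str                      # identifier within the vocabulary
    label: str                     # text shown to the end user
    parent: Optional[str] = None   # code of the top-level term; None if top-level


def dropdown_options(terms: List[Term], subset: Set[str]) -> List[Tuple[str, str]]:
    """Return (value, label) pairs for a select dropdown.

    Every top-level term is always offered; a narrower term is offered
    only if its code is in the curated subset.
    """
    options = []
    for top in (t for t in terms if t.parent is None):
        options.append((top.code, top.label))
        for child in terms:
            if child.parent == top.code and child.code in subset:
                # Indent narrower terms so the hierarchy stays visible.
                options.append((child.code, "  " + child.label))
    return options


# Illustrative data only: the codes and labels are the EUROVOC examples
# quoted in this thread; the curated subset is a hypothetical choice.
TERMS = [
    Term("28", "SOCIAL QUESTIONS"),
    Term("2841", "HEALTH", parent="28"),
    Term("72", "GEOGRAPHY"),
]

options = dropdown_options(TERMS, subset={"2841"})
```

With an empty subset the user sees only the top-level terms, so the size of the list can be tuned without touching the vocabulary itself.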
Best,
Pascal Romain
IT Project Manager, Document Systems
Service Projets Etudes Conseils
Direction des Systèmes d'Information
05 56 99 33 33 ext. 6643
@datalocale
From: "Plantinga, Edo" <Edo.Plantinga at koop.wmrijk.nl>
To: <ckan-discuss at lists.okfn.org>
Date: 13/06/2012 12:15
Subject: Re: [ckan-discuss] Topic list (vocabulary/valuelist) for
classifying datasets
Sent by: ckan-discuss-bounces at lists.okfn.org
Hi Pascal,
Thank you for your swift response.
The GEMET vocabulary seems to be limited to geographic datasets.
Naturally, many open datasets are geographic datasets.
However, there are also many datasets outside this domain. How would you
classify the datasets on the EU portal that are now under the current
categories Finance and Budgeting, Education and Communication, Economy
and Industry, Social Questions, Population & Health? The GEMET vocabulary
does not seem to be suitable for this.
A subset of EUROVOC, while still not perfect for the datasets we have,
seems to be a better fit for datasets outside the geo domain, in my
opinion. I can see a fairly big overlap between the current EU portal
topics and the EUROVOC topics, by the way.
These are the EUROVOC categories I am referring to, to avoid confusion.
04 POLITICS
08 INTERNATIONAL RELATIONS
10 EUROPEAN COMMUNITIES
12 LAW
16 ECONOMICS
20 TRADE
24 FINANCE
28 SOCIAL QUESTIONS
32 EDUCATION AND COMMUNICATIONS
36 SCIENCE
40 BUSINESS AND COMPETITION
44 EMPLOYMENT AND WORKING CONDITIONS
48 TRANSPORT
52 ENVIRONMENT
56 AGRICULTURE, FORESTRY AND FISHERIES
60 AGRI-FOODSTUFFS
64 PRODUCTION, TECHNOLOGY AND RESEARCH
66 ENERGY
68 INDUSTRY
72 GEOGRAPHY
76 INTERNATIONAL ORGANISATIONS
The advantage of EUROVOC is that there are subcategories that can be
used for classifying datasets that do not fall into an obvious category.
For example: under what category should a hospital quality dataset fall?
Searching for 'Hospital' on the EUROVOC site gives a search result for
health policies under "MT 2841 HEALTH", so according to EUROVOC this falls
under "28 SOCIAL QUESTIONS". Still not perfect, but hey, no
categorization system will be.
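The step from a subcategory to its top-level category could be automated along these lines. This is only a sketch: it assumes that the first two digits of a four-digit microthesaurus (MT) code identify the top-level category, which is consistent with the "2841" to "28" example above. The function name is hypothetical and the table is the category list from this message.

```python
# Top-level EUROVOC categories as listed in this message.
EUROVOC_DOMAINS = {
    "04": "POLITICS",
    "08": "INTERNATIONAL RELATIONS",
    "10": "EUROPEAN COMMUNITIES",
    "12": "LAW",
    "16": "ECONOMICS",
    "20": "TRADE",
    "24": "FINANCE",
    "28": "SOCIAL QUESTIONS",
    "32": "EDUCATION AND COMMUNICATIONS",
    "36": "SCIENCE",
    "40": "BUSINESS AND COMPETITION",
    "44": "EMPLOYMENT AND WORKING CONDITIONS",
    "48": "TRANSPORT",
    "52": "ENVIRONMENT",
    "56": "AGRICULTURE, FORESTRY AND FISHERIES",
    "60": "AGRI-FOODSTUFFS",
    "64": "PRODUCTION, TECHNOLOGY AND RESEARCH",
    "66": "ENERGY",
    "68": "INDUSTRY",
    "72": "GEOGRAPHY",
    "76": "INTERNATIONAL ORGANISATIONS",
}


def domain_for_microthesaurus(mt_code: str) -> str:
    """Map a four-digit MT code to its top-level category.

    Assumption: the first two digits of the MT code are the category code.
    """
    code = mt_code.strip()
    if len(code) != 4 or not code.isdigit():
        raise ValueError("expected a four-digit MT code, got %r" % mt_code)
    prefix = code[:2]
    if prefix not in EUROVOC_DOMAINS:
        raise ValueError("no top-level category for prefix %r" % prefix)
    return "%s %s" % (prefix, EUROVOC_DOMAINS[prefix])


hospital_domain = domain_for_microthesaurus("2841")
```

A dataset tagged with any subcategory could then be shown under its top-level category on the portal without asking the publisher to choose both.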
I'd be interested to hear your take on this.
Best regards,
Edo Plantinga - Data.overheid.nl