[ECODP-dev] links of the mime type
BARGIOTTI Leda (OP)
Leda.BARGIOTTI at publications.europa.eu
Fri Aug 23 11:55:05 UTC 2013
Thanks to both of you, this really helps
Kind regards
Leda
From: David Raznick [mailto:david.raznick at okfn.org]
Sent: Friday, August 23, 2013 1:53 PM
To: Bert Van Nuffelen
Cc: BARGIOTTI Leda (OP); ISOARD Olivier (OP); PASTOR CAMARASA José Juan (OP); OP ODP CONTACT; Darwin Peltan; Project list for EC ODP CKAN project
Subject: Re: links of the mime type
Hello,
Darwin is away today, so I will answer on his behalf.
On 22 August 2013 21:23, Bert Van Nuffelen <bert.van.nuffelen at tenforce.com> wrote:
Hi Leda,
Thanks for reviewing this. I've put Darwin in cc so that he can follow this thread.
Let's start with some context.
This JSON is the list of accepted values for the format and mime-type fields in CKAN.
Each entry in the json is of the form: <1> : [ <2>, <3>, <4> ] where
<1> = the value entered by the publisher
<2> = the unifying key (*)
<3> = the ( short ) label
<4> = the ( long ) label for description/tooltip purpose
(*) @Darwin, I am not sure of this.
Yes, <2> is the unifying key: it is the value that we want to be stored in CKAN.
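A minimal sketch of the lookup this entry structure implies, using only the entries quoted in this thread. The names RESOURCE_MAPPING and normalize_format are hypothetical and are not taken from ckanext-ecportal; the real extension may resolve values differently.

```python
# Hypothetical excerpt of the mapping: <1> : [ <2>, <3>, <4> ]
RESOURCE_MAPPING = {
    "text/html": ["text/html", "HTML", "Web Page"],
    "htm": ["text/html", "HTML", "Web Page"],
    "html": ["text/html", "HTML", "Web Page"],
    "http://purl.org/net/mediatypes/text/html": ["text/html", "HTML", "Web Page"],
}


def normalize_format(entered):
    """Return (unifying key, short label, long label) for a publisher-entered
    value, or None when the value is not in the mapping."""
    entry = RESOURCE_MAPPING.get(entered)
    if entry is None:
        return None
    key, short_label, long_label = entry
    return key, short_label, long_label
```

With this excerpt, both "htm" and "html" resolve to the same unifying key 'text/html', which is the value that would be stored.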
So to answer the second question: yes, there is a mixture, because the list is the union of the values serving 2 fields.
In addition, this 'mixture' is widened by the standard CKAN approach of accepting the ways humans typically refer to data formats.
For example, we talk amongst ourselves about html pages, so a human publisher using the web interface will typically enter 'html' instead of the standard technical notation 'text/html'.
In a machine-automated process such relaxation is typically not done.
W.r.t. relaxation:
The formats currently used in the EU ODP can be retrieved with this SPARQL query:
select distinct ?o where {?s <http://ec.europa.eu/open-data/ontologies/ec-odp#distributionFormat> ?o} limit 200
As you can see, about 70-80% use a technical MIME-type representation. (But even that does not guarantee one unique representation for the same data format: compare application/rdf+xml and rdf/xml.)
The others use variations of the human format notation.
From a process point of view (@Darwin, correct me if I am wrong here):
If the JSON were limited to exactly one line for the html page case:
"text/html": ["text/html", "HTML", "Web Page"],
only the value 'text/html' would be accepted in the CKAN database.
At the user interface level nothing would change. @Darwin, this is correct I believe?
@Darwin, which part of the json will be used for the RDF format representation: <1> or <2>?
We want the publishers to store <2>. The only reason there are repeated values is that publishers have added many historical formats that need to be mapped to the correct form. The mapping file is based on looking at the database and trying to map what was actually entered historically. In a perfect world, every value in column <1> would already be a value from column <2>, but in the current data that is not the case.
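As an illustration of why values repeat, the sketch below groups the entered values <1> by their unifying key <2> and lists the historical values that differ from the stored form. It uses only the entries quoted in this thread; SAMPLE_MAPPING, aliases_by_key and non_canonical_values are hypothetical names, not part of the actual code base.

```python
from collections import defaultdict

# Hypothetical excerpt of the mapping: <1> : [ <2>, <3>, <4> ]
SAMPLE_MAPPING = {
    "text/html": ["text/html", "HTML", "Web Page"],
    "htm": ["text/html", "HTML", "Web Page"],
    "html": ["text/html", "HTML", "Web Page"],
    "http://purl.org/net/mediatypes/text/html": ["text/html", "HTML", "Web Page"],
}


def aliases_by_key(mapping):
    """Group every entered value <1> under its unifying key <2>."""
    groups = defaultdict(list)
    for entered, (key, _short_label, _long_label) in mapping.items():
        groups[key].append(entered)
    return {key: sorted(values) for key, values in groups.items()}


def non_canonical_values(mapping):
    """List entered values that are not already equal to their unifying key."""
    return sorted(entered for entered, entry in mapping.items() if entered != entry[0])
```

In the "perfect world" case described above, non_canonical_values would return an empty list, because every <1> would equal its <2>.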
We have also added another file:
https://github.com/okfn/ckanext-ecportal/blob/next/data/resource_dropdown.json
This is what will end up in the front-end form for publishers to select from. This file is much cleaner and maps the value in <2> to a human-readable format.
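The layout of resource_dropdown.json is not shown in this thread, so the following is only a sketch of the idea: collapsing the mapping to one choice per unifying key <2>, paired with its short label <3>. The flat key-to-label dictionary and the names SAMPLE_MAPPING and dropdown_choices are assumptions made for illustration.

```python
# Hypothetical excerpt of the mapping: <1> : [ <2>, <3>, <4> ]
SAMPLE_MAPPING = {
    "text/html": ["text/html", "HTML", "Web Page"],
    "htm": ["text/html", "HTML", "Web Page"],
    "html": ["text/html", "HTML", "Web Page"],
}


def dropdown_choices(mapping):
    """Return one {unifying key: short label} pair per distinct key <2>.

    Assumed output shape; the real dropdown file may be structured differently.
    """
    choices = {}
    for key, short_label, _long_label in mapping.values():
        # Keep the first label seen for each key; later duplicates are ignored.
        choices.setdefault(key, short_label)
    return choices
```

Under these assumptions the three entries above collapse to a single choice, so the publisher only ever sees and stores the unified value.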
Thanks
David
I hope this clarifies things a bit.
best regards,
Bert
PS: I am out of the office until Monday evening, so any further follow-up from my side will come from Tuesday on.
2013/8/22 BARGIOTTI Leda (OP) <Leda.BARGIOTTI at publications.europa.eu>
Hi Bert,
Could you please explain this list?
1. More specifically, could you please tell us what each element means? For example:
"text/html": ["text/html", "HTML", "Web Page"]
• "text/html":: ?
• "text/html":?
• "HTML":?
• "Web Page":?
2. Secondly, it seems that this list is a mix of formats and other things, such as "application/sparql-query".
3. Thirdly, how shall we interpret elements that are listed more than once? e.g.:
"text/html": ["text/html", "HTML", "Web Page"],
"htm": ["text/html", "HTML", "Web Page"],
"html": ["text/html", "HTML", "Web Page"],
"http://purl.org/net/mediatypes/text/html": ["text/html", "HTML", "Web Page"],
4. Fourthly, where does this list come from exactly? Is it a SPARQL query of the ODP?
If our objective is to have a list of file types to be used both in RDF and as a drop-down list in the UI, we need to clearly understand which element is the code and which is the label, and whether the same file type is listed more than once. Maybe you have already discussed this with the other members of the team, but I would really appreciate it if you could help shed some light on this; otherwise I will not be able to come up with a good list.
Thank you in advance
Kind regards
Leda
From: Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
Sent: Thursday, August 22, 2013 4:29 PM
To: PASTOR CAMARASA José Juan (OP); BARGIOTTI Leda (OP)
Cc: ISOARD Olivier (OP)
Subject: links of the mime type
Hi José,
You can find the list of MIME types online at https://github.com/okfn/ckanext-ecportal/blob/resource_formats/data/resource_mapping.json.
I have attached a downloaded copy to this mail.
Bert
--
Bert Van Nuffelen
Semantic Technologies Software Architect at TenForce
www.tenforce.be
Bert.Van.Nuffelen at tenforce.com
Office: +32 (0)16 31 48 60
Mobile: +32 479 06 24 26
skype: bert.van.nuffelen