[ECODP-dev] links of the mime type

Bert Van Nuffelen bert.van.nuffelen at tenforce.com
Thu Aug 22 20:23:25 UTC 2013


Hi Leda,

thanks for reviewing this. I've put Darwin in cc so that he can follow this
track.

Let's start with some context.

This json is the list of accepted values for the format and mime-type
fields in CKAN.

Each entry in the json is of the form: <1> : [ <2>, <3>, <4> ] where
   <1> = the value entered by the publisher
   <2> = the unifying key (*)
   <3> = the ( short ) label
   <4> = the ( long ) label for description/tooltip purpose

 (*) @Darwin, I am not sure of this.
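
To make the entry structure concrete, here is a minimal Python sketch that reads a few entries of this form. The four entries are the text/html ones from Leda's mail; the helper names `describe` and `aliases_by_key` are made up for illustration, and reading <2> as the unifying key is the same unconfirmed assumption flagged above.

```python
import json
from collections import defaultdict

# A small excerpt in the shape of resource_mapping.json (the text/html
# entries quoted in this thread); the real file contains many more entries.
RAW = """
{
  "text/html": ["text/html", "HTML", "Web Page"],
  "htm": ["text/html", "HTML", "Web Page"],
  "html": ["text/html", "HTML", "Web Page"],
  "http://purl.org/net/mediatypes/text/html": ["text/html", "HTML", "Web Page"]
}
"""

mapping = json.loads(RAW)


def describe(entered):
    """Split one entry into its named parts (hypothetical helper)."""
    key, short_label, long_label = mapping[entered]
    return {"entered": entered, "key": key,
            "short_label": short_label, "long_label": long_label}


def aliases_by_key(mapping):
    """Group the publisher-entered values <1> under their unifying key <2>."""
    groups = defaultdict(list)
    for entered, (key, _short, _long) in mapping.items():
        groups[key].append(entered)
    return {key: sorted(values) for key, values in groups.items()}


print(describe("html"))
print(aliases_by_key(mapping))
```

Read this way, entries that appear "more than once" are different spellings <1> that all resolve to the same key <2>.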

So to answer the second question: yes, there is a mixture, because the list
is the union of the values serving two fields (format and mime-type).
In addition, this 'mixture' is widened by the standard CKAN approach of
accepting the typical human ways of naming data formats.
For example, we talk amongst ourselves about html pages, so a human
publisher using the web interface will typically enter 'html' instead of
the standard technical notation 'text/html'.
In an automated machine process such relaxation is typically not applied.

W.r.t. relaxation:

The formats currently in use in the EU ODP can be retrieved with this
SPARQL query:
select distinct ?o where {?s <http://ec.europa.eu/open-data/ontologies/ec-odp#distributionFormat> ?o}
limit 200

As you can see, about 70-80% use a technical mime-type representation. (But
even that does not guarantee one unique representation for the same data
format: compare application/rdf+xml and rdf/xml.)
The others use variations of the human format notation.
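
A rough way to reproduce that kind of estimate is to check each distinct value against the type/subtype shape of a mime-type. The Python sketch below does only that; the sample list is invented for illustration and is not the real query result, and the function names are hypothetical.

```python
import re

# Rough pattern for a "type/subtype" media type such as text/html or
# application/rdf+xml. It only checks the shape, not IANA registration.
MEDIA_TYPE = re.compile(
    r"^[a-z0-9][a-z0-9!#$&^_.+-]*/[a-z0-9][a-z0-9!#$&^_.+-]*$"
)


def looks_like_media_type(value):
    """True if the value is written in type/subtype notation."""
    return bool(MEDIA_TYPE.match(value.strip().lower()))


def share_of_media_types(values):
    """Fraction of values written in type/subtype notation."""
    if not values:
        return 0.0
    hits = sum(1 for v in values if looks_like_media_type(v))
    return hits / len(values)


# Hypothetical sample, NOT the real query result.
sample = ["text/html", "application/rdf+xml", "rdf/xml", "html", "Excel"]
print(share_of_media_types(sample))
```

Note that rdf/xml passes this shape check as well, which is exactly the caveat above: looking like a mime-type does not make a value the one canonical notation for its format.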

From a process point of view: -- @Darwin, correct me if I am wrong here --
If the json were limited to exactly one line for the html page case:
"text/html": ["text/html", "HTML", "Web Page"],

only the value 'text/html' would be accepted in the CKAN database.
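
As a sketch of that behaviour, assuming acceptance is simply an exact lookup of the entered value among the keys <1> (this is not confirmed in this thread, and `is_accepted` is a hypothetical name, not a CKAN function):

```python
def is_accepted(value, mapping):
    """Accept a value only if it appears as a key <1> in the mapping.

    Assumption: matching is exact; whether CKAN trims or lower-cases the
    input first is not stated in this thread.
    """
    return value in mapping


# The current situation: several spellings map to the same key.
full = {
    "text/html": ["text/html", "HTML", "Web Page"],
    "htm": ["text/html", "HTML", "Web Page"],
    "html": ["text/html", "HTML", "Web Page"],
}

# The json limited to exactly one line for the html page case.
restricted = {
    "text/html": ["text/html", "HTML", "Web Page"],
}

for value in ("text/html", "html"):
    print(value, is_accepted(value, full), is_accepted(value, restricted))
```

Under this assumption, 'html' passes with the full mapping but is rejected once the mapping is restricted to the single 'text/html' line.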

At the user interface level nothing would change. @Darwin, this is correct
I believe?

@Darwin, which part of the json will be used for the RDF format
representation: <1> or <2>?

I hope this clarifies things a bit.

best regards,

Bert

ps. I am out of office until Monday evening, so further follow-up from my
side on this will be done from Tuesday on.


2013/8/22 BARGIOTTI Leda (OP) <Leda.BARGIOTTI at publications.europa.eu>

> Hi Bert,
>
> Could you please explain this list?
>
> 1. More specifically, could you please tell us what each
> element means? For example:
>
> "text/html": ["text/html", "HTML", "Web Page"]
>
> · "text/html":: ?
> · "text/html":?
> · "HTML":?
> · "Web Page":?
>
> 2. Secondly, it seems that this list is a mix of formats and
> other things such as "application/sparql-query"
>
> 3. Thirdly, how shall we interpret elements that are listed
> more than once? e.g.:
>
> "text/html": ["text/html", "HTML", "Web Page"],
> "htm": ["text/html", "HTML", "Web Page"],
> "html": ["text/html", "HTML", "Web Page"],
> "http://purl.org/net/mediatypes/text/html": ["text/html", "HTML", "Web Page"],
>
> 4. Fourthly: where does this list come from exactly? Is it a
> SPARQL query of the ODP?
>
> If our objective is to have a list of file types to be used both in RDF
> and as a drop down list in the UI, we need to clearly understand which one
> is the code and which one is the label and if the same file type is listed
> more than once. Maybe you discussed this already with the other members of
> the team, but I would really appreciate it if you could help me shed some
> light on this, otherwise I will not be able to come up with a good list.
>
> Thank you in advance
>
> Kind regards
>
> Leda
>
> *From:* Bert Van Nuffelen [mailto:bert.van.nuffelen at tenforce.com]
> *Sent:* Thursday, August 22, 2013 4:29 PM
> *To:* PASTOR CAMARASA José Juan (OP); BARGIOTTI Leda (OP)
> *Cc:* ISOARD Olivier (OP)
> *Subject:* links of the mime type
>
> Hi José,
>
> you can find the list of mime-types online at:
> https://github.com/okfn/ckanext-ecportal/blob/resource_formats/data/resource_mapping.json
>
> I added a downloaded version to this mail.
>
> Bert
>
>
> --
> Bert Van Nuffelen
>
> Semantic Technologies Software Architect at TenForce
> www.tenforce.be
>
> Bert.Van.Nuffelen at tenforce.com
> Office: +32 (0)16 31 48 60
> Mobile:+32 479 06 24 26
> skype: bert.van.nuffelen
>



-- 
Bert Van Nuffelen

Semantic Technologies Software Architect at TenForce
www.tenforce.be

Bert.Van.Nuffelen at tenforce.com
Office: +32 (0)16 31 48 60
Mobile:+32 479 06 24 26
skype: bert.van.nuffelen