[open-bibliography] Extra RDF datatypes

Christopher Gutteridge cjg at ecs.soton.ac.uk
Sat Nov 20 19:39:29 UTC 2010


I deliberately hosted it via purl.org so that if needed it can be taken 
on by a group to develop. I'm aware of all the potential niggles about 
versions but IMO this is over-engineering. I want a system which can 
tell me if something is plain text (not marked up at all), or some 
flavor of HTML.

Beware the siren call of The Modeller.
http://blogs.ecs.soton.ac.uk/webteam/files/2010/10/173285998-246x300.png

In xtypes you can specify something as a subclass if you really must, e.g.
http://purl.org/xtypes/Fragment-LaTeX-2.1.23.ourSpecialVariation

But I think trying to formalise URIs for not only every version of these 
fragments, but every controlled subset... That way lies madness.
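
To make the "plain text or some flavor of HTML" test concrete, here is a minimal sketch of how a consumer might collapse any variant URI back to its base type. It assumes that a variant is simply the base name followed by "-" and a suffix, as in the LaTeX example above; that convention, the function names and the list of base types (only the three mentioned in this thread) are illustrative assumptions, not something the xtypes namespace is known to guarantee.

```python
XTYPES_NS = "http://purl.org/xtypes/"

# Base fragment types mentioned in this thread; the namespace may define others.
BASE_TYPES = ("Fragment-PlainText", "Fragment-HTML", "Fragment-LaTeX")


def base_fragment_type(datatype_uri):
    """Return the base fragment type for an xtypes datatype URI, or None.

    Hypothetical helper: relies on the assumed convention that a variant
    is named '<base>-<suffix>'.
    """
    if not datatype_uri.startswith(XTYPES_NS):
        return None
    local = datatype_uri[len(XTYPES_NS):]
    for base in BASE_TYPES:
        # Accept the base itself, or the base followed by "-" and a suffix.
        if local == base or local.startswith(base + "-"):
            return base
    return None


def is_plain_text(datatype_uri):
    """True only when the datatype says the literal carries no markup."""
    return base_fragment_type(datatype_uri) == "Fragment-PlainText"
```

With this, a display layer only has to branch on three base types and can ignore whatever suffix a publisher has bolted on.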

Jim Pitman wrote:
> Christopher Gutteridge <cjg at ecs.soton.ac.uk> wrote:
>
>   
>> An ongoing issue for us in expressing bibliographic data in RDF is that we often have fragments of text which contain 
>> markup mixed with ones that don't. HTML & LaTeX fragments are the most 
>> common.
>>
>> http://purl.org/xtypes/
>>
>> defines new datatypes to express this, eg.
>>
>> <dc:description rdf:datatype='http://purl.org/xtypes/Fragment-HTML'>Hello <b>World</b></dc:description>
>> <dc:description rdf:datatype='http://purl.org/xtypes/Fragment-PlainText'>Hello World</dc:description>
>>
>> PlainText explicitly indicates an absence of markup, other than 
>> CR, LF, Tab and others which are part of the character set.
>> I've been waiting years for a solution to this one, so I've given up and done it myself.
>>     
>
> Me too. Many thanks for this initiative!
> I have the same problem with BibJSON http://www.bibkn.org/bibjson/index.html
> and would be glad to coordinate on this. In JSON I've never got much better than
> tagging keys with a format indicator e.g.  "title_html",  "title_tex", ...
> This is not very JSONish, but it is at least easily encoded in BibTeX which is an advantage.
> My concern with an RDF implementation is that it may quickly get very heavy. Note that with any
> format indication, there are nuances of meaning which may need to be specified, e.g. it is useful
> to know for display purposes if HTML is limited to just a few tags, and there are different versions of TeX, LaTeX, ...
> Having a pointer to a rigorous format description that will allow consumers of the data to process it without
> too much burden of checking format variations would be very useful.
>
> --Jim
>
> ----------------------------------------------
> Jim Pitman
> Director, Bibliographic Knowledge Network Project
> http://www.bibkn.org/
>
> Professor of Statistics and Mathematics
> University of California
> 367 Evans Hall # 3860
> Berkeley, CA 94720-3860
>
> ph: 510-642-9970  fax: 510-642-7892
> e-mail: pitman at stat.berkeley.edu
> URL: http://www.stat.berkeley.edu/users/pitman
>
> _______________________________________________
> open-bibliography mailing list
> open-bibliography at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/open-bibliography
>   

-- 
Christopher Gutteridge -- http://id.ecs.soton.ac.uk/person/1248

You should read the ECS Web Team blog: http://blogs.ecs.soton.ac.uk/webteam/




