[open-science] (Lack of) data sharing for fossil data

Peter Murray-Rust pm286 at cam.ac.uk
Fri Apr 15 14:05:33 UTC 2011

This looks like a good opportunity to discuss different ways of viewing the

On Fri, Apr 15, 2011 at 2:40 PM, Lance McKee <lmckee at opengeospatial.org> wrote:

> re:
>> The semantic problems you allude to are harder. PDF is an abomination in
>> science - it has no role. We're trying to develop scholarly HTML as the
>> right way to communicate science. It will need critical mass but we are as
>> always optimistic.
> PDF, no. HTML, no.
> XML, yes.

XML yes as well. FWIW I was one of the early developers of XML in 1997 and
ran the XML-DEV mailing list. I have developed (with very few assistants)
Chemical Markup Language over the last 17 years. So I support XML for
certain domains - chemistry, geoscience, and other physical sciences.

But there are very few absolutes in this area.

Other domains such as bioscience rely much more on words and less on
structure. So HTML is a natural way of communicating ideas. It's universal
and, with goodwill at both ends of the chain (author and reader), HTML can
manage most of what is necessary. The great thing about HTML is that any
device can render it and allow reuse. XML usually requires specialist tools.
So if I send you a CML file, the first thing that you are likely to do is
ask where you can get a toolchain (yes, it exists and there is a lot of my
blood in it). If you send me xml-encoded ISO standard metadata I will have
to ask for help.

> Every domain needs a metadata council.

Probably true. But some are top down (a political activity) and some are
bottom up (i.e. the discipline has no interest or insight into doing it).
Top-down works, often very slowly. Bottom-up often does not work, but when it
does, it moves rapidly.

> Ultimately, it's about profiles of the ISO metadata standards and web
> service interfaces for catalogs that enable the xml-encoded ISO standard
> metadata -- and links to the data (and data processing services) -- to be
> published, discovered, assessed, accessed and used. See, for example, the
> Marine Metadata Interoperability Project (http://marinemetadata.org/).
> Ultimately, this approach enables chaining of web services in models that
> draw on multiple remote resources.

This model works for geoscience. It does not work for bioscience (fossils,
trees). It does not work for chemistry. Each domain is a muddle of designed
semantics, evolved semantics, commercial forces, learned societies, etc.
Each domain will provide different ways of disseminating their data and

There is no single one-size-fits-all solution. For several decades there are
likely to be per-domain solutions with more or less funding and more or less
volunteer activity. In chemistry it's been a very hard struggle against the
forces of inertia and walled-garden commercialism (who benefit from keeping
the subject in the dark ages).


Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dept. of Chemistry
University of Cambridge