[open-science] [Open-access] OKF at Open Repositories 2014

Peter Murray-Rust pm286 at cam.ac.uk
Thu Dec 5 15:27:47 UTC 2013


On Thu, Dec 5, 2013 at 2:48 PM, Rafael Pezzi <rafael.pezzi at ufrgs.br> wrote:

>  On 05-12-2013 06:18, Peter Murray-Rust wrote:
>
>  My personal interests would include:
>
>  * problems of legacy data formats
>  * adding structure to semi-structured data
>  * indexing, particularly domain-specific information
>
>
> Peter,
>
> I would emphasize the need for open data formats, and also for open tools
> to work with these data.
>

Open data formats are not in doubt. But the absence of tools should not
hold us back. Much of this will be name-value pairs - even if the name is
not in an ontology it's useful. Thus:

species="Erithacus rubecula"

may not resolve against an RDF triple store but it's a lot better than zero.
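
As a minimal sketch of how far bare name-value pairs can get you without any
ontology (Python; the regular expression, the function names and the sample
notes are illustrative assumptions, not an existing tool):

```python
import re
from collections import defaultdict

# Matches name="value" pairs such as species="Erithacus rubecula".
PAIR = re.compile(r'([A-Za-z_][\w.-]*)\s*=\s*"([^"]*)"')

def extract_pairs(text):
    """Return every (name, value) pair found in free text."""
    return PAIR.findall(text)

def build_index(documents):
    """Map each (name, value) pair to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for name, value in extract_pairs(text):
            index[(name, value)].add(doc_id)
    return index

# Hypothetical sample documents, invented for illustration only.
docs = {
    "note-1": 'Seen in the garden: species="Erithacus rubecula" count="2"',
    "note-2": 'species="Erithacus rubecula" was ringed; site="Cambridge"',
}
index = build_index(docs)
print(sorted(index[("species", "Erithacus rubecula")]))  # ['note-1', 'note-2']
```

The name "species" resolves against nothing here, yet both notes are already
findable by it, which is the point.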


>
> The repository will be of little or no value at all if the data is encoded
> in an obscure format whose corresponding software is unavailable, off the
> market, or prohibitively expensive.
>

You misread me:
"problems of legacy data formats" did not mean I want to translate
something useful into an obscure, binary, commercial, legally protected DRM
format. There are enough chemical software companies that do that. I want to
translate obscure binary protected formats into Unicode-compliant ASCII,
where possible indexed against known formal ontologies such as Chemical
Markup Language.

Improving the semantics of data will encourage people to build tools. They
are mutually dependent but we have to start somewhere.


> Here I point to the Science Code Manifesto<http://sciencecodemanifesto.org/>,
> which, I believe, must go hand in hand with any data repository.
>
> Furthermore, reproducible research also needs reproducible instruments and
> experiments, so I am particularly interested in the design of scientific
> instrumentation, including CAD drawings, schematics, firmware and
> specifications, which must be accessible through open repositories as well.
>
>
I'd strongly support this - but it's a lot of design and a lot of
implementation. And we are often starting from scanned page images or
broken PDFs. Anything that takes those forward is IMO valuable.



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069