[openbiblio-dev] What to do with the MARC data?

Karen Coyle kcoyle at kcoyle.net
Sun Feb 20 14:59:57 UTC 2011


Dan, I don't think a lossless transform exists. However, I've been
working on an analysis of MARC21 that could produce a less lossy
transform. The analysis of the 00x data elements is complete.

For the general analysis, see:
   http://futurelib.pbworks.com/w/page/29114548/MARC-elements

You can find a link to the 00x's in RDF on this page:
   http://futurelib.pbworks.com/w/page/36289829/Resulting-Data

One of the main aspects of this is that each URI links back (logically)
to the actual fixed field value:

   http://marc21.info/vocab/007map03/007map03a
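
For instance, a record whose 007/03 carries the value "a" could point
straight at that concept. Here is a rough sketch in Python with rdflib;
the record URI and the property name are placeholders made up for
illustration, not part of the vocabulary:

   from rdflib import Graph, Namespace, URIRef

   # fixed-field vocabulary namespace (007, map, position 03)
   MAP03 = Namespace("http://marc21.info/vocab/007map03/")
   # hypothetical namespace for the record and property
   EX = Namespace("http://example.org/marc/")

   g = Graph()
   record = URIRef("http://example.org/records/12345")  # hypothetical
   # the 007/03 value "a" becomes a link to the concept URI
   g.add((record, EX["field007_03"], MAP03["007map03a"]))

   print(g.serialize(format="turtle"))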

I've done the analysis of the 0XX variable fields, but they haven't  
been RDF-ized yet. I don't think it makes sense to define those with  
SKOS, so I need some help thinking that one through. I'll put a  
display version up on that Resulting-Data page.
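
Just to make the question concrete, here is one possible shape for a
0XX field occurrence (indicators plus subfields) as a plain structured
node rather than a SKOS concept. This is only a strawman to react to,
and all of the property names are invented:

   from rdflib import Graph, Namespace, BNode, Literal

   EX = Namespace("http://example.org/marc/")  # invented namespace

   g = Graph()
   record = BNode()
   field = BNode()  # one occurrence of an 050 field
   g.add((record, EX.hasField, field))
   g.add((field, EX.tag, Literal("050")))
   g.add((field, EX.indicator1, Literal("0")))
   g.add((field, EX.indicator2, Literal("0")))
   g.add((field, EX.subfield_a, Literal("QA76.73.P98")))
   g.add((field, EX.subfield_b, Literal("C69 2011")))

   print(g.serialize(format="turtle"))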

kc

Quoting Dan Sheppard <dan.sheppard at caret.cam.ac.uk>:

> Dear all,
>
> We're looking at migrating a bucket-load of MARC to RDF, probably using
> the bibliographia stuff, and we like the idea of putting it all into dc:,
> skos:, etc., as they have done themselves. This is a given, however...
>
> ...I quite like my transforms to be lossless, and there seems to be an
> opportunity to bundle all the MARC biblio stuff, alongside the more
> familiar stuff. I imagine it's really irritating to find that for your
> specialist application you're missing some subfield or indicator which has
> been dropped or merged. And there seem to be few disadvantages to chucking
> it alongside.
>
> One option (which we may well take, in addition) is to bundle MARC records
> themselves. However, it seems cruel to dump some poor guy who just needs
> to know his parchment from his papyrus, her globe from her atlas, back
> into the rickety world of MARC. It would also reduce arguments in
> tea-breaks over mappings.
>
> Transforms via MODS are natural, and give a low-resistance path, but they
> still aren't 1-to-1. We could hunt down some URL space for fields and
> subfields, but there seem to be issues: order is important [100 a then d
> then a then d needs to be captured, ideally via intermediate nodes (a d),
> (a d), ...], and the use of indicators is screwily all over the place
> (some are key-like, some value-like, some like nothing else in this
> world). The great thing about crosswalks is that there are so many to
> choose from. It all starts to look like a nightmare.
>
> So I was wondering, does anyone know of a good transform (ideally with
> code) which maps MARC Biblio losslessly to RDF such that the sawn woman
> can be reassembled, RDF -> MARC? (Not that this is necessary, but it is
> sufficient for all of the above).
>
> Someone must have done a good data-model analysis of MARC Biblio from a
> Comp Sci perspective regarding the above, and if they've operationalised
> it, that would be brilliant, too.
>
> If not, it looks like this marginal nice-to-have addition to the main
> stream would be sufficiently doom-ridden that we'll have to leave it out.
> The main advantage of it for me is stopping the "dumbing down" debate
> over the crosswalking, so it doesn't need end-user analysis so much as
> percentage-of-chatter analysis!
>
> Dan.
>



-- 
Karen Coyle
kcoyle at kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet




