[open-bibliography] New BNB sample data available
Owen Stephens
owen at ostephens.com
Fri Feb 4 15:07:45 UTC 2011
Thanks for the clarification Corine - apologies for my misreading of the
intention.
I'd just add my +1 to what Antoine said (all points I think)
Owen
On Fri, Feb 4, 2011 at 2:50 PM, Deliot, Corine <Corine.Deliot at bl.uk> wrote:
> Hi Antoine,
>
> Many thanks for that.
>
> Just to clarify (in case the folks at LC are wondering!), I was making a
> general point about the permanence of linked data sets. I'm not worried
> about id.loc.gov being put offline. [but you knew that really ;-)]
>
> Best wishes
>
> Corine
>
> -----Original Message-----
> From: Antoine Isaac [mailto:aisaac at few.vu.nl]
> Sent: 04 February 2011 14:36
> To: Deliot, Corine
> Cc: List for Working Group on Open Bibliographic Data; public-lld
> Subject: Re: [open-bibliography] New BNB sample data available
>
> Hello Corine,
>
> Re. 1 and 2, in fact your decision not to put the language tags is what
> saves you from the inconsistency Andrew has warned about. If you were
> using the same language tag as id.loc.gov, but a different literal (and
> adding one dot to a literal makes it an entirely different literal),
> then your data would be inconsistent with the id.loc.gov one.
>
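A minimal sketch of the point above, assuming literals are compared as (lexical form, language tag) pairs and that the rule at stake is SKOS's "no more than one prefLabel per language tag". The function name and the sample tuples are illustrative only; the label strings are taken from the fragment quoted further down, and the "tagged" BL variant is hypothetical.

```python
from collections import defaultdict

def preflabel_conflicts(labels):
    """Return the language tags that carry more than one distinct prefLabel.

    Each label is a (lexical_form, language_tag) pair; a tag of None
    stands for a plain literal without a language tag.
    """
    by_lang = defaultdict(set)
    for lexical, lang in labels:
        by_lang[lang].add(lexical)
    return {lang: sorted(forms) for lang, forms in by_lang.items() if len(forms) > 1}

# Label as served with xml:lang="en" (no terminal full stop).
LC_LABEL = ("Literary landmarks--England--London", "en")
# Hypothetical: the BL label (with the extra dot) carrying the same tag.
BL_TAGGED = ("Literary landmarks--England--London.", "en")
# The BL sample as published: same text, no language tag.
BL_UNTAGGED = ("Literary landmarks--England--London.", None)

tagged = preflabel_conflicts([LC_LABEL, BL_TAGGED])      # two "en" prefLabels
untagged = preflabel_conflicts([LC_LABEL, BL_UNTAGGED])  # nothing to report
```

With the tag, the two spellings collide under "en"; without it, the formal check passes, although a consumer still sees two different labels for one concept.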
> Now, on having a language tag or not, I see your issue, but personally
> I'm ok with originally Spanish labels being considered as English ones,
> if there's no English translation for them.
> Anyway, the core issue to me here is that this language tag dilemma also
> applies for LoC, which made the opposite choice. Ideally if you publish
> data on LC concepts, it should be compatible with what LC
> has--"compatible" in the formal but also informal way: whether there is
> an inconsistency or not, a data consumer may still be extremely puzzled
> why LC and BL can't agree on their concepts' prefLabels!
>
> Re. 3, getting data for indexing is a very valid concern. But it also
> could be done just before the indexing step, not in the data you
> publish. But well, you are perhaps in the best position to judge: as you
> have put it, this is about what you feel you should provide to your
> typical data consumers. Note, however, that putting the labels
> re-introduces the risk of being out of sync with a central repository,
> which you correctly identified in your first move.
>
> About the danger of a target source being put offline, that is also a
> valid point. But for id.loc.gov I wouldn't be so worried. In fact, BL
> starting to rely on it for its data would be a key motivation for LC not
> to put it offline :-)
>
>
> Re. your last question, I guess I can only repeat what I've written
> above. My gut feeling would be to replicate as little as possible:
> ideally, the URI should be the only thing present in your data! But if
> you have clear ideas about the amount of effort your data consumers
> would be willing to undergo, you should adapt your data to make their
> life easier.
> Note that the data consumers who'd be interested in such caching might
> be the ones interested in accessing large dumps of data at once. So the
> "true linked data version" (what you get when following your nose over
> HTTP) could include only the URIs, but a fit-for-purpose dump of your
> entire catalogue may include a bit more.
>
> Best,
>
> Antoine
>
>
>
> > Hi Antoine and all,
> >
> > Many thanks for the feedback and apologies for the length of this
> > email.
> >
> > In answer to the questions about
> >>> <dcterms:subject>
> >>>   <rdf:Description
> >>>       rdf:about="http://id.loc.gov/authorities/sh2008107012#concept">
> >>>     <skos:inScheme
> >>>         rdf:resource="http://id.loc.gov/authorities#conceptScheme" />
> >>>     <skos:prefLabel>Literary landmarks--England--London.</skos:prefLabel>
> >>>     <rdf:type
> >>>         rdf:resource="http://www.w3.org/2004/02/skos/core#Concept" />
> >>>   </rdf:Description>
> >>> </dcterms:subject>
> >
> > And
> >
> > 1. Why does the literal value contained in
> > <skos:prefLabel>Literary landmarks--England--London.</skos:prefLabel>
> > not exactly match the one served by LC at id.loc.gov for
> > http://id.loc.gov/authorities/sh2008107012#concept?
> >
> > The answer is that it should. We've matched the LCSH heading contained
> > in the bibliographic record to the LCSH heading in the authority file.
> > The issue is to do with punctuation (which is input at the end of the
> > heading in the bib record but is not part of the heading in the
> > authority file). We'll address this in the conversion - this is an
> > issue in the LCSH headings and, I believe, in other parts of our output.
> > [So no, we "are *not* essentially trying to say which of the SKOS
> > preflabels the BL prefers" as one post tried to second-guess]
> >
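One way the punctuation fix described above could look, as a sketch only: it assumes the conversion has the set of authority-file headings to hand, and it drops the terminal full stop only when doing so yields a known authority heading. The function name and the one-entry label set are hypothetical, not the BL's actual conversion.

```python
def match_authority_label(bib_heading, authority_labels):
    """Return the authority-file form of a bib-record heading, or None.

    The heading is tried as-is first, so a heading that legitimately
    ends in a full stop in the authority file is left alone.
    """
    heading = bib_heading.strip()
    if heading in authority_labels:
        return heading
    # Terminal punctuation is input in the bib record but is not part
    # of the heading in the authority file, so retry without it.
    if heading.endswith("."):
        trimmed = heading[:-1].rstrip()
        if trimmed in authority_labels:
            return trimmed
    return None

# Illustrative authority set containing only the heading from this thread.
AUTHORITY = {"Literary landmarks--England--London"}

matched = match_authority_label("Literary landmarks--England--London.", AUTHORITY)
unmatched = match_authority_label("Some heading not in the file.", AUTHORITY)
```

Checking against the authority set, rather than stripping every final dot, avoids damaging headings that end in an abbreviation.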
> > 2. Why does our output not include xml:lang="en" in <skos:prefLabel>?
> > This is because in some cases this xml:lang="en", whilst true to the
> > data served up by id.loc.gov, is actually not correct. For example, if
> > you look at <http://id.loc.gov/authorities/sh94003128#concept> for
> > Parque Nacional Torotoro (Bolivia), we have
> > <skos:prefLabel xml:lang="en">Parque Nacional Torotoro (Bolivia)</skos:prefLabel>
> >
> > i.e. the label is tagged as English instead of Spanish.
> >
> > I assume the reason for that is that there isn't the granularity in
> > MARC 21 - where these headings originate from - to code the language
> > of each data element. So when LC expresses LCSH in SKOS, they couldn't
> > specify and went for the language of the majority of the headings,
> > which is English.
> >
> > So we - ok, I ;-) thought we could do "without" the xml:lang attribute
> > since it wasn't "correct" in all cases. I didn't realise the
> > implications.
> >
> > 3. Why are we outputting both the literal value and the resource URI?
> > In a very first attempt, we'd only included the resource URI as you
> > suggest. There were concerns about the two being out of sync, e.g.
> > when an LCSH heading is updated. In fact, this is one of the uses of
> > those URIs - enabling easier updating of bibliographic data.
> >
> > But we got some advice to the contrary. Some linked data platforms
> > index the literal values to improve searching; it was also pointed out
> > that there may be a risk of the linked dataset we link to
> > "disappearing".
> >
> > There are other considerations: we are putting our data out for people
> > to use and re-use; and we are not too sure what they want to do with it
> > yet - so as you suggest, some of them may not want or be able to go and
> > fetch data from id.loc.gov or any other data sets we link to. A related
> > question is to do with the time and resources to produce these files.
> > At the moment, we are concentrating on the BNB but the intention is to
> > work on other data sets. We are currently working on two versions of
> > the file, a "non-URI" and a "with added-URI" version of the data and
> > ideally, it would be good to have only one version - the "with
> > added-URI" one - to maintain/produce if it meets the needs of all/most
> > people.
> >
> > Now it's my turn for a question ;-)
> >
> > In your feedback, you highlight the risk of "that your data is less
> > complete than the one of other services"[1] e.g., if you don't have
> > skos:broader that id.loc.gov has for LCSH concepts.
> >
> > So to take the example of LCSH at id.loc.gov, how much of the data
> > included there should I replicate in my instance data? Isn't the
> > <skos:prefLabel> and the resource URI sufficient? If you need other
> > info, like <skos:altLabel> or <skos:broader>, won't you be able to
> > fetch it via the resource URI?
> >
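For reference, the two pieces of information in question (the resource URI and the prefLabel) can be pulled out of the sample fragment with the Python standard library alone. This is a sketch under assumptions: the namespace declarations on the root element are added here because the fragment in the email omits them, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "dcterms": "http://purl.org/dc/terms/",
}

# The fragment from the BNB sample, with namespace declarations added.
SAMPLE = """\
<dcterms:subject xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <rdf:Description
      rdf:about="http://id.loc.gov/authorities/sh2008107012#concept">
    <skos:inScheme
        rdf:resource="http://id.loc.gov/authorities#conceptScheme" />
    <skos:prefLabel>Literary landmarks--England--London.</skos:prefLabel>
    <rdf:type
        rdf:resource="http://www.w3.org/2004/02/skos/core#Concept" />
  </rdf:Description>
</dcterms:subject>
"""

def subject_uri_and_label(xml_text):
    """Return (concept URI, prefLabel text) from a dcterms:subject fragment."""
    root = ET.fromstring(xml_text)
    desc = root.find("rdf:Description", NS)
    # Attributes are looked up by their fully qualified {namespace}name.
    uri = desc.get("{%s}about" % NS["rdf"])
    label = desc.findtext("skos:prefLabel", default=None, namespaces=NS)
    return uri, label

uri, label = subject_uri_and_label(SAMPLE)
```

A consumer holding only the URI would need a further HTTP request to obtain the label or anything else about the concept, which is the trade-off being weighed in this thread.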
> > That's it for now ;-)
> >
> > I would also like to say that from later today I shall be offline for
> > the next two weeks. I mention it so that people don't think we don't
> > want to engage if there is no post. I really appreciate feedback.
> >
> > Cheers
> >
> > Corine
>
--
Owen Stephens
Owen Stephens Consulting
Web: http://www.ostephens.com
Email: owen at ostephens.com