[open-bibliography] Place of Publication data from the BL dataset

Karen Coyle kcoyle at kcoyle.net
Fri Nov 26 08:03:03 UTC 2010


Ben, Edward Betts did some interesting work on disambiguating (or not,
as is often the case) publishers using ISBN prefixes. Obviously,
that only helps when you have an ISBN. He also did some stats on
country of publication. You can find various links here:

http://edwardbetts.com/ol/

The one relating ISBN codes and publisher names is:

http://edwardbetts.com/ol/isbn_publisher_codes

I find these kinds of studies absolutely fascinating, in part because  
they point out how ambiguous our library data really is, often because  
it's a funny world, publishing is.
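
To make the ISBN-prefix idea concrete, here is a rough sketch of my own (not Edward's code): group the publisher name strings found in records by the group + registrant part of the ISBN and see how many different names share one prefix. It assumes the ISBNs are already hyphenated, since the registrant element has variable length and cannot be cut out of a bare digit string without the ISBN range tables. The function names and the sample records are made up for illustration, and the check digits are not verified.

```python
from collections import defaultdict


def registrant_prefix(isbn: str) -> str:
    """Return the group + registrant elements of a hyphenated ISBN.

    ISBN-10 has four elements, ISBN-13 has five (a leading EAN prefix).
    A 978 prefix is dropped so ISBN-10 and ISBN-13 forms group together;
    any other EAN prefix (e.g. 979) is kept because its groups differ.
    """
    parts = isbn.strip().split("-")
    ean = ""
    if len(parts) == 5:
        ean, parts = parts[0], parts[1:]
    if len(parts) != 4:
        raise ValueError(f"not a hyphenated ISBN: {isbn!r}")
    prefix = f"{parts[0]}-{parts[1]}"
    return prefix if ean in ("", "978") else f"{ean}-{prefix}"


def names_by_prefix(records):
    """Map each registrant prefix to the set of publisher names seen with it."""
    groups = defaultdict(set)
    for isbn, publisher in records:
        groups[registrant_prefix(isbn)].add(publisher)
    return dict(groups)


# Illustrative records only; the name variants are invented for the example.
records = [
    ("0-19-852663-6", "Oxford University Press"),
    ("978-0-19-280722-2", "OUP"),
    ("0-19-921361-5", "Clarendon Press"),
    ("0-521-43108-5", "Cambridge University Press"),
]

for prefix, names in sorted(names_by_prefix(records).items()):
    print(prefix, sorted(names))
```

A prefix with several names is a hint that the names may belong to one publisher (or to imprints of it), not proof; that is the "or not" part.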

kc

Quoting Ben O'Steen <bosteen at gmail.com>:

> (And as Karen has just pointed out, the reason why I am exploring this field
> is to aid disambiguation of publishers. Having created the overview that I
> know I need, I thought to share it here.)
> On 26 Nov 2010 07:27, "Karen Coyle" <kcoyle at kcoyle.net> wrote:
>> Quoting William Waites <ww at eris.okfn.org>:
>>
>>
>>>
>>> However, suppose you look at a book and figure out in whatever way
>>> that it was published in Cambridge, Ontario (for argument's sake),
>>> what is necessary to hook that on the records is ultimately a SPARQL
>>> query that looks like,
>>
>> Do the original records have the place of publication code from the
>> fixed field? That provides the country (and in some cases country +
>> state or province). It could be used to disambiguate place names in
>> the publisher area.
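
A minimal sketch of the fixed-field idea, assuming the records are MARC 21: the place of publication code sits in character positions 15-17 of the 008 field. The code table is a tiny subset of the MARC country codes, and the gazetteer, the sample 008 string and the function names are hypothetical, just to show the lookup.

```python
# Tiny hypothetical gazetteer: place name -> {MARC place code -> full label}.
# The codes are real MARC country codes: enk = England, onc = Ontario,
# mau = Massachusetts.
GAZETTEER = {
    "Cambridge": {
        "enk": "Cambridge, England",
        "onc": "Cambridge, Ontario",
        "mau": "Cambridge, Massachusetts",
    },
}


def place_code(field_008: str) -> str:
    """Return the place of publication code from a MARC 21 008 field.

    The code occupies positions 15-17 (zero-based); two-letter codes are
    padded with a blank, which is stripped here.
    """
    if len(field_008) < 18:
        raise ValueError("008 field is too short to hold a place code")
    return field_008[15:18].strip()


def resolve_place(name: str, field_008: str, gazetteer=GAZETTEER):
    """Pick the gazetteer entry for `name` that matches the 008 place code.

    Returns None when the name or the code is not in the gazetteer.
    """
    return gazetteer.get(name, {}).get(place_code(field_008))


# Made-up 40-character 008: date entered, date type, dates, place code "onc",
# 17 material-specific positions, language, modified flag, cataloguing source.
sample_008 = "101126" + "s" + "2010" + "    " + "onc" + " " * 17 + "eng" + " " + "d"

print(resolve_place("Cambridge", sample_008))
```

This only narrows the place down to country or state/province level, so two towns of the same name inside one province would still need another clue.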
>>
>> kc
>>
>>>
>>> INSERT INTO <book_uri>
>>> { ?place owl:sameAs <http://sws.geonames.org/5913695/> }
>>> WHERE
>>> { <book_uri> isbd:hasPlaceOfPublicationProductionDistribution ?place }
>>>
>>> Or even better, loop over all books with the same publisher and place
>>> name label and perform the same operation.
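
The loop described above could be sketched roughly as below: generate one update per book and keep the generated strings as the provenance record. This is only a sketch under assumptions. It writes the query in SPARQL 1.1 Update form (WITH ... INSERT ... WHERE) instead of the INSERT INTO form quoted above, it follows the quoted query in using the book URI as both the graph name and the subject, and the isbd namespace URI, the example.org book URIs and the function name are placeholders, not taken from the dataset.

```python
OWL_NS = "http://www.w3.org/2002/07/owl#"
# Assumed namespace for the isbd: prefix; adjust to whatever the dataset uses.
ISBD_NS = "http://iflastandards.info/ns/isbd/elements/"


def ground_place_update(book_uri: str, place_uri: str, isbd_ns: str = ISBD_NS) -> str:
    """Build an update that links a book's place node to a known place URI.

    The update only adds an owl:sameAs statement; nothing is deleted.
    """
    for uri in (book_uri, place_uri, isbd_ns):
        # Refuse characters that would break out of the <...> delimiters.
        if any(ch in uri for ch in '<>" \n'):
            raise ValueError(f"unsafe URI: {uri!r}")
    return (
        f"PREFIX owl: <{OWL_NS}>\n"
        f"PREFIX isbd: <{isbd_ns}>\n"
        f"WITH <{book_uri}>\n"
        f"INSERT {{ ?place owl:sameAs <{place_uri}> }}\n"
        f"WHERE {{ <{book_uri}> "
        f"isbd:hasPlaceOfPublicationProductionDistribution ?place }}"
    )


# GeoNames URI from the quoted query; the book URIs are placeholders standing
# in for "all books with the same publisher and place name label".
CAMBRIDGE_ONTARIO = "http://sws.geonames.org/5913695/"
book_uris = ["http://example.org/book/1", "http://example.org/book/2"]

updates = [ground_place_update(uri, CAMBRIDGE_ONTARIO) for uri in book_uris]
print(updates[0])
```

Selecting the books in the first place (same publisher, same place label) would be a separate SELECT query; it is left out here because it depends on how publishers are modelled in the data.
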
>>>
>>> This way, using owl:sameAs like this to ground a blank node is a first
>>> step in disambiguation. It only adds a piece of information and
>>> doesn't remove anything. Once we are sure enough that this is correct,
>>> we can go and replace the blank node with the URI which is a more
>>> invasive operation because it involves deleting statements.
>>>
>>> If we can get there, with some sort of fun game that people can play
>>> that creates SPARQL queries like this, we can fix the data.
>>>
>>> The nice thing is, when we record provenance (changes) we can keep
>>> these queries around. They are much clearer and more
>>> understandable (especially if they become only slightly more elaborate
>>> than the one above) than the brute-force transaction journal
>>> (changeset) approach to provenance.
>>>
>>> Cheers,
>>> -w
>>> --
>>> William Waites
>>> http://eris.okfn.org/ww/foaf#i
>>> 9C7E F636 52F6 1004 E40A E565 98E3 BBF3 8320 7664
>>>
>>> _______________________________________________
>>> open-bibliography mailing list
>>> open-bibliography at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/open-bibliography
>>>
>>
>>
>>
>> --
>> Karen Coyle
>> kcoyle at kcoyle.net http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
>



-- 
Karen Coyle
kcoyle at kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet




