[open-bibliography] CUL dataset release

Edward Betts edward at archive.org
Mon Oct 18 22:40:14 UTC 2010


On 05/10/10 12:07, Mark MacGillivray wrote:
> We would like to bring your attention to a post on the JISC Open
> Bibliography project blog:
> 
> http://openbiblio.net/2010/10/05/jisc-openbibliography-cul-data-release/
> 
> In which we are very happy to announce the release of a dataset from
> Cambridge University Library under PDDL licence. This dataset contains
> approx. 180000 records of MARC data.
> 
> The details are on the post, including links to a CKAN package and the
> raw data (one file, zipped), and we hope to announce further
> developments in the near future.

I analysed the data and see 132130 records in total: 3 maps and 132127 books.

The maps look like this:

leader: 00817cem a2200181 a 4500
001 4405728
005 20070516082534.0
008 070515s1872    enk       a     0   eng d
100 1  $aCook, Robert J.,$cLithographer.
245 10 $aLondon railway travelling made easy$h[cartographic material] :$bshowing at a glance what station to go to, to get to any part of London or suburbs /$cby Robert J. Cook
250    $aFifth edition.
255    $aScale [1:31,680]
260    $aLondon :$bRobt .J. Cook & Hammond$c[1872?]
300    $a1 map :$b2 col. ;$c68 x 86 cm. folded to 21 x 12 cm.
500    $aScale statement reads: 2 inches to a mile.
500    $aShows railways and stations in London, with some information on routes and frequency of trains.
650  0 $aRailroads$zEngland$zLondon$vMaps.
948 1  $a20070515$bre261$cULTWR-h$dz

leader: 00643nem a2200157 a 4500
001 4493408
005 20071030095654.0
007 aj canzn
008 071030s1884    enk       a   r 0   eng d
245 00 $aPhilips' new map of Liverpool and its environs$h[cartographic material] :$bincluding Bootle, Walton, West Derby and Wavertree.
255    $aScale [1:10,560]
260    $aLondon ;$aLiverpool :$bGeorge Philip & Son,$c[1884?]
300    $a1 map :$bcol. ;$c59 x 93 cm. folded to 20 x 13 cm.
500    $aShows city of Liverpool and surrounding districts, with directory of churches.
651  0 $aLiverpool (England)$vMaps.
948 1  $a20071030$bsec63$cULTWR-h$dz

leader: 01062nem a2200229 a 4500
001 4557264
005 20080229095841.0
007 aj canzn
008 080229s1891    enk       a     1   eng d
100 1  $aBacon, G. W.$q(George Washington),$d1830-1921.
245 10 $aBacon's new map of London$h[cartographic material] :$bdivided into half mile squares and circles.
246 14 $aBacon's new shilling map of London and illustrated guide
246 30 $aNew map of London
255    $a[Scales vary]
260    $aLondon :$bG.W. Bacon & Co.,$c[1891?]
300    $a3 maps :$b3 col. ;$c17 cm. x 22 cm. and 69 x 87 cm. folded to 18 x 13 cm. +$eguide and index (64 p.)
500    $a64 p. booklet and a folded map in a cover.
500    $aScale statements read: 9 inches to a mile and Four inches to the mile.  Railway map is 3 miles to the inch.
500    $a"With a large scale map of the City, a railway map of the environs, street directory, cab fares, &c." -- cover.
504    $aIncludes indices.
651  0 $aLondon (England)$vMaps.
948 1  $a20080229$bmak38$cULTWR-h$dz
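
A count like the one above can be derived from the leader of each record. The following is only a minimal sketch, not necessarily how the figures above were produced. It assumes the file is binary MARC (ISO 2709) and uses leader position 06 (type of record) and 07 (bibliographic level); the names classify and count_types are made up for illustration.

```python
from collections import Counter

RECORD_TERMINATOR = b"\x1d"  # ISO 2709 end-of-record byte


def classify(leader: str) -> str:
    """Classify a record from MARC 21 leader positions 06 and 07."""
    record_type, bib_level = leader[6], leader[7]
    if record_type in "ef":  # cartographic material (printed or manuscript)
        return "map"
    if record_type in "at" and bib_level == "m":  # language material, monograph
        return "book"
    return "other"


def count_types(raw: bytes) -> Counter:
    """Count record classes in a buffer of concatenated ISO 2709 records."""
    counts = Counter()
    for chunk in raw.split(RECORD_TERMINATOR):
        if len(chunk) >= 24:  # skip the empty tail after the last terminator
            leader = chunk[:24].decode("ascii", "replace")
            counts[classify(leader)] += 1
    return counts
```

All three leaders shown above have 'e' in position 06, so classify() puts them in the map bucket. For real work a MARC library is a safer choice than splitting on the terminator byte by hand.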

Here are some more technical observations:

http://ckan.net/package/jiscopenbib-cul-1 links to http://openbiblio.net/2010/10/04/jisc-openbibliography-cul-data-release which gives a 404. The date in that path is 2010/10/04, while the announcement above uses 2010/10/05.

Also, wget fails on the http://storage.ckan.net/openbiblio/CUL_dataset1-20100705 URL, which redirects to an HTTPS URL on openbiblio.commondatastorage.googleapis.com.

It gives me this error:

ERROR: certificate common name `*.googleusercontent.com' doesn't match requested host name `openbiblio.commondatastorage.googleapis.com'.

The download works when wget is run with --no-check-certificate, which skips certificate verification.

Here is a full transcript:

$ wget  http://storage.ckan.net/openbiblio/CUL_dataset1-20100705
--2010-10-18 14:54:38--  http://storage.ckan.net/openbiblio/CUL_dataset1-20100705
Resolving storage.ckan.net... 88.198.21.211
Connecting to storage.ckan.net|88.198.21.211|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://openbiblio.commondatastorage.googleapis.com:443/CUL_dataset1-20100705?Signature=bbDZK76w6yo2mLuTxdMLkL7r4Y8%3D&Expires=1287438856&AWSAccessKeyId=GOOGC6OU3AYPNY47B66M [following]
--2010-10-18 14:54:40--  https://openbiblio.commondatastorage.googleapis.com/CUL_dataset1-20100705?Signature=bbDZK76w6yo2mLuTxdMLkL7r4Y8%3D&Expires=1287438856&AWSAccessKeyId=GOOGC6OU3AYPNY47B66M
Resolving openbiblio.commondatastorage.googleapis.com... 209.85.225.132
Connecting to openbiblio.commondatastorage.googleapis.com|209.85.225.132|:443... connected.
ERROR: certificate common name `*.googleusercontent.com' doesn't match requested host name `openbiblio.commondatastorage.googleapis.com'.
To connect to openbiblio.commondatastorage.googleapis.com insecurely, use `--no-check-certificate'.
$ 
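
The mismatch in the error can be illustrated with a simplified name check. This is a sketch of the general wildcard rule (a '*' standing for the leftmost label only), not wget's actual code; the function name hostname_matches is made up for illustration.

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Simplified certificate name check: '*' may replace the leftmost label only."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False  # a wildcard never spans more than one label
    if p_labels[0] != "*" and p_labels[0] != h_labels[0]:
        return False
    return p_labels[1:] == h_labels[1:]


cert_name = "*.googleusercontent.com"
requested = "openbiblio.commondatastorage.googleapis.com"
print(hostname_matches(cert_name, requested))  # False
```

The certificate name covers hosts directly under googleusercontent.com, while the redirect target is under commondatastorage.googleapis.com, so the check fails regardless of the wildcard.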

-- 
Edward.



