[okfn-labs] How to harvest scans from BL?
Justin York
justincyork at gmail.com
Fri Dec 20 21:22:05 UTC 2013
They use Sanddragon <http://sanddragon.bl.uk/> which appears to be the BL's
own take on Microsoft's Deep Zoom. The code is open source so you should be
able to figure out how the API works for serving up images as Enric
described. Chances are there's a full resolution image in there somewhere
too.
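
As a starting point, here is a rough sketch of how the tile URLs quoted below could be decoded. It assumes, based only on the "IIIFImageService" path and the shape of those URLs, that each one ends in region/size/rotation/quality.format, with the region given as x,y,width,height in full-resolution pixels and the size as a percentage. The names Tile, parse_tile and paste_position are made up for illustration; none of this is taken from the Sanddragon source.

```python
import re
from typing import NamedTuple, Tuple


class Tile(NamedTuple):
    """One tile request, in full-resolution pixel coordinates."""
    x: int
    y: int
    w: int
    h: int
    pct: float  # scale at which the server returns the region


# Assumed URL tail: /x,y,w,h/pct:N/rotation/native.jpg
TILE_RE = re.compile(r"/(\d+),(\d+),(\d+),(\d+)/pct:([\d.]+)/\d+/native\.jpg$")


def parse_tile(url: str) -> Tile:
    """Extract the region and scale from a tile URL."""
    m = TILE_RE.search(url)
    if m is None:
        raise ValueError("not a recognised tile URL: " + url)
    x, y, w, h = (int(g) for g in m.groups()[:4])
    return Tile(x, y, w, h, float(m.group(5)))


def paste_position(tile: Tile) -> Tuple[int, int]:
    """Top-left corner of the tile on a canvas drawn at the tile's scale."""
    scale = tile.pct / 100
    return round(tile.x * scale), round(tile.y * scale)


url = ("http://access.bl.uk/IIIFImageService/ark:/81055/"
       "vdc_000000011B9A.0x00000A/1024,2048,1028,1028/pct:25/0/native.jpg")
tile = parse_tile(url)
print(tile, paste_position(tile))
```

In the quoted URLs the offsets step by 1024 while the regions are 1028 wide, so neighbouring tiles appear to overlap by a few pixels. Pasting each tile at its computed position with an image library should cover the seams, though I have not tried it against the live service.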
On Fri, Dec 20, 2013 at 2:00 PM, Enric Garcia Torrents
<enricgarcia at uoc.edu> wrote:
>
> From what I can see, the Item Viewer image is made up of a collection of
> JPEGs. If I am not mistaken, they have cut the pages into small sections.
> The problem is not so much scraping those sections as putting the pieces
> back together to reconstruct each page. As an example, here are several
> pieces of the 10th page of the book from your link:
>
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/0,2048,1028,1028/pct:25/0/native.jpg
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/1024,2048,1028,1028/pct:25/0/native.jpg
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/1024,0,1028,1028/pct:25/0/native.jpg
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/1024,1024,1028,1028/pct:25/0/native.jpg
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/0,0,4112,4112/pct:6.25/0/native.jpg
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/0,0,2056,2056/pct:12.5/0/native.jpg
>
> http://access.bl.uk/IIIFImageService/ark:/81055/vdc_000000011B9A.0x00000A/0,2048,2056,2056/pct:12.5/0/native.jpg
>
> They seem to be using a coordinate system to name the folders, so that
> their item viewer can put them back together. It would take a little
> patience to figure out their system; their algorithm would then need to
> be replicated.
>
>
> Best regards,
>
> Enric G. Torrents
> Email: e.g.cn at ieee.org
> Tel.: +8613122141470
> Skype: torrents.enric
> <http://cn.linkedin.com/in/enrictorrents/>
> cn.linkedin.com/in/enrictorrents/
>
>
> --- Original message from Lars Aronsson <lars at aronsson.se> to
> okfn-labs at lists.okfn.org, sent on 20.12.2013 21:34
>
> The British Library has scanned some books that they
> let you download as PDFs. However, the PDFs are in
> lower resolution than the scans that are displayed
> in their online 'item viewer'.
>
> Has anyone been able to harvest or scrape the full
> resolution images from the item viewer?
>
> Here is one such book, http://explore.bl.uk/primo_library/libweb/action/search.do?vl%28freeText0%29=000507311&fn=search
>
> Under "2 related resources" (red link at the right),
> you will find two items where "I want this" for the
> second one gives you a PDF or an 'item viewer'.
> (Tell me if there is a short URL for this.)
>
> Here's a sample from the PDF, http://runeberg.org/elfsyssel/cow-pdf.png
>
> The same sample from the item viewer, http://runeberg.org/elfsyssel/cow-viewer.png
>
>
> --
> Lars Aronsson (lars at aronsson.se)
> Project Runeberg - free Nordic literature - http://runeberg.org/
>
>
> _______________________________________________
> okfn-labs mailing list
> okfn-labs at lists.okfn.org
> https://lists.okfn.org/mailman/listinfo/okfn-labs
> Unsubscribe: https://lists.okfn.org/mailman/options/okfn-labs
>
>