[openbiblio-dev] invalid rdf in the BL dataset?

Mo McRoberts Mo.McRoberts at bbc.co.uk
Fri Feb 18 13:31:40 UTC 2011

[I don't know if BL metadata folk lurk on the list, so I've CC'd]

Apologies in advance if this steps on anybody's toes!

Dan Sheppard wrote:

> Obviously, it would also be good to get this fixed in the original data
> set, so none of this should detract from your excellent effort to find
> someone who can fix it!

The data has actually been fixed -- though it's easy to forgive not knowing that. For convenience, there's a more up-to-date copy at:


However, that host (as its name suggests) isn't really geared to hosting things on any sort of useful scale or longevity, so it should most definitely not be considered a permanent home! Feel free to grab it for immediate needs, though.

This does highlight a need to adjust processes, however. There's something of a tension between releasing the open data and regularly updating it on the one hand, and requiring people to request a username and password to download it from an FTP site on the other. As everybody's very much interested in data preservation, and the data itself is released under a friendly license, it's no surprise that it's been mirrored. It would be difficult to argue that this is a bad thing, all things considered!

In this tweet yesterday:


...it was noted:

"We aim to produce a new version of our RDF/XML data on a monthly basis. Contact metadata at bl.uk if you'd like the latest updated file."

Given that the data is being updated on a regular basis, and that people will mirror it (for convenience if nothing else), it's important, I think, to work out a process which allows both to happen and satisfies all interests. Obviously, I don't speak for anybody but myself, but I would presume that the British Library would rather it was up-to-date data doing the rounds than outdated -- and possibly flawed -- versions!

BL folk -- would it be possible to host a copy of the latest ZIP (which would include, as it does now, the overview PDF) somewhere reachable via unauthenticated HTTP GET, rather than FTP? That way, you can be sure that the package at:


...always points to the latest release (and this seems to be the principal location that people are looking at for this data). Perhaps something could be worked out between metadata at bl and Mark MacGillivray (as current maintainer of the package on CKAN)? It'd be good to see this sorted out relatively swiftly, and I think that's probably in the interests of everyone involved.
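
To illustrate why a plain HTTP GET would help mirrors stay current, here is a minimal sketch of a conditional download, assuming the ZIP were published at a stable URL. The URL below is purely hypothetical (no such location is given in this message), as are the function names; it also assumes the server would send ETag or Last-Modified headers, which is not guaranteed.

```python
import urllib.error
import urllib.request

# Hypothetical URL: the real location of the ZIP is not given in this message.
LATEST_ZIP_URL = "http://example.org/bl/latest.zip"


def build_request(url, etag=None, last_modified=None):
    """Build a plain GET request, made conditional if validators are known."""
    req = urllib.request.Request(url, method="GET")
    if etag:
        req.add_header("If-None-Match", etag)
    if last_modified:
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_changed(url, dest, etag=None, last_modified=None):
    """Download url to dest unless the server answers 304 Not Modified.

    Returns (changed, etag, last_modified) so that the caller can store the
    validators and pass them back in on the next run.
    """
    req = build_request(url, etag, last_modified)
    try:
        # urlopen raises on 304 before dest is opened, so an existing
        # local copy is left untouched when nothing has changed.
        with urllib.request.urlopen(req) as resp, open(dest, "wb") as out:
            while True:
                chunk = resp.read(64 * 1024)
                if not chunk:
                    break
                out.write(chunk)
            return True, resp.headers.get("ETag"), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return False, etag, last_modified
        raise
```

A mirror could run something like fetch_if_changed(LATEST_ZIP_URL, "latest.zip", etag, last_modified) from a monthly cron job, keeping the returned validators between runs, so the file is only transferred when a new release has actually been published.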

All the best,


Mo McRoberts - Data Analyst - Digital Public Space
Zone 1.08, BBC Scotland, 40 Pacific Quay, Glasgow G51 1DA
Room 7066, BBC Television Centre, London W12 7RJ
0141 422 6036 (Internal: 01-26036)

