[openbiblio-dev] Plotting timelines of the 'birth' of IUCr articles on a map

Peter Murray-Rust pm286 at cam.ac.uk
Mon Jan 10 12:44:39 UTC 2011


On Mon, Jan 10, 2011 at 12:23 PM, William Waites <ww at eris.okfn.org> wrote:

> Hi Ben, I'm copying Richard Pope on this mail

[He wasn't obviously copied so I have replied with his address]


> because I'm
> not sure if he's on this list -- he's looking at some UI
> stuff for bibliographica and your js examples doing sparql
> mashup things might prove useful to him.
>
Both Ben and Richard have done good proof-of-concept mashups of
bibliographic data against geo/time. These will be very valuable to show
to a wider audience on January 17th, including several JISC people (we hope).



> Another question is, can we get at a dump of this data to
> see about including it in the bibliographica store itself?
>

There is potentially a lot more data than this. This is a smallish subset of
scientific biblio and we can and should think about millions of records.

>
> And lastly, have you managed to do anything about getting
> a nquads dump out of the store for the BNB data? We need
> this to provide to some services to stop them crawling the
> whole thing and as well to provide to FU-Berlin and DERI
> for passing through Silk to infer linkage to other resources.
>

Ben and I have to present Open Biblio on Jan 17th, so this sort of
downstream use is really valuable. I'll be working with Ben to create a
poster on Open Bibliography tomorrow morning, and this sounds like an
important thing to be able to put in. We should be able to think of this as
a node in LOD.

>
> Cheers,
> -w
>
> * [2011-01-10 02:47:22 +0000] Ben O'Steen <bosteen at gmail.com> wrote:
>
> ] IUCr data visualisation
> ]
> ] http://benosteen.com/timemap/index
> ]
> ] All the authors in the IUCr dataset have a rough address associated with
> ] them, and with a bit of tweaking and adjusting I've been able to get
> ] some semblance of matches for their lat,long locations.
> ]
> ] Of the 3774 unique address lines in the set, I've found something for
> ] 2796 of them - I'm sure with a few more passes across the data, we can
> ] improve that, but that proportion should be enough to start with.
> ]
> ] To visualise this, I'm using a handy bit of js that binds google maps
> ] and simile timeline - http://code.google.com/p/timemap/ To be specific,
> ] the functionality I'm using is the one demo'd in
> ] http://timemap.googlecode.com/svn/trunk/examples/progressive.html which
> ] is the progressive, on-demand loading of data from a date range.
> ]
> ] The data is sparql'd via a fast and loose SELECT and this forms the
> ] basis of the data sent to the js app. The address lookups are held in a
> ] redis db (addr-md5 hash -> string holding the lat long and type of
> ] match). Using the Redis pipeline feature makes it straightforward and
> ] responsive to look up a number of these in one go and return them.
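
A minimal sketch of the batched lookup described above, not Ben's actual
code. Assumptions: the key is the MD5 hex digest of the whitespace-stripped
address line, and the stored string is laid out as "lat,long,type". The
names address_key, lookup_addresses, FakeRedis and FakePipeline are
hypothetical. The fake client only stands in for a server so the sketch
runs anywhere; a redis-py client offers the same pipeline(), get() and
execute() calls.

```python
import hashlib


def address_key(address):
    """Return the MD5 hex digest of an address line (assumed key format)."""
    return hashlib.md5(address.strip().encode("utf-8")).hexdigest()


def lookup_addresses(client, addresses):
    """Fetch cached geocodes for many addresses in one round trip.

    `client` is anything with a redis-py style pipeline(): one GET is
    queued per address and execute() returns the replies in order.
    """
    pipe = client.pipeline()
    for address in addresses:
        pipe.get(address_key(address))
    results = {}
    for address, raw in zip(addresses, pipe.execute()):
        if raw is None:
            continue  # no geocode cached for this address
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")  # a real client returns bytes by default
        # Assumed value layout: "lat,long,type"
        lat, lon, match_type = raw.split(",", 2)
        results[address] = (float(lat), float(lon), match_type)
    return results


class FakePipeline:
    """In-memory stand-in for a Redis pipeline (illustration only)."""

    def __init__(self, store):
        self._store = store
        self._keys = []

    def get(self, key):
        self._keys.append(key)  # queue the command, nothing is read yet
        return self

    def execute(self):
        replies = [self._store.get(key) for key in self._keys]
        self._keys = []
        return replies


class FakeRedis:
    """In-memory stand-in for a Redis client (illustration only)."""

    def __init__(self):
        self.store = {}

    def set(self, key, value):
        self.store[key] = value

    def pipeline(self):
        return FakePipeline(self.store)
```

Addresses with no cached geocode are simply left out of the result, which
matches the situation described above where only part of the set has been
matched so far.
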
> ]
> ] NB the SPARQL is:
> ]
> ] PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> ] PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> ]
> ] SELECT DISTINCT ?c ?name ?address ?doi ?title ?date WHERE {
> ]  ?doi <http://purl.org/dc/elements/1.1/title> ?title .
> ]  ?doi <http://purl.org/dc/elements/1.1/date> ?date .
> ]  ?doi <http://purl.org/dc/terms/creator> ?c .
> ]  ?c <http://xmlns.com/foaf/0.1/name> ?name .
> ]  ?c <http://open.vocab.org/terms/recordedAddress> ?address .
> ]  FILTER(?date > "%s" && ?date < "%s")
> ] } LIMIT 400
> ]
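
The two %s placeholders suggest the query is filled in with Python string
formatting. A rough sketch of how that could be done while checking the
inputs; build_query and QUERY_TEMPLATE are hypothetical names, and it is
an assumption that dates in the store are ISO-formatted strings
(YYYY-MM-DD), so that the string comparison in the FILTER orders them
correctly.

```python
from datetime import date

# The query from the message above, kept as a template with two slots.
QUERY_TEMPLATE = """\
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?c ?name ?address ?doi ?title ?date WHERE {
  ?doi <http://purl.org/dc/elements/1.1/title> ?title .
  ?doi <http://purl.org/dc/elements/1.1/date> ?date .
  ?doi <http://purl.org/dc/terms/creator> ?c .
  ?c <http://xmlns.com/foaf/0.1/name> ?name .
  ?c <http://open.vocab.org/terms/recordedAddress> ?address .
  FILTER(?date > "%s" && ?date < "%s")
} LIMIT 400"""


def build_query(start, end):
    """Fill the date range into the template.

    Accepting only datetime.date objects keeps arbitrary text from
    being pasted into the query string.
    """
    if not isinstance(start, date) or not isinstance(end, date):
        raise TypeError("start and end must be datetime.date objects")
    if start >= end:
        raise ValueError("start must be earlier than end")
    return QUERY_TEMPLATE % (start.isoformat(), end.isoformat())
```

For example, build_query(date(2010, 1, 1), date(2010, 2, 1)) yields a
query whose FILTER covers dates strictly between the two bounds.
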
> ] As SPARQL is by far the slowest step, we could optimise by
> ] pre-generating the monthly sparql output and caching it, no
> ] problem, but I think this is more generic as it stands.
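
One possible shape for the monthly caching idea, as a sketch only. The
names month_bounds and make_monthly_cache are hypothetical, and run_query
stands for whatever function actually sends the SPARQL and returns rows.
Because the FILTER in the query uses strict comparisons, a caller would
still need to decide how records dated exactly on a month boundary are
handled.

```python
from datetime import date
from functools import lru_cache


def month_bounds(year, month):
    """Return (first day of the month, first day of the next month)."""
    start = date(year, month, 1)
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def make_monthly_cache(run_query):
    """Wrap run_query(start, end) so each calendar month is fetched once."""

    @lru_cache(maxsize=None)
    def rows_for_month(year, month):
        start, end = month_bounds(year, month)
        # Store a tuple so the cached value cannot be mutated by callers.
        return tuple(run_query(start, end))

    return rows_for_month
```

The cache here lives in process memory; writing each month's output to
disk or to the redis db would serve the same purpose across restarts.
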
> ]
> ] I have put the scripts, geocoded address cache and the Pylons controller
> ] that provides the backend service into
> ] https://github.com/benosteen/IUCR-Geocoding - the only thing left out is
> ] the sparql command I used to pull all the addresses from the endpoint,
> ] but I think that's simple enough to ignore for now!
> ]
> ] It would be fantastic to add colour to the pegs in the map, each colour
> ] connected to a single paper in a given month - the pegs otherwise show
> ] authors and it is hard to see how spread the authorship of a given paper
> ] is.
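
A sketch of how the per-paper colouring could be prepared on the server
side before the rows are sent to the js app. It assumes each row is a dict
with a "doi" field; PALETTE and colour_markers are hypothetical names, and
how the colour is then applied to a peg in timemap is not covered here.

```python
# Eight visually distinct colours; an arbitrary choice for illustration.
PALETTE = ["#e41a1c", "#377eb8", "#4daf4a", "#984ea3",
           "#ff7f00", "#a65628", "#f781bf", "#999999"]


def colour_markers(rows):
    """Return copies of the author rows with a 'colour' field per paper.

    Colours are handed out in order of first appearance, so all authors
    of one paper share a colour. Colours repeat once there are more
    papers than palette entries.
    """
    colours = {}
    coloured = []
    for row in rows:
        doi = row["doi"]
        if doi not in colours:
            colours[doi] = PALETTE[len(colours) % len(PALETTE)]
        coloured.append(dict(row, colour=colours[doi]))
    return coloured
```

Run over one month's rows at a time, this keeps the colours within that
month as distinct as the palette allows.
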
> ]
> ] Ben
> ]
> ]
> ] _______________________________________________
> ] openbiblio-dev mailing list
> ] openbiblio-dev at lists.okfn.org
> ] http://lists.okfn.org/mailman/listinfo/openbiblio-dev
>
> --
> William Waites                <mailto:ww at styx.org>
> http://eris.okfn.org/ww/         <sip:ww at styx.org>
> 9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664
>



-- 
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069