[openbiblio-dev] Plotting timelines of the 'birth' of IUCr articles on a map

Mark MacGillivray mark at odaesa.com
Mon Jan 10 02:59:30 UTC 2011


Hi Ben, great stuff, really nice. I am sure we can put it to good use!

mark



On Mon, Jan 10, 2011 at 2:47 AM, Ben O'Steen <bosteen at gmail.com> wrote:
> IUCr data visualisation
>
> http://benosteen.com/timemap/index
>
> All the authors in the IUCr dataset have a rough address associated with
> them, and with a bit of tweaking and adjusting I've been able to get
> some semblance of matches for their lat,long locations.
>
> Of the 3774 unique address lines in the set, I've found something for
> 2796 of them - I'm sure with a few more passes across the data, we can
> improve that, but that proportion should be enough to start with.
>
> To visualise this, I'm using a handy bit of js that binds Google Maps
> and SIMILE Timeline (http://code.google.com/p/timemap/). To be specific,
> the functionality I'm using is the one demo'd in
> http://timemap.googlecode.com/svn/trunk/examples/progressive.html which
> is the progressive, on-demand loading of data from a date range.
>
> The data is sparql'd via a fast and loose SELECT and this forms the
> basis of the data sent to the js app. The address lookups are within a
> redis db (addr-md5 hash -> string holding the lat long and type of
> match). Using the Redis pipeline feature makes it straightforward and
> responsive to look up a number of these in one go and return them.
>
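
A minimal sketch of the pipelined lookup described above. The key scheme (MD5 hex digest of the raw address line) and the function names are assumptions for illustration; the real scripts in the repository may differ. `client` is anything exposing a redis-py style `pipeline()` method.

```python
import hashlib


def address_key(address):
    """Hypothetical key scheme: MD5 hex digest of the raw address line."""
    return hashlib.md5(address.encode("utf-8")).hexdigest()


def lookup_addresses(client, addresses):
    """Fetch cached geocoding results for many addresses in one round trip.

    Returns {address: cached value, or None when nothing is stored}.
    """
    pipe = client.pipeline()
    for address in addresses:
        pipe.get(address_key(address))  # queued locally, not sent yet
    values = pipe.execute()             # all GETs go out together
    return dict(zip(addresses, values))
```

With redis-py this would be called as `lookup_addresses(redis.Redis(), addresses)`. The values come back as bytes unless the client is created with `decode_responses=True`.
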
> NB the SPARQL is:
>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>
> SELECT DISTINCT ?c ?name ?address ?doi ?title ?date WHERE {
>  ?doi <http://purl.org/dc/elements/1.1/title> ?title .
>  ?doi <http://purl.org/dc/elements/1.1/date> ?date .
>  ?doi <http://purl.org/dc/terms/creator> ?c .
>  ?c <http://xmlns.com/foaf/0.1/name> ?name .
>  ?c <http://open.vocab.org/terms/recordedAddress> ?address .
>  FILTER(?date > "%s" && ?date < "%s")
> } LIMIT 400
>
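
For anyone wanting to try the query: the two `%s` slots take the start and end of the date range. A rough sketch of filling them for one calendar month follows, assuming the dates in the store are ISO `YYYY-MM-DD` strings (not confirmed here). `month_window` and `build_query` are made-up names.

```python
import datetime

QUERY_TEMPLATE = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?c ?name ?address ?doi ?title ?date WHERE {
 ?doi <http://purl.org/dc/elements/1.1/title> ?title .
 ?doi <http://purl.org/dc/elements/1.1/date> ?date .
 ?doi <http://purl.org/dc/terms/creator> ?c .
 ?c <http://xmlns.com/foaf/0.1/name> ?name .
 ?c <http://open.vocab.org/terms/recordedAddress> ?address .
 FILTER(?date > "%s" && ?date < "%s")
} LIMIT 400"""


def month_window(year, month):
    """Return (first day of the month, first day of the next month)."""
    start = datetime.date(year, month, 1)
    if month == 12:
        end = datetime.date(year + 1, 1, 1)
    else:
        end = datetime.date(year, month + 1, 1)
    return start, end


def build_query(start, end):
    """Fill the two %s slots with ISO dates (assumed storage format)."""
    return QUERY_TEMPLATE % (start.isoformat(), end.isoformat())
```

Both bounds in the FILTER are exclusive, so with this window anything dated exactly on the first of the month is left out. Passing the last day of the previous month as the start would include it.
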
> As SPARQL is by far the slowest step, we could optimise by
> pre-generating the monthly SPARQL output and caching it, no problem,
> but I think the service is more generic as it stands.
>
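
The monthly caching idea could look something like the sketch below. It is only an in-memory illustration: `MonthlyCache` and the `fetch` callable are hypothetical, and a real version might well store the results in redis next to the address cache.

```python
class MonthlyCache:
    """Remember the query results for each (year, month) after first use."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable(year, month) -> results, e.g. a SPARQL call
        self._store = {}

    def get(self, year, month):
        key = (year, month)
        if key not in self._store:
            # Only hit the slow endpoint the first time a month is requested.
            self._store[key] = self._fetch(year, month)
        return self._store[key]
```

The trade-off is the one noted above: this only helps while requests line up with whole months, whereas the uncached service accepts any date range.
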
> I have put the scripts, geocoded address cache and the Pylons controller
> that provides the backend service into
> https://github.com/benosteen/IUCR-Geocoding - the only thing left out is
> the sparql command I used to pull all the addresses from the endpoint,
> but I think that's simple enough to ignore for now!
>
> It would be fantastic to add colour to the pegs in the map, each colour
> connected to a single paper in a given month - the pegs otherwise show
> authors and it is hard to see how widely spread the authorship of a
> given paper is.
>
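
One way the colouring could be done on the backend, sketched under assumptions: each row is a dict with a `doi` key, and the colour names are placeholders, not necessarily names the map library accepts.

```python
# Placeholder palette; the names are illustrative only.
PALETTE = ["red", "blue", "green", "orange", "purple", "yellow"]


def colour_by_paper(rows, palette=PALETTE):
    """Give every author row of the same paper (DOI) the same colour.

    Returns new dicts with a 'colour' key added; the input is not modified.
    """
    colours = {}
    coloured = []
    for row in rows:
        doi = row["doi"]
        if doi not in colours:
            # Next palette entry, wrapping round when the palette runs out.
            colours[doi] = palette[len(colours) % len(palette)]
        coloured.append(dict(row, colour=colours[doi]))
    return coloured
```

Colours repeat once a month has more papers than palette entries, so with up to 400 rows per request a larger palette or generated colours would be needed for busy months.
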
> Ben
>
>
> _______________________________________________
> openbiblio-dev mailing list
> openbiblio-dev at lists.okfn.org
> http://lists.okfn.org/mailman/listinfo/openbiblio-dev
>