[pdb-discuss] Questions Questions Questions.. ooh and hello :)

Ian Ibbotson ian.ibbotson at k-int.com
Thu Mar 15 12:20:12 UTC 2007


<Unlurk>

Hey All, apologies for busting in on the discussion.... I've been
following your work for a little while now, and I'm really interested in
what's going on, and the data (well, OK, especially the data). I'd love
to contribute, but alas it's hard to teach an old dog new tricks and I'm
really a Java bod rather than a Perl or Python person.

I really don't want to tread on any toes or anything, but I've been
mucking around and got some Java code that does some of what you're
discussing here. If you want to keep the community focused on one
implementation, I'll just look after my code and keep it as a
plaything, but if it's more the case that okfn/pdb would like to "Let a
thousand flowers bloom" (as it were) then I'd be more than happy to
share what I've tentatively tagged jpdb (a Java version of the pdb).
Right now, I'm just parsing the composers file and buggering about with
the web services. But before long I should be able to make the thing
searchable via SRW/SRU.

What I really wanted to ask about, though, was the composers file. I've
had a dig around and I can't actually see a concrete definition for it.
Right now I'm using something like this as a spec:

ComposerNameEntry ::= [Title] Name DOB DOD
Title ::= String
Name ::= NameComponents AdditionalNameSpec
NameComponents ::= NameComponent +
NameComponent ::= String ( / AlternateSpellingString )+
AdditionalNameSpec ::= ([&]ps: Name [,Name]+ )
DOB ::= * Date
DOD ::= + Date
Date ::= Year [Month [Day]] [(Date Comment)]

Does that pretty much tie up with the general understanding? I wasn't
sure what the difference between ps: and &ps: is... any thoughts?
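
For what it's worth, here is a minimal sketch of how the date part of
that spec could be read. It's in Python only because that's what the
existing pdw parser uses. The function names are made up for
illustration, and the comma-separated layout is inferred from the sample
lines in Rufus's commit message below, not from any official definition
of the file.

```python
import re

# Hypothetical helpers: the field layout is an assumption taken from the
# sample entries in the r29 commit message, not from a formal spec.
DATE_RE = re.compile(
    r"^(?P<year>\d{4})"                 # Year
    r"(?:\s+(?P<month>[A-Za-z]{3})"     # optional Month, e.g. "Dec"
    r"(?:\s+(?P<day>\d{1,2}))?)?"       # optional Day
    r"(?:\s*\((?P<comment>[^)]*)\))?$"  # optional (Date Comment)
)

def parse_date(text):
    """Parse 'Year [Month [Day]] [(comment)]'; return None if it does not match."""
    m = DATE_RE.match(text.strip())
    if m is None:
        return None
    d = m.groupdict()
    return {
        "year": int(d["year"]),
        "month": d["month"],
        "day": int(d["day"]) if d["day"] else None,
        "comment": d["comment"],
    }

def parse_entry(line):
    """Split an entry into its name part, birth (* Date) and death (+ Date)."""
    entry = {"name": [], "birth": None, "death": None}
    for field in (f.strip() for f in line.split(",")):
        if field.startswith("*"):
            entry["birth"] = parse_date(field[1:])
        elif field.startswith("+"):
            entry["death"] = parse_date(field[1:])
        else:
            entry["name"].append(field)  # titles and aliases stay unparsed here
    entry["name"] = ", ".join(entry["name"])
    return entry
```

Calling parse_entry("(Karl) Rudolf FRIML, * 1879 Dec 7, + 1972 Nov 12")
gives a name of "(Karl) Rudolf FRIML" with birth year 1879 and death
year 1972. Ambiguous dates such as "1879 or 1884 Dec 7" come back as
None so they can be flagged for a human to look at. A date comment that
itself contains a comma would break the naive split, so this is a
starting point only.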

Cheers for your time.

Ian. 


On Sun, 2007-03-11 at 14:05 +0000, Rufus Pollock wrote:
> Nathan Lewis wrote:
> > 
> > Hi Rufus,
> > 
> > Have you done any of these things so far?
> 
> Yes.
> 
> 1. Parser: I have just (re-)written a parser for the composer data in 
> Python:
> 
> <http://project.knowledgeforge.net/pdw/svn/trunk/src/pdw/parse_composer_data.py>
> <http://project.knowledgeforge.net/pdw/svn/trunk/src/pdw/parse_composer_data_test.py>
> 
> (I know we already had your Perl parseComp.pl but my Perl skills are 
> non-existent and the new code (I think) adds some extra functionality 
> like trying to standardize the date format, extracting aliases wherever 
> possible etc. Full details are in the commit messages, which I've included 
> at the end of the email.)
> 
> 2. MusicBrainz interface: comments below
> 
> > I have had a play with the MusicBrainz client but I found that the 
> > deprecated RDF-based Perl binding does not appear to work very well and 
> > the new REST-based Perl binding is very incomplete. How are the Python 
> > bindings?
> 
> Python bindings seem to work fine. As I posted a couple of weeks back I 
> got something simple working which I put in subversion:
> 
> http://project.knowledgeforge.net/pdw/svn/trunk/src/pdw/mb.py
> 
> Demoed (very simply) in:
> 
> http://project.knowledgeforge.net/pdw/svn/trunk/src/pdw/mb_test.py
> 
> > If neither is up to the task then we will have to write code to use the 
> > REST API directly, which wouldn't be too difficult, but we will still need 
> > some tricky heuristics to work out which artist to go with when more 
> > than one matches the name.
> 
> As you said, we will need some heuristics but I do not think it will be 
> that hard ...
> 
> > I think the MusicBrainz cross-referencing might be too much to get 
> > working in one week.
> 
> Sure but we can try.
> 
> Regards,
> 
> Rufus
> 
> ## Log Message
> 
> ### data/composers.txt:
> 
> r29 | rgrp | 2007-03-11 13:50:18 +0000 (Sun, 11 Mar 2007) | 35 lines
> 
> Make various changes to source composer data file in order to make it 
> easier to parse (and to fix some bugs in the data). More work is needed 
> but this is a start. Justification of the changes provided below.
> 
> 1. Rudolf FRIML:
> 
> -(Karl) Rudolf FRIML, * 1879 or 1884 Dec 7, + 1972 Nov 12
> +(Karl) Rudolf FRIML, * 1879 Dec 7, + 1972 Nov 12
> 
> Checking on the internet (Wikipedia and elsewhere) indicated that 1879 
> was the correct year of birth.
> 
> 2. Chris PATTON:
> 
> -Chris(=Christopher W) PATTON, * @57, + 2006 Apr 25
> +Chris(=Christopher W) PATTON, * 1957, + 2006 Apr 25
> 
> Assume @57 is a simple typo (internet searching did not give any indicators)
> 
> 3. Tennyson JESSE:
> 
> -Fryniwyd(=Wynifried Margaret?) Tennyson JESSE, Mrs HARWOOD, * 1888 or 1889, + 1958 Aug 6
> +Fryniwyd(=Wynifried Margaret?) Tennyson JESSE, Mrs HARWOOD, * 1888, + 1958 Aug 6
> 
> Internet research (e.g. 
> http://www.classiccrimefiction.com/f-tennyson-jesse.htm) indicates 1888 
> as correct year of birth.
> 
> 4. Eduardo SANCHEZ De FUENTES y PELAEZ
> 
> -Eduardo SANCHEZ De FUENTES y PELAEZ, * 1876 Apr 3, * 1944 Sep 7, + ?
> +Eduardo SANCHEZ De FUENTES y PELAEZ, * 1876 Apr 3, + 1944 Sep 7
> 
> Original entry looks like a typo (and brief bit of internet browsing 
> seemed to suggest he had lived in early 20th century).
> 
> 5. Various others (see diff): fix aliases so the parser works
> 
> The irregular structure has already caused quite a few problems 
> (sometimes aliases are separated by commas, sometimes they are bracketed, 
> etc etc). Here I have just fixed cases like RUSSELL(-BROWN) by removing 
> the bracketed section and creating a new alias with the stuff in the 
> brackets unbracketed.
> 
> 
> ### src/pdw/parse_composer_data.py
> 
> r30 | rgrp | 2007-03-11 13:52:17 +0000 (Sun, 11 Mar 2007) | 9 lines
> 
> Code to parse composers.txt data file on composer birth and death dates 
> into a usable form.
> * trunk/src/pdw/parse_composer_data.py,
>    trunk/src/pdw/parse_composer_data_test.py:
>    Create a ComposerFileParser class which parses each line in the 
> composers.txt file and returns a dictionary containing (among others):
>      * last name
>      * first name
>      * birth date
>      * death date
> 




