[okfn-discuss] open film metadata database

Rufus Pollock rufus.pollock at okfn.org
Thu Apr 13 13:30:45 UTC 2006

Saul Albert wrote:
> On Mon, Apr 10, 2006 at 05:32:26PM +0100, Rufus Pollock wrote:
>>Does anyone know whether there is an 'open' source of film metadata. 
>>There have been several projects for music metadata including freedb 
>>(GPL) and musicbrainz  (by-nc-sa) but I don't know of one for films/dvds 
>>(there's IMDB of course but that doesn't seem very open_[1][2][3]).
>>Not only would this kind of db be a useful set of open knowledge but 
>>with the increasing sales of DVDs and ever cheaper hard disk space 
>>people are surely going to start building up dvd collections on their 
>>computers in the way they build up their CD collection.
> Hi Rufus,
> I don't think there is such a thing, as far as I know - although there
> are some cataloguing systems for people's off-line content (books, dvds,
> cds etc) that nab that kind of information from screen scraping imdb or

imdb seem pretty decent about giving out their data:


And several people have built interfaces:


However it is clear that it is not open:

'Please refer to the  copyright/license information listed in each file 
for instructions on allowed usage. The data is NOT FREE  although it may 
be used for free in specific circumstances.'

> using amazon reseller APIs like http://opendb.iamvegan.net/ and of
> course, the now-more-or-less-defunct http://dlp.theps.net. When I was
> working on this, I talked to jo and other uo librarians about doing a
> kind of free isbn brokerage (several of these exist - http://isbndb.com/
> for example) based on that to fill in the gaps of isbndb:

openbiblio! Something I find really annoying for research work is that 
there is still no nice, open, available db of bibliographic info with a 
decent interface, so that when I want a reference my editor can chug 
off and get it. There are plenty of standards and standalone software 
solutions but no nice freedb-like db as far as I can see:


> The problem with the ISBN system is that it doesn't deal well with
> pre-isbn books (which only came about in the 70's), and it's kind of
> difficult dealing with different editions, regions, publishers etc.  etc.
> for what is essentially the same book. 

sounds perfect for a musicbrainz/freedb type solution? Particularly if 
we could hook up to a few OAI-PMH interfaces to get data from libraries. 
This ties heavily into the work we are doing on public domain burn:


> I don't think that DVDs necessarily have the same problem - but as far as
> I know, CD / music dbs and clients scan track listings and timings to
> correlate with track listings already on the database, then apply id3 tag
> metadata to encoded media. I have no idea whether such easy-to-search on
> metadata can be extracted from DVDs, and standard tag systems applied - I
> imagine so, but I've not heard of such a thing or done the research.

Interesting question. I imagine one current obstacle is that it is much 
harder to 'rip' DVDs -- the TPMs make it harder (and of course, just 
like ripping CDs, it is illegal even for personal use). However, given 
that the average DVD has ~20 scenes, I can't imagine that it would be 
harder to id than a CD.
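
A rough sketch of how that might look, assuming the scene (chapter) 
lengths can be read off the disc in whole seconds. It borrows the 
general shape of the freedb disc ID (a checksum over track start times, 
the total length and the track count) but it is not the exact freedb 
algorithm, and the names disc_id and chapter_seconds are made up for 
illustration:

```python
def digit_sum(n: int) -> int:
    """Return the sum of the decimal digits of a non-negative integer."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def disc_id(chapter_seconds: list[int]) -> str:
    """Build a freedb-style 8-hex-digit ID from chapter durations.

    Hypothetical helper: the layout (checksum byte, 16-bit total length,
    chapter-count byte) is an assumption modelled on CD disc IDs.
    """
    # Start offset of each chapter, measured from the start of the disc.
    offsets = []
    start = 0
    for length in chapter_seconds:
        offsets.append(start)
        start += length
    total_length = start

    # Checksum over the digit sums of the start offsets, kept to one byte.
    checksum = sum(digit_sum(offset) for offset in offsets) % 255

    value = (
        (checksum << 24)
        | ((total_length & 0xFFFF) << 8)
        | (len(chapter_seconds) & 0xFF)
    )
    return f"{value:08x}"


if __name__ == "__main__":
    # Three chapters of 300, 200 and 100 seconds.
    print(disc_id([300, 200, 100]))  # 08025803
```

An ID like this would only be a lookup key: two different discs could 
collide, so a real db would still need to compare the full list of 
chapter lengths before accepting a match.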

> Something like you're proposing does seem timely, given that sites like
> http://green.tv (an actual, real-life product of the espians!) are
> distributing video content and podcasts in multiple formats, then
> watching them on their ipods etc..  more people will probably start
> ripping and encoding their own dvds for playback on mobile devices.
> I have been talking to tav (leader of the espians) about his plans for
> the technology underlying green.tv (based on the 'protoplex') and a lot
> of nice ajaxy stuff... I think they're planning some really interesting
> developments along the lines of splicing and editing online media and
> providing people with tools for high quality, multiple-format video
> encoding and distribution in such a system.

This is very interesting. You probably already know about:


and associated projects, which are being run by (among others) Adnan 
Hadzi, who took part in nodel and is a member of Free Culture UK.

> I'm personally more interested in applying metadata to video / media
> content to make it searchable by timeline... so I think there's a lot of
> gaps in the video/dvd/self-encoded media market that form a whole of
> some sort.

Excellent point. This relates to ideas about knowledge packaging I've 
been mulling over for a while and about which I've been meaning to 
post/blog but haven't because I don't yet feel I've got it correctly 
sorted out in my mind :( (it ties together many of the different 
strands, including http://www.okfn.org/why_the_okf.html). Probably I 
should just get something down and go from there.

> I dunno who might be interested in or able to do something like this
> though. There must be a huge pile of prior art. I reckon what's needed
> first is a research project to gather who the brains are in the area to
> find out what is really missing..

It is the kind of thing where something /must/ have been done but if so 
I haven't stumbled across it yet. If anyone finds anything please post 
it to the list.

