[ok-edinburgh] Open Knowledge meeting
Jo Walsh
jo at frot.org
Thu Feb 25 13:06:18 UTC 2010
dear Bonnie, thanks for this.
On 25/02/2010 13:20, Bonnie Webber wrote:
> Robin/Jo - Is this too far-fetched a link to the content
> of your Open Knowledge meeting?
>
> http://www.nature.com/news/2009/090824/full/news.2009.857.html
>
> The robots that people are assuming will automatically
> annotate and enrich documents created in Google Wave can
> only work if the databases and texts they need to crawl
> are themselves open.
Robin mentioned Peter Murray-Rust, and I was thinking of him too.
He did a talk at a workshop on Text Mining applications in Manchester
last year on just this subject. A memorable line:
"My bots know a lot about chemistry, but nothing about copyright".
He challenged the speaker from Elsevier to commit to making text that
is currently "free" but not open for re-use available to automated
natural language processing tools, but the speaker could not commit.
Similar situation with OCLC: their terms of use on WorldCat expressly
prohibit any automated crawling and parsing of the bibliographic
metadata and fulltext papers, even for a pure research application.
There's been a lot of sponsorship of text mining / enhancement
techniques in the physical sciences, particularly in chemistry and
bioinformatics, where there's lots of consistent vocabulary, potential
for serendipitous findings, generally low-hanging fruit.
The same techniques could work in law, archaeology and the social sciences.
There doesn't seem to be the same level of support there, but researchers
scratching their own itches could prove viability for future investment, if
the corpora they were working with had no re-use restrictions.
cheers,
jo