[open-bibliography] OCLC's license for FAST

Tom Morris tfmorris at gmail.com
Thu Jan 5 00:12:27 UTC 2012


On Wed, Jan 4, 2012 at 6:25 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:
> Quoting Tom Morris <tfmorris at gmail.com>:
>
>
>> I think the decomposed structure is more user-friendly and useful, but
>> scraping the OCLC site strikes me as an unnecessary level of
>> indirection.  You're still at the mercy of OCLC's proprietary
>> processing script and hosting.  If they decide to pull the plug,
>> you're left out in the cold.
>
>
> Here's the link to the FAST converter. It's not quite as simple as just
> breaking apart the headings at the dash-dash points:
>
> http://www.oclc.org/research/activities/fastconverter/default.htm
>
> The advantage of using the OCLC version is that it will be kept up to date.
> I don't know what the converter, run periodically, would result in, but
> perhaps someone can reason that out from the code?

Thanks for the link, Karen.  Unfortunately, that's a) just a web
interface to the converter, not the converter itself, and b) licensed
under OCLC's restrictive license, not an open source license, so it
wouldn't be reusable even if the actual software were there.

I reviewed some of the presentations and papers available and it's
apparent that the algorithms used have evolved over time and that they
would be non-trivial to reproduce, at least as described.  Having said
that, a simpler implementation might well be just as easy and
effective for users to deal with.
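
To make "simpler implementation" concrete, here is a rough sketch of
the most naive approach: splitting a heading at its dash-dash
separators.  The function name and the sample heading are my own
illustrations, and this is not OCLC's algorithm.  As Karen notes, the
real converter does more than this; for instance, this sketch makes no
attempt to decide which kind of facet each piece belongs to.

```python
def split_heading(heading: str) -> list[str]:
    """Naively split a subject heading string at its "--" separators.

    Hypothetical helper for illustration only; it does not classify the
    resulting pieces or validate them against any authority file.
    """
    # Split on the double dash, trim whitespace, and drop empty pieces.
    parts = [part.strip() for part in heading.split("--")]
    return [part for part in parts if part]


# Single dashes (such as date ranges) are left alone.
print(split_heading("France--History--Revolution, 1789-1799"))
```

Whether output like this would be good enough for users is exactly the
open question.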

For the short term, scraping the publicly available data is the best
stopgap.  Longer term, the Library of Congress should be encouraged
to either get OCLC to release the current code or to develop an
open source replacement for it.  Anyone talking to the Library of
Congress should be continually reinforcing that it is unacceptable to
use tax dollars to support OCLC in maintaining their monopolistic
practices.

Tom



