[okfn-labs] String Clustering Fun

Tom Morris tfmorris at gmail.com
Sat May 11 17:02:02 UTC 2013


On Sat, May 11, 2013 at 12:08 PM, David Raznick <david.raznick at okfn.org> wrote:

>
> Google Refine has some tools like this, but they are slow and do not really
> fit with my workflow.
>

Do you have performance figures on how your algorithm compares to the
various Refine algorithms?  We're always looking to improve.


> So my spare time has been dedicated to finding the quickest algorithms for
> string similarity clustering which bring back mostly "useful results".  If
> anyone is interested in the latest academic research on how to do this fast
> then I am happy to bore you with it.
>

Rather than boring us, perhaps you could just provide a link to the
research bibliography along with your definition of "quickest" and "mostly
'useful results'".  That would help folks determine whether their metrics
are aligned with those that you used in the evaluation.


> I have set up a very experimental endpoint with the fastest method
> researched.
>

That method being ... ?

Tom