[open-bibliography] Disambiguation, deduplication and 'ideals'
    Ben O'Steen 
    bosteen at gmail.com
       
    Wed Sep  1 08:19:28 UTC 2010
    
    
  
On Wed, 2010-09-01 at 05:08 +0200, Thomas Krichel wrote:
> Karen Coyle writes
> 
> > As you can see, the questions go on and on!
>  
>   Deduplication is also service context dependent. ...
I absolutely agree, and I'll add that when you are de-duplicating for
any of these reasons, you will be using a probabilistic method of some
kind 99% of the time ;) Whether it's a Fellegi-Sunter based whole-record
dedupe or single-field (e.g. id) matching, there will be false
positives and false negatives.

Your success rate will always be <100%, and the degree of success will
vary depending on who did it and for what purpose.
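
To make the whole-record case concrete, here is a rough sketch of
Fellegi-Sunter style scoring. Everything specific in it is an assumption
for illustration only: the field names, the m/u probabilities, the
thresholds and the helper names (normalise, match_weight, classify) are
made up, and a real system would estimate the probabilities from data
rather than hard-code them.

```python
import math

# Hypothetical per-field probabilities (illustrative values, not estimated):
#   m = P(field agrees | both records describe the same work)
#   u = P(field agrees | the records describe different works)
FIELD_PROBS = {
    "title": (0.95, 0.01),
    "author": (0.90, 0.05),
    "year": (0.85, 0.10),
}


def normalise(value):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(value).lower().split())


def match_weight(rec_a, rec_b, probs=FIELD_PROBS):
    """Sum the log2 agreement/disagreement weights over all compared fields."""
    total = 0.0
    for field, (m, u) in probs.items():
        if normalise(rec_a.get(field, "")) == normalise(rec_b.get(field, "")):
            total += math.log2(m / u)  # agreement is evidence for a match
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement counts against
    return total


def classify(weight, lower=0.0, upper=6.0):
    """Two assumed thresholds give three outcomes; the middle band needs review."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "possible"


a = {"title": "Open Bibliography", "author": "J. Smith", "year": 2010}
b = {"title": "open  bibliography", "author": "J. Smith", "year": 2011}
print(classify(match_weight(a, b)))
```

Moving the two thresholds trades false positives against false
negatives, which is where the service context comes in: a union
catalogue and a citation counter would reasonably pick different cut-offs.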
Ben
    
    