[open-humanities] The importance of search

Lars Aronsson lars at aronsson.se
Mon Feb 27 10:24:14 UTC 2012


On 02/24/2012 12:37 PM, Nick Stenning wrote:
> Here, the hidden heuristic is "search only documents written by
> Nietzsche" -- this would be trivial to implement manually, right? Just
> require the user to type in "author:Nietzsche". But a) for people with
> less unusual names, this doesn't uniquely identify them, potentially
> generating many spurious results, and b) this could be a simple "same
> author" checkbox. A simple heuristic that says "users frequently want
> to search works of the author they are currently reading" helps out a
> lot.

You also have the problem of uneven coverage. Suppose you scanned
2 different editions of Nietzsche's work A, 1 edition of work B, and
0 editions of work C. Instead of the occurrences in A+B+C, a search
would return 2A+B.

But then again, maybe work A was twice as important, with twice the
impact and twice the number of readers, so 2A+B is fairer than A+B+C.
After all, the library whose books you scanned held 2 editions of A.
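
To make the arithmetic concrete, here is a minimal Python sketch of the
two ways of counting. The inventory, the hit counts and the function
names are all made up for illustration; nothing here describes how any
real search engine counts. Work C is absent from the list because no
edition of it was scanned, so neither count can recover A+B+C.

```python
from collections import defaultdict

# Hypothetical scan inventory: (work, edition, occurrences of the search term).
# Work C exists but was never scanned, so it does not appear here at all.
scanned_editions = [
    ("A", "edition 1", 3),
    ("A", "edition 2", 3),
    ("B", "edition 1", 5),
]


def hits_per_edition(editions):
    """Sum hits over every scanned edition: the '2A+B' count."""
    return sum(hits for _work, _edition, hits in editions)


def hits_per_work(editions):
    """Count each work once, using the largest hit count among its editions."""
    best = defaultdict(int)
    for work, _edition, hits in editions:
        best[work] = max(best[work], hits)
    return sum(best.values())


print(hits_per_edition(scanned_editions))  # 3 + 3 + 5 = 11
print(hits_per_work(scanned_editions))     # 3 + 5 = 8
```

Which of the two numbers is the "right" one depends on whether the
number of editions is taken as noise or as a signal of importance.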


-- 
   Lars Aronsson (lars at aronsson.se)
   Aronsson Datateknik - http://aronsson.se

   Project Runeberg - free Nordic literature - http://runeberg.org/




