[okfn-labs] nomenklatura - thinking about naming

Rufus Pollock rufus.pollock at okfn.org
Wed Apr 24 09:18:39 UTC 2013


On 24 April 2013 09:01, Friedrich Lindenberg <friedrich at pudo.org> wrote:
> Hi all,
>
> I want to brush up the interface and docs for nomenklatura
> (http://nomenklatura.pudo.org/) at some point, and the hardest thing about
> this project is naming. Let me give you my one-sentence summary of what
> nomenklatura does:
>
> Nomenklatura is a data cleansing service that provides automated and manual
> options for merging multiple forms of a name into a canonical form.
>
> Example: when going through political databases, you may encounter not just
> "Angela Merkel", but also "Angela Merkel, CDU", "Angela Merkel, Chancellor",
> "Mrs. Angela Merkel", "MERKEL, Angela" etc. Nomenklatura does some basic
> normalisation and matching to solve the easy cases here, and then gives a
> nice UI to solve the harder merges by hand. In the end, you would have a
> single entry with a list of aliases.
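
To make the "easy cases" concrete, here is a minimal sketch of what basic
normalisation and matching could look like. It is not nomenklatura's actual
code: the function names, the NOISE word list and the idea of using sorted
tokens as a comparison key are all assumptions made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical list of titles and role words to ignore when comparing names.
# This is an illustrative assumption, not nomenklatura's real configuration.
NOISE = {"mr", "mrs", "ms", "dr", "cdu", "chancellor"}


def normalise(name):
    """Reduce a raw name to a crude comparison key."""
    # Lower-case and split on anything that is not a word character,
    # which drops commas and full stops.
    tokens = re.findall(r"\w+", name.lower())
    # Drop titles and role words, then sort so that word order is ignored
    # ("MERKEL, Angela" and "Angela Merkel" produce the same key).
    return " ".join(sorted(t for t in tokens if t not in NOISE))


def group_aliases(names):
    """Group raw names that share the same normalised key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalise(name)].append(name)
    return dict(groups)


raw_names = [
    "Angela Merkel",
    "Angela Merkel, CDU",
    "Angela Merkel, Chancellor",
    "Mrs. Angela Merkel",
    "MERKEL, Angela",
    "A. Merkel",
]

for key, aliases in group_aliases(raw_names).items():
    print(key, "<-", aliases)
```

Names whose keys still differ after this step, like "A. Merkel" here, are the
ones that would be left for merging by hand.
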
>
> At the moment, the canonical form is called a "Value" in the domain model,
> while the aliases are called "Link". This has led to confusion. I therefore
> want to rename the domain entities, so here are my questions:
>
> 1) What would people on this list call the canonical value (e.g. Entity,
> Lemma, ...)?

I'd go for entity.

> 2) What about the aliases (e.g. Alias, Link, SurfaceForm)?

I'm +1 on Alias (which I think is very clear, and better than Link).
I also like Gregor's suggestion of

> 3) How would you pitch it?

Is reconciliation becoming a term of art here? i.e. Nomenklatura is
hosted Reconciliation.

Gregor's suggestion of Named Entity Normalization seems very good too
- says what it does on the tin.

Rufus

> Thanks for any help!
>
> - Friedrich



