[open-linguistics] Linguistic relevance

Kristian Kankainen kristian at eki.ee
Sun Jan 25 19:35:19 UTC 2015


Maybe one idea ... Last year I read a DASISH report about quality 
metadata. It referred to another article[1], which says of metadata that 
"we don't know when data is metadata or just data. [...] 
the usage turns it into metadata".
I haven't read the article they cite, but I think the same applies to 
the linguistic relevance of data: it is our usage (analysis) that 
turns data into linguistically relevant knowledge. This "usage" aspect 
also captures your "/if they include incoming or outgoing links/ with at 
least one linguistic resource in a strict sense" point.

Does it make sense, then, to say that linguistic relevance in a strict 
sense refers to a quality of the data: a) having explicit linguistic 
annotation (morphological analysis etc.), b) having implicit linguistic 
annotation (alignments in a wide sense etc.), or c) being used to explain 
or describe a linguistic phenomenon? The last point should cover e.g. 
training sets in NLP, but also linguistics in general.

Probably it doesn't add anything that Jonathon Pool's comment didn't 
already contain.

Best,
Kristian

[1] Bargmeyer, B., & Gillman, D. (2000). Metadata standards and metadata 
registries: An overview. Retrieved from 
http://stats.bls.gov/ore/pdf/st000010.pdf


23.01.2015 13:23, Christian Chiarcos wrote:
> Hi Kristian,
>
> actually, I meant that to be an "OR", precisely for the reason that an 
> associated publication would be too strict. However, an "associated 
> publication" may also be a paper using a resource provided by a third 
> party, so "non-academic" resources would be included as soon as 
> someone in the community refers to them.
>
> By the second criterion, I tried to include results of (unpublished) 
> master's theses, etc., or anything provided by companies or more 
> IT-/NLP-oriented colleagues. But again, this leaves room for 
> interpretation, so an alternative formulation would be better. Any idea?
>
> Best,
> Christian
>
>
>
> 2015-01-23 9:39 GMT+01:00 Kristian Kankainen <kristian at eki.ee>:
>
>     Hello!
>
>     Excuse my intrusion into the debate without introducing myself. As
>     I work at the Institute of Estonian Language, I feel included in
>     Christian's second point. But I want to argue against the
>     importance of having an associated publication.
>
>     I think there exist many datasets without a publication that can
>     be even more linguistically motivated than those that have one.
>     They often convey more pragmatic semantics
>     in a dictionary-like sense (thus exposing mainly "is_a" kind of
>     relations). Datasets of this kind are often developed by a
>     working group or an individual who might not match the
>     criteria of specialization in linguistics etc., but they are made
>     to meet the need for a "look-up function". This functionality
>     might very well be general enough to be used by others. I think
>     this "usability by others" factor could be said to convey
>     linguistic relevance, if we view them, like linguistic signs, as
>     agreed-upon but arbitrary :-).
>
>     Maybe I just got the logic behind Christian's list wrong: a
>     logical AND of the two points feels too strict to me. But a
>     logical OR also feels too lax (for a country with 1.5 million people,
>     specialization is necessarily a shallower concept than in a big
>     country).
>
>     Best wishes
>     Kristian Kankainen
>
>
>     22.01.2015 14:37, Christian Chiarcos wrote:
>
>         What are your ideas about the following:
>         - having an associated publication at a linguistic or CL venue
>         (LSA, DGfS, ALT, ...; LREC, ACL, COLING, ...) or in a
>         corresponding journal or series (LREJ, TACL, ...), or
>         - being developed at a (university or company) department or
>         by an individual specialized in linguistics, philology,
>         lexicography, natural language processing, or localization.
>
>
>     _______________________________________________
>     open-linguistics mailing list
>     open-linguistics at lists.okfn.org
>     <mailto:open-linguistics at lists.okfn.org>
>     https://lists.okfn.org/mailman/listinfo/open-linguistics
>     Unsubscribe: https://lists.okfn.org/mailman/options/open-linguistics
>
