[open-linguistics] open-linguistics Digest, Vol 45, Issue 9

Bettina Klimek bettinak86 at gmail.com
Tue Jul 29 10:42:07 UTC 2014


Re: Updated LOD Cloud Diagram - Please enter your linked datasets into
the datahub.io catalog for inclusion.


Dear all,

this is to let you know that I updated all 65 datasets at
http://datahub.io/organization/owlg so that each dataset is now tagged with
"lod" and "llod". We are now one step closer to adding/updating
these datasets in the LOD cloud and to assigning them a better category
than "cross-domain".
Please have a look at the datasets and let me know if there are any
further issues.
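
A quick way to double-check the tagging is sketched below in Python. It is
only an illustration: the record layout (a "name" plus a "tags" list of
{"name": ...} entries) is an assumption modeled on CKAN-style metadata, and
missing_tags and the sample records are hypothetical, not real catalog
entries.

```python
REQUIRED_TAGS = {"lod", "llod"}


def missing_tags(dataset, required=REQUIRED_TAGS):
    """Return the required tags that a dataset record does not carry."""
    # Assumed layout: dataset["tags"] is a list of {"name": "<tag>"} dicts.
    present = {tag["name"].lower() for tag in dataset.get("tags", [])}
    return set(required) - present


# Hypothetical sample records for illustration only.
datasets = [
    {"name": "example-lexicon", "tags": [{"name": "lod"}, {"name": "llod"}]},
    {"name": "example-corpus", "tags": [{"name": "lod"}]},
]

for ds in datasets:
    gaps = missing_tags(ds)
    if gaps:
        print(ds["name"], "is missing:", ", ".join(sorted(gaps)))
```

Any record that is reported here would still need one of the two tags.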

I found one dataset called "Cosmetic Surgeon Wearing Nursing Scrubs,
Nursing Uniforms, Expert Scrubs For Safety" which I assume to be spam
(hence tagged with "spam"). Unless somebody confirms that this dataset is a
linguistic dataset within the next two weeks, I will delete it.

All the best,
Bettina


2014-07-24 16:48 GMT+02:00 <open-linguistics-request at lists.okfn.org>:

> Today's Topics:
>
>    1. final CFP: EMNLP workshop on Taxonomy Extraction with
>       Applications in Semantics (TEXAS) (Paul Buitelaar)
>    2. Fwd: Updated LOD Cloud Diagram - Please enter your linked
>       datasets into the datahub.io catalog for inclusion.
>       (Christian Chiarcos)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Thu, 24 Jul 2014 15:37:43 +0100
> From: Paul Buitelaar <paul.buitelaar at deri.org>
> To: <corpora at uib.no>, <semantic-web at w3.org>,
>         <open-linguistics at lists.okfn.org>, <public-lod at w3.org>,
>         <sigsem at aclweb.org>, <planetkr at kr.org>
> Cc: Roberto Navigli <navigli at di.uniroma1.it>, "Bordea, Georgeta"
>         <georgeta.bordea at deri.org>, Stefano Faralli <
> faralli at di.uniroma1.it>
> Subject: [open-linguistics] final CFP: EMNLP workshop on Taxonomy
>         Extraction with Applications in Semantics (TEXAS)
> Message-ID: <53D11A37.9070408 at deri.org>
> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
> Taxonomy Extraction with Applications in Semantics (TEXAS)
> http://emnlp2014.org/workshops/TEXAS/call.html
>
> At EMNLP 2014, 29 October 2014, Doha, Qatar
>
> ** Submission deadline: July 26, 2014 **
>
> Taxonomies form the backbone of knowledge-based systems by organizing
> knowledge in a machine interpretable manner and facilitating information
> integration. Hierarchical structures provide valuable input in
> knowledge-intensive applications such as question answering and textual
> entailment and are useful tools for browsing and navigation of document
> collections, especially when applied for exploration and discovery.
>
> The TEXAS workshop aims to provide a venue for presenting and discussing
> approaches that evaluate taxonomy extraction, and its subtasks
> (term/concept extraction, term/concept relation discovery, taxonomy
> construction and cleaning) in the context of semantic applications such
> as: entity search, entity disambiguation and linking, information
> integration and summarization, knowledge acquisition, knowledge sharing,
> inference in NLP tasks (question answering, textual entailment), etc. In
> this way, progress towards automatically constructed hierarchies can be
> measured relative to other tasks and real-world applications.
>
> Expected research topics of relevance to the workshop:
>
>   * application-based evaluation of taxonomies in question answering,
>     document browsing, document clustering, expert finding or other
>     applications;
>   * using automatically constructed taxonomies for searching, browsing
>     and organizing information
>   * constructing taxonomies for/from social media
>   * probabilistic models for topic hierarchies (hierarchical topic
>     modelling)
>   * constructing taxonomies using hierarchical clustering
>   * using distributional models for taxonomy construction
>   * acquisition and modelling of categorical structure and modelling
>     human category acquisition
>   * constructing topic categorization systems and subject hierarchies
>   * constructing hierarchical faceted metadata structures
>   * methods for transforming semi-structured knowledge resources into
>     taxonomies
>   * merging and aligning existing resources for taxonomy construction
>   * comparing, aligning and evaluating existing hierarchical structures
>   * domain glossary acquisition and extracting taxonomies from definitions
>   * constructing application/domain specific taxonomies from existing
>     resources (lexical resources, Linked Open Data, Wikipedia category
>     structure, semantic networks)
>   * using different hierarchical structures (e.g., tree, DAG) and
>     relation types (e.g., hyponymy, meronymy) for taxonomy construction
>   * attaching Named Entities to hierarchical structures and using Named
>     Entities to drive taxonomy construction by extensional analysis
>   * multilinguality and taxonomies: constructing and using multilingual
>     taxonomies
>
>
> --------------------------------------------------------------------------------
> Paper Submissions
>
> Submissions should be made electronically, using Softconf at
> https://www.softconf.com/emnlp2014/texas2014/.
> Submissions should follow the two-column format of ACL 2014 proceedings
> and should not exceed 8 pages of content and one additional references
> page. The LaTeX style files and the Microsoft Word style files tailored
> for this year's conference are available at:
> http://emnlp2014.org/call.html.
>
> The reviewing of papers will be double-blind, so please make sure your
> paper shows the title, but no author information. You should likewise
> not have any self-identifying references anywhere in the paper submitted
> for review. For example, rather than this: "We showed previously (Smith,
> 2001), ...", use citations such as: "Smith (2001) previously showed
> ...". References to your own work in thesis proposals should also be
> anonymized. You may, for example, write "in X (2000) we showed",
> etc., and do not add your papers to the reference list.
>
> Important Dates
> - Paper submission: July 26, 2014
> - Paper notification: August 26, 2014
> - Camera ready: September 15
> - Workshop: October 29, 2014
>
> Further information: http://emnlp2014.org/workshops/TEXAS/call.html
>
> Workshop Organisers:
> Georgeta Bordea - Unit for Natural Language Processing, Insight,
> National University of Ireland, Galway
> Paul Buitelaar - Unit for Natural Language Processing, Insight, National
> University of Ireland, Galway
> Stefano Faralli - Linguistic Computing Laboratory, Dept. of Computer
> Science, Sapienza University of Rome, Italy
> Roberto Navigli - Linguistic Computing Laboratory, Dept. of Computer
> Science, Sapienza University of Rome, Italy
>
> The TEXAS workshop is supported by the following projects: "MultiJEDI"
> ERC Starting Grant (http://multijedi.org/), led by Prof. Roberto
> Navigli at the Linguistic Computing Laboratory of the Sapienza
> University of Rome, Italy; Linked Data and Text Mining research area
> (http://nlp.deri.ie/), led by Dr. Paul Buitelaar at INSIGHT
> (http://www.insight-centre.org/), the Irish Centre for Data Analytics,
> National University of Ireland, Galway.
>
> ------------------------------
>
> Message: 2
> Date: Thu, 24 Jul 2014 16:48:51 +0200
> From: Christian Chiarcos <christian.chiarcos at web.de>
> To: "open-linguistics at lists.okfn.org"
>         <open-linguistics at lists.okfn.org>
> Subject: [open-linguistics] Fwd: Updated LOD Cloud Diagram - Please
>         enter your linked datasets into the datahub.io catalog for
> inclusion.
> Message-ID:
>         <CAC1YGdimSe623Kc8rosEP5J5TSYu1XghNeF3h3MKm1JaMY6=
> gA at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Dear all,
>
> those of you also active in the Semantic Web community will already
> have encountered Chris Bizer's mail below. However, as this doesn't include
> everybody in the group, please forgive the redundancy: In the context of
> compiling the new LOD diagram, it would be great to include LLOD data sets
> as well, but for most of our data sets, this requires updating the
> datahub.io entries according to the specifications under
>
> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
> .
>
>
> Also, as the number of LLOD resources has grown rapidly in recent
> years, I asked him whether there would be room to create a distinct
> category for them in the LOD diagram. Except for the light green data sets
> in the diagram which are actually taken from the LOD diagram (DBpedia
> etc.), other LLOD datasets in the LOD cloud are currently mostly labeled as
> "cross-domain" resources -- somewhat dissatisfying to most of us, I guess.
>
> So, this is a *call for action*: If you want your resources to get in,
> please update your metadata, send me an email by the end of next week, and I
> will forward this as an update request to Chris.
>
> Best,
> Christian
>
> ---------- Forwarded message ----------
> From: Christian Bizer <chris at bizer.de>
> Date: 2014-07-24 14:18 GMT+02:00
> Subject: Updated LOD Cloud Diagram - Please enter your linked datasets into
> the datahub.io catalog for inclusion.
> To: public-lod at w3.org
>
>
> Hi all,
>
>
>
> Max Schmachtenberg, Heiko Paulheim and I have crawled the Web of Linked
> Data and have drawn an updated LOD Cloud diagram based on the results of
> the crawl.
>
>
>
> This diagram showing all linked datasets that our crawler managed to
> discover in April 2014 is found here:
>
>
>
>
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/LODCloudDiagram.png
>
>
>
> We also analyzed the compliance of the different datasets with the Linked
> Data best practices and a paper presenting the results of the analysis is
> found below. The paper will appear at ISWC 2014 in the Replication,
> Benchmark, Data and Software Track.
>
>
>
>
> http://dws.informatik.uni-mannheim.de/fileadmin/lehrstuehle/ki/pub/SchmachtenbergBizerPaulheim-AdoptionOfLinkedDataBestPractices.pdf
>
>
>
> The raw data used for our analysis is found on this page:
>
>
>
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/
>
>
>
> Our crawler discovered 77 datasets that do not allow crawling via their
> robots.txt files; these datasets were not included in our analysis and
> are also not included in the current version of the LOD Cloud diagram.
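
For readers who want to see how a robots.txt file can rule out crawling, here
is a minimal Python sketch using the standard library's
urllib.robotparser. It is not the crawler used for the diagram; the helper
name allows_crawling, the sample robots.txt texts, and the example.org URL
are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows_crawling(robots_txt, url, user_agent="*"):
    """Check whether the given robots.txt text permits fetching `url`."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network
    # access is needed for this check.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt contents for illustration only.
blocks_everything = "User-agent: *\nDisallow: /\n"
blocks_nothing = "User-agent: *\nDisallow:\n"

url = "http://example.org/resource/1"
print(allows_crawling(blocks_everything, url))
print(allows_crawling(blocks_nothing, url))
```

A dataset whose robots.txt resembles the first sample would be skipped by a
crawler that honors the file.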
>
>
>
> A list of these datasets is found at
>
> http://data.dws.informatik.uni-mannheim.de/lodcloud/2014/ISWC-RDB/tables/notCrawlableDatasets.tsv
>
>
>
> In order to give a comprehensive overview of all Linked Data sets that are
> currently online, we would like to draw another version of the LOD Cloud
> diagram including the datasets that our crawler has missed as well as the
> datasets that do not allow crawling.
>
>
>
> Thus, if you publish or know about linked datasets that are not in the
> diagram or in the list of not crawlable datasets yet, please:
>
>
>
> 1.       Enter them into the datahub.io data catalog by August 8th.
>
> 2.       Tag them in the catalog with the tag "lod" (
> http://datahub.io/dataset?tags=lod)
>
> 3.       Send an email to Max and Chris pointing us at the entry in the
> catalog.
>
>
>
> We will include in the updated version of the cloud diagram all datasets
> that fulfill the following requirements:
>
>
>
> 1.       Data items are accessible via dereferenceable URIs.
>
> 2.       The dataset sets at least 50 RDF links pointing at other datasets
> or at least one other dataset is setting 50 RDF links pointing at your
> dataset.
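
The two requirements above can be read as a simple check, sketched here in
Python. This is one possible reading, not the organizers' actual procedure:
it assumes that outgoing links are summed across all target datasets while an
incoming link count must reach the threshold from a single source dataset.
The function name qualifies and the link-count dictionaries are hypothetical.

```python
def qualifies(dereferenceable, outgoing, incoming, min_links=50):
    """Apply the two stated inclusion requirements to one dataset.

    outgoing: dict mapping target dataset name -> number of RDF links set
    incoming: dict mapping source dataset name -> number of RDF links received
    """
    # Assumption: outgoing links count in total across all other datasets.
    enough_outgoing = sum(outgoing.values()) >= min_links
    # Assumption: at least one single dataset must set min_links links here.
    enough_incoming = any(n >= min_links for n in incoming.values())
    return bool(dereferenceable and (enough_outgoing or enough_incoming))


# Hypothetical link counts for illustration only.
print(qualifies(True, {"dbpedia": 30, "lexvo": 25}, {}))
print(qualifies(True, {"dbpedia": 10}, {"lexvo": 20, "dbpedia": 20}))
```

Under this reading the first example passes on its 55 outgoing links, while
the second fails because no single dataset contributes 50 incoming links.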
>
>
>
> Instructions on how to describe your dataset in the catalog are found here:
>
>
>
>
> https://www.w3.org/wiki/TaskForces/CommunityProjects/LinkingOpenData/DataSets/CKANmetainformation
>
>
>
> Please make sure that you include information about the RDF links pointing
> from your dataset into other datasets (field links: ) as well as a tag
> indicating the topical category of your dataset, so that we know how to
> include it in the diagram.
>
> Please also include an example URI from your dataset in the catalog.
>
>
>
> We will start to review the new datasets and to draw the updated version of
> the LOD cloud diagram after August 8th.
>
> So please point us at datasets to be included before this date.
>
>
>
> Cheers,
>
>
>
> Max, Heiko, and Chris
>
>
>
>
>
> --
>
> Prof. Dr. Christian Bizer
>
> Data and Web Science Research Group
>
> Universität Mannheim, Germany
> chris at informatik.uni-mannheim.de
>
> www.bizer.de
>