[okfn-br] WEB DATA CLEANSING is what the ACM says we do at OKBr!
Peter Krauss
ppkrauss at gmail.com
Fri Jul 3 20:07:20 UTC 2015
OKBr folks,
Many of us here at OKBr have worked hard with the bits: we usually call
this kind of work "data scraping
<https://en.wikipedia.org/wiki/Data_scraping>"... And from what I
understand, another term is now gaining currency that complements the
description of these efforts: "*data cleansing*
<https://en.wikipedia.org/wiki/Data_cleansing>".
I am attaching (as a forwarded e-mail) a call for submissions from the
well-known ACM, which will probably help cement the use of the term
*"web data cleansing"*. A quick search shows that there are already
companies specializing in this (!), probably to compete with OKFN :-)
The call also includes an opportunity for the Semantic Web folks...
Overall, it all seems closely related to what OKFN and OKBr do!
- - - -
I changed the subject line from the original *"Fwd: CfP: ACM Journal of
Data and Information Quality (JDIQ): Special Issue on Web Data Quality"*,
since the ACM is not exactly OpenAccess
<https://en.wikipedia.org/wiki/OpenAccess>, so some people might brand me
a poster boy ;-)
Our scientific consultant here at OKBr, in any case, has already commented
that the ACM, as a *publisher*, "currently has a more-or-less-open-access
option".
---------- Forwarded message ----------
From: Christian Bizer <chris at bizer.de>
Date: 2015-07-02 4:58 GMT-03:00
Subject: CfP: ACM Journal of Data and Information Quality (JDIQ): Special
Issue on Web Data Quality
To: public-lod at w3.org, semantic-web at w3.org, public-vocabs at w3.org,
dbpedia-discussion at lists.sourceforge.net, semanticweb at yahoogroups.com
Hi all,
we are happy to announce that the ACM Journal of Data and Information
Quality (JDIQ) will feature a special issue on Web Data Quality.
The goal of the special issue is to present innovative research in the
areas of Web Data Quality Assessment and Web Data Cleansing.
The submission deadline for the special issue is November 1st, 2015.
Please find the detailed call for papers below and at
http://jdiq.acm.org/announcements.cfm#special-issue-of-acm-jdiq-on-web-data-quality
Best,
Luna Dong, Ihab Ilyas, Maria-Esther Vidal, and Christian Bizer
---------------------
Call for Papers:
ACM Journal of Data and Information Quality (JDIQ)
Special Issue on Web Data Quality
---------------------
Guest editors:
* Christian Bizer, University of Mannheim, Germany,
chris at informatik.uni-mannheim.de
* Luna Dong, Google, USA, lunadong at google.com
* Ihab Ilyas, University of Waterloo, Canada, ilyas at uwaterloo.ca
* Maria-Esther Vidal, Universidad Simon Bolivar, Venezuela,
mvidal at umiacs.umd.edu
Introduction:
The volume and variety of data that is available on the web has risen
sharply. In addition to traditional data sources and formats such as CSV
files, HTML tables and deep web query interfaces, new techniques such as
Microdata, RDFa, Microformats and Linked Data have found wide adoption. In
parallel, techniques for extracting structured data from web text and
semi-structured web content have matured, resulting in the creation of
large-scale knowledge bases such as NELL, YAGO, DBpedia, and the Knowledge
Vault.
Independent of the specific data source or format or information extraction
methodology, data quality challenges persist in the context of the web.
Applications are confronted with heterogeneous data from a large number of
independent data sources while metadata is sparse and of mixed quality. In
order to utilize the data, applications must first deal with this widely
varying quality of the available data and metadata.
Topics:
The goal of this special issue of JDIQ is to present innovative research in
the areas of Web Data Quality Assessment and Web Data Cleansing. Specific
topics within the scope of the call include, but are not limited to, the
following:
WEB DATA QUALITY ASSESSMENT:
* Metrics and methods for assessing the quality of web data, including
Linked Data, Microdata, RDFa, Microformats and tabular data.
* Methods for uncovering distorted and biased data / data SPAM detection.
* Methods for quality-based web data source selection.
* Methods for copy detection.
* Methods for assessing the quality of instance- and schema-level links
in Linked Data.
* Ontologies and controlled vocabularies for describing the quality of web
data sources and metadata.
* Best practices for metadata provision.
* Cost and benefits of web data quality assessment and benchmarks.
WEB DATA CLEANSING:
* Methods for cleansing Web data, Linked Data, Microdata, RDFa,
Microformats and tabular data.
* Conflict resolution using semantic knowledge and truth discovery.
* Human-in-the-loop and crowdsourcing for data cleansing.
* Data quality for automated knowledge base construction.
* Empirical evaluation of scalability and performance of data cleansing
methods and benchmarks.
APPLICATIONS AND USE CASES IN THE LIFE SCIENCES, HEALTHCARE, MEDIA, SOCIAL
MEDIA, GOVERNMENT AND SENSOR DATA.
Important dates:
Initial submission: November 1, 2015
First review: January 15, 2016
Revised manuscripts: February 15, 2016
Second review: March 30, 2016
Publication: May 2016
Submission guidelines:
http://jdiq.acm.org/authors.cfm
--
Prof. Dr. Christian Bizer
Data and Web Science Group
University of Mannheim, Germany
chris at informatik.uni-mannheim.de
http://dws.informatik.uni-mannheim.de/bizer