[ciência aberta] NATURE - Tensions grow as data-mining discussions fall apart

Monday June 17 17:46:24 UTC 2013

http://www.nature.com/news/tensions-grow-as-data-mining-discussions-fall-apart-1.13130
Tensions grow as data-mining discussions fall apart

Scientists want to exempt computer-based text crawling from Europe’s
copyright law.

   - Richard Van Noorden<http://www.nature.com/news/tensions-grow-as-data-mining-discussions-fall-apart-1.13130#auth-1>

04 June 2013

Disagreement between scientists and publishers has grown on a thorny issue:
how to make it easier for computer programs to extract facts and data from
online research papers. On 22 May, researchers, librarians and others
pulled out of European Commission talks on how to encourage the techniques,
known as text mining and data mining. The withdrawal has effectively ended
the contentious discussions, although a formal abandonment can be decided
only after a commission review in July.

Scientists have chafed for years at limitations on computer-aided research.
They would like to use computer programs to crawl over thousands or
millions of articles and other online research content, extracting data to
build up databases or to pick out patterns such as associations between
genes and diseases.

But in many parts of the world, including Europe, this sort of use
currently requires permission from the content’s copyright owner. Even if
an institution has paid to access a journal, its academics do not
necessarily have permission to mine the text. Publishers, worried that
their content might be redistributed for free, tend to block data-mining
programs, giving extra licence permissions only on a slow, case-by-case
basis (see *Nature* *483,* 134–135; 2012<http://www.nature.com/uidfinder/10.1038/483134a>).
And although authors can now choose to publish under licences that
explicitly allow text mining, that innovation doesn’t help text-miners
wanting to run programs on decades of pre-existing content.

Rather than struggle through a thicket of different permissions set by
publishers, some researchers want Europe to exempt text mining from
copyright law — allowing them to run programs on content that they have
paid for, and on free content, without fear of copyright breach. Last year,
the UK government said that it plans to introduce exemptions for
non-commercial purposes. Lenient ‘fair use’ rights in the United States may
already allow text mining, depending on how the law is interpreted.

“There is an intense debate on this within the scientific and research
community, with a large number of scientists pointing at the limits of the
current copyright regulatory regime,” says Ryan Heath, a spokesman for
European Commission vice-president Neelie Kroes. “This is a very serious
issue, impacting on scientific excellence and innovation in Europe.”

To tackle the issue, last December the commission set up a working group —
one of a number under a framework called Licences for Europe — to open
discussions about new policies among publishers, researchers, librarians
and other interested parties, such as technology companies. In late
February, researchers complained in a letter to the commission that the
group was constrained to discuss only text-mining licences, and not changes
to copyright law (see *Nature* *495,* 295; 2013<http://www.nature.com/uidfinder/10.1038/495295a>)
— a restriction that would “make computer-based research in many instances
impossible”.

“Every researcher I’ve spoken to thinks licensing is a problem,” says Susan
Reilly, projects manager at the Association of European Research Libraries
in the Hague, the Netherlands. She coordinated the letter that declared the
22 May withdrawal from talks. “There was really no point in us continuing
to attend,” she says. Other signatories include the non-profit Open
Knowledge Foundation in Cambridge, UK, and the National Centre for Text
Mining at the University of Manchester, UK.

“Continuing the group under current circumstances doesn’t make sense,” says
Heath. “This is regrettable, but at least the process brought to the fore
the major controversies in this area.” The European Commission, he adds,
“will reflect on the implications and will address the matter at the time
of the review of the Licences for Europe process in July”.

The European talks had always been conflicted because four different
European Union administrative departments were involved — not only the
department for research and innovation, but also those for education and
culture, for media and information issues, and for Europe’s internal
market, economy and intellectual-property rights. (The May letter argues
that the research department is being squeezed out in favour of the others’
interests.)

“Since the Licences for Europe process has not managed to deliver in this
area, other ways forward must be explored,” says Heath. An analysis under
way by the commission’s internal-market department on the need for
copyright reform may provide impetus for action, should it conclude that
changes are needed.

Many publishers say that there are practical, as well as legal, barriers to
text mining. Even if the practice were permitted through licences or
changes to copyright law, researchers would still need a way to access
websites without crippling publisher servers through excess traffic. And
publishers want to be able to identify the purpose of the programs crawling
their content, especially if mining is for commercial purposes, so as to
decide “what they’re willing to allow at what cost”, says Sarah Faulder,
chief executive of the Publishers Licensing Society in London, an industry
body that took part in the talks.

To lower some of these practical barriers, the non-profit publisher
collaboration CrossRef hopes to launch technology this year enabling
text-mining researchers to agree to terms by clicking a button on a
publisher’s website.

Discussions may have faltered, but scientists and librarians hope to keep
talking to officials, says Reilly. “There’s lots of disagreement even among
publishers,” she says. “Some are open to text and data mining, some are
completely frightened of it. They need an informed discussion.”
Nature  498, 14–15 (06 June 2013)  doi:10.1038/498014a

-- 
*Carolina Rossini*
http://carolinarossini.net/
+ 1 6176979389
*carolina.rossini at gmail.com*
skype: carolrossini
@carolinarossini
