[open-science] Publishing curated email lists

William Waites ww at eris.okfn.org
Thu Jun 23 20:31:18 UTC 2016


Alexandre Hannud Abdo <abdo at member.fsf.org> writes:

> I wouldn't go that far

Well, ok, I overstated. I should have said, if you do not have a good
understanding of the medium that you are using, how it operates, who
operates it, what their motivations are, and who can see what and when,
then the only safe attitude to take is that it is public.

> public places on the Internet are not like your bar down the street

Quite. In that sense, it is more public than the everyday sense of the
word.

Stacy Konkiel <stacy at altmetric.com> writes:

> If that's a "delusional" thought to have... everyone who spoke out
> after the recent OKCupid/OSF data archiving kerfuffle.

I wasn't aware of that kerfuffle (haven't really been paying attention
much). It's a perfect example of people misjudging the privacy
properties of the medium they were using to communicate. A bunch of
people entered information into a web site without properly
understanding the ways that the information could get out. They erred in
their decision to trust the assurances in OKCupid's terms and conditions
that harvesting it wasn't allowed.  And they got bitten. People do things
that aren't allowed.  It's unfortunate, but entirely predictable.

Perhaps you could say that this data was ill-gotten, in violation of the
rules, so shouldn't be easily available to researchers (apparently it's
of a kind that is rare and valuable for research in a certain
field). However, contrast: when WikiLeaks started publishing the
diplomatic cables that came from Chelsea Manning, I suggested that they
should be put on CKAN. They are also rare and valuable for research in a
certain field, but likewise ill-gotten, obtained in violation of the
rules. The US military stored and distributed these documents in a way
that was at odds with the medium they were using (if they can make this
same error, surely we cannot blame the poor OKCupid users too much).

At the time, OKFN felt it was too risky and was concerned with licensing
(hah!). But I suspect most people in the scientific community would say
it's a good thing to have them discoverable and searchable and that they
are legitimately usable as source material. So what's the difference?
Is it that the OKCupid users are individuals? Maybe. But they did
explicitly share that information with anybody who could make an OKCupid
account, which is, in principle, everybody. That is unlike the diplomatic
cables, where it wasn't possible for just anyone to walk in off the
street and get them.

This now wanders a bit from the thread topic, which is about archiving
and making discoverable fora that are explicitly public. Both of the
above examples are corner cases where there is at least some ethical
dilemma. That's not at all the case for listserv archives.

Cheers,
-w


