[ckan-discuss] Spam on the data hub

Rufus Pollock rufus.pollock at okfn.org
Tue Aug 7 19:30:24 BST 2012

On 7 August 2012 18:18, Richard Cyganiak <richard at cyganiak.de> wrote:
> All,
> Looks like there was another big spam attack on the Data Hub a couple of days ago. There are some 2000 spam datasets. Look at the revision list, circa pages 16 to 121. Looks like it all happened between August 1 and August 3.

We discovered it Friday morning and, after various unsuccessful
attempts to block it, shut it down Friday evening.

> I started cleaning some of it up, but had to give up when I realized just how much it is.

Yes :-/

> The user names and dataset IDs follow a predictable pattern, so I guess someone with access to the backend could script something to clean this up?

Exactly. We're working on this and it should be gone in the next day or so.

> How can this kind of attack be prevented in the future?

Spam is one of those recurring whack-a-mole problems. Each time we've
had spam problems we have found a way to address them. E.g. last year
we moved to captchas for sign-ups and required sign-up for editing.
However, the latest spam attack got past this.

I think the only real solution is vigilance and dealing with bad
attacks like this one as quickly as possible when they happen.


