[ckan-discuss] Despamming datahub

Mark Wainwright mark.wainwright at okfn.org
Tue Mar 19 12:39:34 GMT 2013


Here's a suggestion I made about spam control a while ago. Spam users
are identified heuristically, and an admin user can view the list a
screenful at a time (like your spam mail folder). You check or
uncheck the tick boxes to confirm which ones are spammers and hit the
big red 'nuke' button. It deletes and purges the spam users and all
their created groups and datasets, and undoes any revisions they've
made elsewhere (as long as the dataset hasn't been edited later).
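
To make that concrete, here is a rough sketch of the logic in plain
Python. It is not CKAN code: Site, Dataset, suspects_page and nuke are
names made up for illustration, the data lives in memory, and the
heuristic is passed in as a function so it can be swapped for something
better than "which email provider do they use".

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    creator: str
    # Revision history, oldest first, as (author, content) pairs.
    revisions: list = field(default_factory=list)


@dataclass
class Site:
    users: set = field(default_factory=set)
    groups: dict = field(default_factory=dict)    # group name -> creator
    datasets: dict = field(default_factory=dict)  # dataset name -> Dataset


def suspects_page(site, looks_spammy, page=0, page_size=20):
    """Return one screenful of heuristically flagged users."""
    flagged = sorted(u for u in site.users if looks_spammy(u))
    start = page * page_size
    return flagged[start:start + page_size]


def nuke(site, confirmed):
    """Purge the confirmed spammers and whatever they created."""
    confirmed = set(confirmed)
    site.users -= confirmed
    site.groups = {g: c for g, c in site.groups.items()
                   if c not in confirmed}
    site.datasets = {n: d for n, d in site.datasets.items()
                     if d.creator not in confirmed}
    for dataset in site.datasets.values():
        # Undo spam revisions only while they are the most recent ones;
        # a spam edit followed by a later legitimate edit is left alone.
        while dataset.revisions and dataset.revisions[-1][0] in confirmed:
            dataset.revisions.pop()
```

The admin screen would call suspects_page to fill the tick-box list and
pass only the ticked names to nuke, so nothing is purged on the
heuristic alone.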

What do you think?

Mark

On 19/03/2013, Ross Jones <ross at servercode.co.uk> wrote:
> Definitely think a demo account with a simple password is a good plan.
>  Setting up the demo account should be straightforward, and running a task
> to delete the data every 4 hours (where that data is more than 4 hours old)
> equally so.  The verified accounts would be the only potential stumbling
> block to doing this. Also, I'd love to see a feature where accounts can
> have logins enabled/disabled as well, to make it easier to manage.
>
> Ross.
>
>
> On Tue, Mar 19, 2013 at 10:26 AM, Velichka Dimitrova <
> velichka.dimitrova at okfn.org> wrote:
>
>> I'd also support some higher entry barrier for long-term deposits of
>> metadata or data.
>>
>> In terms of being open - I think it is valuable to have a site which
>> users
>> can access and upload data or links to straight away in order to see what
>> the possible features and benefits are. Maybe this could be some demo
>> feature - where such "experimental uploads" are deleted after several
>> hours.
>>
>> For groups and communities which are using / would like to use Datahub as
>> an actual portal and metadata repository, some user verification process
>> should not be a problem, it would not reduce openness in my opinion.
>>
>> Velichka Dimitrova
>> Open Economics Project Coordinator
>> Open Knowledge Foundation
>> http://okfn.org | http://openeconomics.net
>>
>>
>>
>>
>> On 19 March 2013 10:17, Ross Jones <ross at servercode.co.uk> wrote:
>>
>>> No problem, I wrote some code to do it.  It isn't a perfect solution: a
>>> lot of the users creating these groups need to be deleted, and I am
>>> nervous about doing that without better heuristics than their choice of
>>> email provider.
>>>
>>> I think it might be better to force users through either an email
>>> verification process, or to provisionally put their first group/dataset
>>> creation into a pending state for moderation.  I realise this would
>>> reduce how open the site is, but I can't see another viable way to
>>> reduce all the spam.
>>>
>>> Ross
>>>
>>>
>>>
>>> On Tue, Mar 19, 2013 at 9:22 AM, Velichka Dimitrova <
>>> velichka.dimitrova at okfn.org> wrote:
>>>
>>>> Excellent work, Ross! I agree with Mark.
>>>>
>>>> Thank you very much.
>>>>
>>>> Velichka Dimitrova
>>>> Open Economics Project Coordinator
>>>> Open Knowledge Foundation
>>>> http://okfn.org | http://openeconomics.net
>>>>
>>>>
>>>>
>>>>
>>>> On 18 March 2013 15:28, Mark Wainwright
>>>> <mark.wainwright at okfn.org> wrote:
>>>>
>>>>> This is great, Ross! Thanks for this. The spam groups have been
>>>>> disfiguring the DataHub for a while. Now the groups page looks much
>>>>> happier:
>>>>>
>>>>> http://datahub.io/group
>>>>>
>>>>> >>  Would it be possible for someone to turn off group-creation until
>>>>> >> the datahub gets migrated to 2.0?  I guess any urgently needed
>>>>> >> groups could approach the list to ask in the meantime.
>>>>>
>>>>> Hopefully one of the devs will step in ...
>>>>>
>>>>> Mark
>>>>>
>>>>>
>>>>> On 18/03/2013, Ross Jones <ross at servercode.co.uk> wrote:
>>>>> > Hi Sara,
>>>>> >
>>>>> > It shouldn't be an issue on 1.x, it depends how open you want it to
>>>>> > be, but as it is probably best to be working with 2.0 now...
>>>>> >
>>>>> > It should certainly be less of an issue on 2.0, unless you leave it
>>>>> > open to anybody to create a group.  I guess ideally for open systems
>>>>> > it should be possible to have the group go into a pending state for
>>>>> > verification before it was created.  I haven't checked whether this
>>>>> > is the case or not yet.  Even if not in core it should be possible
>>>>> > to do this in an extension.
>>>>> >
>>>>> > Ross
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Mon, Mar 18, 2013 at 2:38 PM, Sara Farmer
>>>>> > <sara.farmer at btinternet.com> wrote:
>>>>> >
>>>>> >>  Phew...  Does that mean that spammers using fake groups won't be
>>>>> >> an issue in 2.0?
>>>>> >>
>>>>> >> I'm asking because I just set up my own CKAN node and am clunking
>>>>> >> my way through getting it working for me... I locked it down
>>>>> >> because I saw all the spam on the datahub, and wasn't quite sure
>>>>> >> what the best way to avoid the same fate was (and as the recipient
>>>>> >> of more than one "we've closed your site because of spammers"
>>>>> >> message from providers in the past, this scares me a little more
>>>>> >> than most).
>>>>> >>
>>>>> >> Thanks,
>>>>> >>
>>>>> >>
>>>>> >> Sj.
>>>>> >>
>>>>> >>
>>>>> >> On 3/18/2013 10:04 AM, Ross Jones wrote:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >>  I've started de-spamming the datahub, there were *lots* of fake
>>>>> >> groups created, and as there are some fairly easy heuristics in
>>>>> >> identifying them (thanks hotmail) I've written a script that'll
>>>>> >> mark them all as deleted.
>>>>> >>
>>>>> >>  I'm only soft-deleting them (just in case) and unfortunately users
>>>>> >> don't have that option so I've erred on the side of caution and
>>>>> >> temporarily left them (until I can come up with a safer set of
>>>>> >> rules).
>>>>> >>
>>>>> >>  Would it be possible for someone to turn off group-creation until
>>>>> >> the datahub gets migrated to 2.0?  I guess any urgently needed
>>>>> >> groups could approach the list to ask in the meantime.
>>>>> >>
>>>>> >>  Ross
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> _______________________________________________
>>>>> >> ckan-discuss mailing list
>>>>> >> ckan-discuss at lists.okfn.org
>>>>> >> http://lists.okfn.org/mailman/listinfo/ckan-discuss
>>>>> >> Unsubscribe: http://lists.okfn.org/mailman/options/ckan-discuss
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>


