[open-science] github/R stack for the nomadic researcher

Jessy Kate Schingler jessy at jessykate.com
Wed Apr 11 18:03:55 UTC 2012


to play devil's advocate :)

i think sites like github and wordpress and all the other de facto hosted
tools are successful specifically *because* they cross community
boundaries, and as a result encourage cross pollination and collaboration,
and focus efforts and (human/dollar) support. if there were data hubs
for each possible community, then i'm worried we just end up with
fragmentation of efforts, and confusion on the part of the user ("gee, do i
post to the CS data hub or the web development data hub? oh whatever i'll
just do it later.").

on the other hand, as scientists posting to sites like thedatahub, we
actually increase exposure to our data and probability of re-use/re-mixing,
and hopefully help to dispel the notion that there is anything mysterious
or special about "real" scientists' data. we're right in there with the
data nerds and the software developers and the database admins and the non
profits and the inter-governmentals sharing and refining and asking
questions about their data. seems better for us all...

to be clear, if we need more human resources to support operations or scale
up what thedatahub.org is capable of handling, i think we should definitely
do that and am happy to help on the sysadmin side, but IMHO we would reap
greater rewards by creating a de facto place on the web that does this well
for all, than by setting up a separate community.

my 2c!
jessy


On Wed, Apr 11, 2012 at 7:48 AM, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:

>
>
> On Wed, Apr 11, 2012 at 7:45 AM, Jessy Kate Schingler
> <jessy at jessykate.com> wrote:
>
>> do people think a separate instance of ckan would be useful for the open
>> data/science community at large? or is it an issue of marketing what we
>> have (thedatahub) better?
>>
>> if the former, i'm happy to help w system administration, but it's not
>> obvious to me... curious what others think!
>>
>>
> I think we should have a separate science-datahub. I showed datahub to
> the European Horizon2020 today - very briefly.
>
>
>> jessy
>>
>>
>> On Tue, Apr 10, 2012 at 1:25 AM, Mark Wainwright
>> <mark.wainwright at okfn.org> wrote:
>>
>>> Yes indeed! Perhaps I could mention this submission that I threw
>>> together for the Open Repositories conference OR12
>>> (http://or2012.ed.ac.uk):
>>>
>>> http://ckan.okfnpad.org/or12
>>>
>>> My idea was that we could boot a new instance of ckan specialised for
>>> research papers (slightly facetiously called thepaperhub.org), but I
>>> don't know how easy this is, or whether there would be enthusiasm from
>>> someone technically literate to keep it running. (Volunteers?)
>>> Meantime thedatahub.org is a good option.
>>>
>>> I gather OR12 will be accepting/rejecting submissions on 16 April,
>>> incidentally.
>>>
>>> Mark
>>>
>>>
>>> On 2 April 2012 20:01, Peter Murray-Rust <pm286 at cam.ac.uk> wrote:
>>>
>>> > On Mon, Apr 2, 2012 at 7:23 PM, Jessy Kate Schingler
>>> > <jessy at jessykate.com> wrote:
>>> >>
>>> >> i agree on the dataforge front... git doesn't handle large files well,
>>> >> and figshare, buzzdata etc. seem to be mostly for visual or tabular data
>>> >> sets. out of curiosity, as i'm starting to learn about thedatahub.com,
>>> >
>>> >
>>> > thedatahub.org I think
>>> >
>>> >>
>>> >> it seems rather perfect for data set management, and even has change
>>> >> lists for data sets, groups, user pages, etc. (especially if there were
>>> >> some command line tools so i could "commit" changes to my data set
>>> >> periodically and upload them :)).
>>> >>
>>> >> is there a reason people find ckan/thedatahub insufficient for data
>>> >> management needs? is it related to technical features, or to people's
>>> >> familiarity and confidence around the longevity of the site?
>>> >
>>> >
>>> > It's history, I think. We should now be making the case for such a
>>> > repository and I don't think Figshare is it. I have rather neglected
>>> > datahub because the original CKAN was metadata-oriented.
>>> >
>>> > I'll be making the case in Europe next week that we badly need informal
>>> > repositories and maybe this is the time to push the datahub?
>>> >
>>> > P.
>>> >
>>> >
>>> >>
>>> >>
>>> >>
>>> >> On Mon, Apr 2, 2012 at 12:05 AM, Peter Murray-Rust <pm286 at cam.ac.uk>
>>> >> wrote:
>>> >>>
>>> >>> Tom,
>>> >>> This is a really valuable post. I feel your concerns directly. I have
>>> >>> copied in our new Panton fellows (though I am sure they read this list
>>> >>> anyway!)
>>> >>>
>>> >>> On Sun, Apr 1, 2012 at 11:16 PM, Tom Roche <Tom_Roche at pobox.com> wrote:
>>> >>>>
>>> >>>>
>>> >>>> [apologies for length of post, but it's a big topic]
>>> >>>
>>> >>>
>>> >>> No apologies needed!
>>> >>>
>>> >>> I am giving an important presentation to Europe, "Open Infrastructures
>>> >>> for Open Science", and Neelie Kroes and others will be there. I am
>>> >>> getting my thoughts together as I have to give the plenary that informs
>>> >>> the rest of the workshop. Currently my thoughts are:
>>> >>>
>>> >>> Europe (and the world) is losing 10 billion + in unused and restricted
>>> >>> data. (I said this to Hargreaves)
>>> >>> We MUST have easily accessible research repositories, probably on a
>>> >>> domain basis (Dryad, Pangaea, TARDIS, etc.)
>>> >>> Institutional Repos do not work for STM and never will
>>> >>> Mandates are a blunt weapon and so far have little effectiveness
>>> >>> Non-Commercial destroys knowledge
>>> >>>
>>> >>> We must give the researchers something they want. Sourceforge does this
>>> >>> for code. I use Sourceforge (actually now Bitbucket and Github) several
>>> >>> times a day. All my code is backed up, shareable, reusable, validated
>>> >>> etc.
>>> >>>
>>> >>> There must be a "Data forge" for Europe. Figshare was built by one
>>> >>> graduate student in one year. I would give 3rd year graduate students
>>> >>> funding to do this - it's a hundred times more cost effective than
>>> >>> repositories.
>>> >>>
>>> >>> I'd like to collect ideas on this list and present them next week
>>> >>> (11th). An OKF data manifesto for Open Science (in Europe). Who knows
>>> >>> what might come?
>>> >>>
>>> >>>
>>> >>>
>>> >>>>
>>> >>>>
>>> >>> --
>>> >>> Peter Murray-Rust
>>> >>> Reader in Molecular Informatics
>>> >>> Unilever Centre, Dep. Of Chemistry
>>> >>> University of Cambridge
>>> >>> CB2 1EW, UK
>>> >>> +44-1223-763069
>>> >>>
>>> >>> _______________________________________________
>>> >>> open-science mailing list
>>> >>> open-science at lists.okfn.org
>>> >>> http://lists.okfn.org/mailman/listinfo/open-science
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Jessy
>>> >> http://jessykate.com
>>> >>
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> Mark Wainwright, CKAN Community Co-ordinator
>>> Open Knowledge Foundation http://okfn.org/
>>> Skype: m.wainwright
>>>
>>>
>>
>>
>>
>>
>>
>>
>>
>
>
>



-- 
Jessy
http://jessykate.com

