[okfn-labs] CrowdCrafting application to get more info on Public Bodies (starting with EU)

Friedrich Lindenberg friedrich.lindenberg at okfn.org
Mon Nov 4 12:08:31 UTC 2013


Another random note: the Publications Office maintains the EU Who Is Who,
which they announced some time back would be released as open data. Doesn't
look like it's on the Commission portal yet - perhaps worth pinging them?

- Friedrich


On Mon, Nov 4, 2013 at 12:14 PM, Friedrich Lindenberg <friedrich.lindenberg at okfn.org> wrote:

> Hi all,
>
> for what it's worth, I've uploaded a list of all European (+member state)
> public bodies that have awarded major public contracts over the last 6
> months: http://opendatalabs.org/misc/authorities.csv
>
> This set is going to grow from now on, so it may make sense to consider
> how we can use it as a data source for publicbodies.eu in the future.
> Using CrowdCrafting to clean it up would be ideal!
>
> What do people think?
>
> - Friedrich
>
>
>
> On Mon, Nov 4, 2013 at 10:49 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>
>> On 30 October 2013 07:57, Daniel Lombraña González <teleyinex at gmail.com> wrote:
>>
>>> Hi Rufus,
>>>
>>> The application should be easy to build. I've just imported the CSV in
>>> the GitHub repository, and we automatically have 129 tasks to deliver to
>>> our users :-)
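
The "one CSV row becomes one task" idea described above can be sketched in
a few lines of Python. This is only an illustration using the standard
library, not PyBossa's importer; the column names "title" and "url", the
helper name rows_to_tasks, and the sample rows are all hypothetical.

```python
import csv
import io


def rows_to_tasks(csv_text: str) -> list[dict]:
    """Turn each CSV row into one task dict (hypothetical task shape)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    tasks = []
    for row in reader:
        # Columns "title" and "url" are assumed; missing values become "".
        title = (row.get("title") or "").strip()
        url = (row.get("url") or "").strip()
        # Keep rows without a URL, but flag them so they can be handled apart.
        tasks.append({"title": title, "url": url, "has_url": bool(url)})
    return tasks


# Made-up sample data, only to show the shape of the input.
SAMPLE = "title,url\nExample Body A,http://example.org/a\nExample Body B,\n"

if __name__ == "__main__":
    for task in rows_to_tasks(SAMPLE):
        print(task)
```

Flagging rows without a URL, rather than dropping them, keeps the task
count equal to the row count.
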
>>>
>>
>> Great - do you have a link to the app?
>>
>>
>>> The main problem will be how to extract those 2 paragraphs from the
>>> web pages, and how to make them consistent when 2 or 3 people enter the
>>> same information. I've checked two pages (note that some rows don't have
>>> a URL) and they do not follow the same structure at all. Thus, the
>>> instructions should be to find a section on the web page such as "Who we
>>> are", "What we do", "About us", etc., select 1 paragraph, copy it, and
>>> paste it. The same goes for the other fields.
>>>
>>
>> I don't think we want to try to auto-extract anything; I'd leave it to
>> users. I'd also set it up so that each task only needs to be done once
>> (at least in the first instance), so differing answers from different
>> people are less of an issue.
>>
>> What I would do is link to the website with target=_blank (or
>> perhaps use an iframe?) and give some good instructions on what we want, e.g.
>> "Please provide a brief description of this organization in English. The
>> description should start with a single summary sentence and be
>> approximately 1-2 paragraphs in length. Copying and pasting directly from
>> the website is fine".
>>
>> The one tweak I would make is to not allow them to submit a result if the
>> description is less than, say, 40 words, or more than, say, 250 words.
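
The length limit suggested above could be checked along these lines. This
is a minimal sketch assuming words are simply whitespace-separated tokens;
the function names are hypothetical, and the 40 and 250 bounds are the
"say" values from the message, not fixed requirements.

```python
MIN_WORDS = 40   # lower bound suggested in the thread
MAX_WORDS = 250  # upper bound suggested in the thread


def word_count(text: str) -> int:
    """Count whitespace-separated words in the text."""
    return len(text.split())


def is_acceptable_description(text: str) -> bool:
    """Return True when the description has between 40 and 250 words."""
    return MIN_WORDS <= word_count(text) <= MAX_WORDS
```

A submission form would call is_acceptable_description() before accepting
a result and otherwise ask the user to lengthen or shorten the text.
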
>>
>> Rufus
>>
>>
>>>
>>> What do you think?
>>>
>>> Cheers,
>>>
>>> Daniel
>>>
>>>
>>> On Tue, Oct 29, 2013 at 9:53 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>>>
>>>> Hi Daniel (and all others interested!),
>>>>
>>>> On PublicBodies.org many entries lack good descriptions. I think this
>>>> would be a great opportunity to use CrowdCrafting.org.
>>>>
>>>> As a starting point we could focus just on the EU, which has a
>>>> relatively limited number of bodies (just 129 at the moment):
>>>>
>>>> http://publicbodies.org/eu
>>>>
>>>> We already have URLs so all we would need is for the app to take that
>>>> URL and prompt users to enter a short description (1-2 paragraphs)
>>>> (probably copied from the source website). There's more in this issue here:
>>>>
>>>> https://github.com/okfn/publicbodies/issues/35
>>>>
>>>> What do you think?
>>>>
>>>> Rufus
>>>>
>>>>
>>>
>>>
>>> --
>>> Daniel Lombraña González <http://daniellombrana.es> :: Blog <http://daniellombrana.es/blog/>
>>> :: @teleyinex <https://twitter/teleyinex>
>>> Project Lead, Lead Developer :: Crowdcrafting.org :: PyBossa <http://dev.pybossa.com>
>>> Fellow :: The Shuttleworth Foundation <http://www.shuttleworthfoundation.org/fellows/daniel-lombrana/>
>>> Senior Researcher :: Citizen Cyberscience Centre <http://citizencyberscience.net>
>>>
>>>
>>
>>
>>
>> --
>>
>>
>> Rufus Pollock
>> Founder and Executive Director | skype: rufuspollock | @rufuspollock <https://twitter.com/rufuspollock>
>> The Open Knowledge Foundation <http://okfn.org/>
>> Empowering through Open Knowledge
>> http://okfn.org/ | @okfn <http://twitter.com/OKFN> | OKF on Facebook <https://www.facebook.com/OKFNetwork> | Blog <http://blog.okfn.org/> | Newsletter <http://okfn.org/about/newsletter>
>>
>>
>