[okfn-labs] public bodies project and documentation

Mon Jul 15 17:49:31 UTC 2013

I just created a pull request at
https://github.com/okfn/publicbodies/pull/42, which adds the Brazilian
federal public bodies of the executive branch.

Now I have even more questions about the meaning of the columns. I had
thought that "parent" and "parent_key" indicated that a body was
subordinate to, or hierarchically placed under, another body. But the
application adds the label "Parent category", which seems to imply
otherwise.
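
To make the hierarchical reading concrete, here is a minimal Python sketch
of what I assumed "parent_key" would allow. The "key" column name, the
helper name and the sample rows are made up for illustration; they are not
taken from the actual datasets.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the "key" column and all values are assumptions.
SAMPLE_CSV = """key,name,parent_key
br/ministry-a,Ministry A,
br/agency-b,Agency B,br/ministry-a
br/agency-c,Agency C,br/ministry-a
"""


def children_by_parent(csv_text):
    """Map each parent_key to the keys of the bodies directly under it."""
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        parent = row["parent_key"].strip()
        if parent:  # an empty parent_key is read here as "top-level body"
            children[parent].append(row["key"])
    return dict(children)


tree = children_by_parent(SAMPLE_CSV)
```

Under this reading, "tree" lists the two agencies under the ministry. If
"parent" is really meant as a parent category, the same column would need
a different interpretation.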

And I still don't get how categories are supposed to work, as the datasets
lack any examples whatsoever of categories.

Best,
Augusto Herrmann

On Wed, Jul 10, 2013 at 3:17 PM, Augusto Herrmann <
augusto.herrmann at gmail.com> wrote:

> Hi, all!
>
> I've seen the publicbodies.org project; it seems cool, and the site
> directs discussion to this list. However, browsing through the list
> archives, I couldn't find any discussions about it.
>
> Anyway, what I would like to ask and discuss here are the types and
> meanings of each column [1] in the csv files. I think a description of each
> of them should be provided as project documentation. Below are some
> questions about some of those fields.
>
> 1) "updated"
> Which format should the "updated" field use? ISO 8601?
> Does it mean the date and time when the data values last actually changed,
> or just the moment when the data was last checked against the original
> source?
>
> 2) "slug"
> Should the slug be generated specifically for this project when there is
> no existing official one? Are there any guidelines on how to do so (list
> of allowed characters, character substitution rules, what to do with
> accented characters, and so on)?
>
> 3) "category"
> Should this be an officially labeled category, if one exists? Or is it a
> categorization effort specific to this project? Is there one such list of
> categories to choose from, and if so, where is it?
> Should we use numerical codes or textual descriptions of categories?
> I've looked into the current csv data searching for examples, but the
> fields seem to be empty so far.
>
> 4) "jurisdiction" and "jurisdiction_code"
> What should go into this?
>
> 5) "address"
> How to encode this? Street names, city, etc., all conflated into a single
> string? Also, there's no field for a postal code.
>
> 6) "contact"
> I'm guessing this is the name of the contact that the "email" field
> corresponds to (either a department within the public body's structure
> or a person).
>
> 7) "tags"
> How to set those?
>
> Besides, I think it would be useful to include somewhere in the repository
> [2] the scripts that have been used to extract the information from the
> available open data source into the csvs. That way, the data can easily be
> updated again by anyone by just running the scripts. How should we name the
> directory where those scripts would go? "scripts"? "import"?
>
> Last but not least, some good news. I've been working on a script to load
> the Brazilian federal government's organizational structure (4910 public
> bodies) into this dataset.
>
> [1] https://github.com/okfn/publicbodies#building-the-sqlite-db
> [2] https://github.com/okfn/publicbodies
>
> Best regards,
> Augusto Herrmann
>
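
As a starting point for discussing questions 1) and 2) above, here is a
minimal Python sketch of one possible convention. It is not an existing
project guideline: the function names are hypothetical, and the rules
(ASCII-fold accented characters, lowercase, hyphens between words, UTC
timestamps in ISO 8601) are assumptions put forward for discussion.

```python
import re
import unicodedata
from datetime import datetime, timezone


def make_slug(name):
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Lowercase and collapse every run of other characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


def iso_timestamp(moment):
    # Render a timezone-aware datetime as an ISO 8601 string in UTC.
    return moment.astimezone(timezone.utc).isoformat()


slug = make_slug("Ministério da Educação")
stamp = iso_timestamp(datetime(2013, 7, 15, 17, 49, 31, tzinfo=timezone.utc))
```

With these rules the example name becomes "ministerio-da-educacao". Whether
dropping accents is acceptable, rather than transliterating or keeping
them, is exactly the kind of decision the documentation should state.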