[okfn-labs] public bodies project and documentation
rufus.pollock at okfn.org
Mon Jul 15 19:11:09 UTC 2013
Sorry for the slow reply here!
On 10 July 2013 19:17, Augusto Herrmann <augusto.herrmann at gmail.com> wrote:
> Hi, all!
> I've seen the publicbodies.org project, seems cool, and the site invites discussions to this list. However, browsing through the list archives, I couldn't find any discussions about it.
There have been a few ;-)
> Anyway, what I would like to ask and discuss here is the types and meanings of each column in the csv files. I think a description of each of them should be provided as project documentation. Below, I have some questions regarding some of those fields.
I note there is an issue about reworking and documenting the schema
You are quite right that this is inadequate at present. I've just
added a basic description for the current setup, see the updated
And a more readable version:
> 1) "updated"
> The "updated" field should be in which format? ISO 8601?
> Does it mean the date and time when the data values actually last changed, or just the moment when the data was last checked for conformance with the original source?
Yes, this should be ISO 8601, but frankly I think we should deprecate
this field. (People will forget to update it ...)
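As a rough illustration of what "should be ISO 8601" could mean in practice, here is a minimal sketch using only Python's standard library. The function names is_iso8601 and now_iso8601 are made up for this example and are not part of the publicbodies code.

```python
from datetime import datetime, timezone


def is_iso8601(value):
    """Return True if value parses as an ISO 8601 date or datetime.

    Note: datetime.fromisoformat accepts only a subset of ISO 8601 on
    Python versions before 3.11 (for example, it rejects a trailing "Z").
    """
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def now_iso8601():
    """Return the current UTC time as an ISO 8601 string, to the second."""
    return datetime.now(timezone.utc).replace(microsecond=0).isoformat()


if __name__ == "__main__":
    for candidate in ["2013-07-15T19:11:09+00:00", "2013-07-15", "15/07/2013"]:
        print(candidate, is_iso8601(candidate))
    print(now_iso8601())
```

A check like this only validates the format; it does not address the concern that people will forget to update the value.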
> 2) "slug"
> Should the slug be generated specifically for this project, in case there is not an existing official one? Are there any guidelines for how to do so (list of allowed characters, character substitution rules, what to do with accented chars, and so on)?
Slug is not an official field, I believe. Instead, there is "key".
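For the question about allowed characters and accented characters, one possible rule set can be sketched as follows. This is only an assumption for illustration (lowercase ASCII letters, digits and hyphens, with accents stripped); the project does not define these rules in this thread, and make_slug is a hypothetical helper name.

```python
import re
import unicodedata


def make_slug(name):
    """Derive a lowercase ASCII identifier from a public body's name.

    Assumed rules (not an official convention): strip accents, lowercase,
    and replace every run of other characters with a single hyphen.
    """
    # Split accented characters into base letter + combining mark,
    # then drop everything that is not plain ASCII (the marks).
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_only.lower())
    return slug.strip("-")


if __name__ == "__main__":
    print(make_slug("Ministério da Educação"))
```

Characters with no ASCII decomposition are simply dropped by this approach, so names in non-Latin scripts would need a different rule.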
> 3) "category"
> Should this be an officially labeled category, if one exists? Or is it a categorization effort specific to this project? Is there one such list of categories to choose from, and if so, where is it?
> Should we use numerical codes or textual descriptions of categories?
> I've looked into the current csv data searching for examples, but the fields seem to be empty so far.
Good question. I think this field should behave like (and probably be
renamed to) classification in the popolo org spec
http://popoloproject.com/specs/organization.html However, AFAICT that
does not specify a recommended list of categories to use.
> 4) "jurisdiction" and "jurisdiction_code"
> What should go into this?
See updated description.
> 5) "address"
> How to encode this? Street names, city, etc., all conflated into a single string? Also, there's no field for a postal code.
Just one long string. I'd suggest structuring it as one would read it on
a site (i.e. add line terminators). (Need to add this to the description.)
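A single address string with embedded line terminators survives a CSV round trip as long as the field is quoted, which Python's csv module does automatically. The sketch below assumes a cut-down set of columns ("key" and "address") and an invented example address; write_rows and read_rows are hypothetical helper names.

```python
import csv
import io


def write_rows(rows, fieldnames):
    """Serialize rows to CSV text; fields containing newlines get quoted."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_rows(text):
    """Parse CSV text back into a list of dicts, keeping embedded newlines."""
    return list(csv.DictReader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    # Invented example record; not taken from the real dataset.
    row = {
        "key": "example-ministry",
        "address": "1 Example Street\nExample City\nExample Country",
    }
    text = write_rows([row], ["key", "address"])
    print(text)
    print(read_rows(text)[0]["address"] == row["address"])
```

When working with real files rather than in-memory text, opening them with newline="" (as the csv module documentation recommends) avoids the embedded line breaks being altered.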
> 6) "contact"
> I'm guessing this is the name of the contact to which the "email" field corresponds (whether the name of a department within the public body's structure or a person).
> 7) "tags"
> How to set those?
I think we should deprecate tags.
> Besides, I think it would be useful to include somewhere in the repository the scripts that have been used to extract the information from the available open data source into the csvs. That way, the data can easily be updated again by anyone by just running the scripts. How should we name the directory where those scripts would go? "scripts"? "import"?
Huge +1 on this point. Can you open an issue for this?
> Last but not least, some good news. I've been working on a script to load the Brazilian federal government's organizational structure (4910 public bodies) into this dataset.
>  https://github.com/okfn/publicbodies#building-the-sqlite-db
>  https://github.com/okfn/publicbodies
> Best regards,
> Augusto Herrmann
Founder and Co-Director | skype: rufuspollock | @rufuspollock
The Open Knowledge Foundation
Empowering through Open Knowledge
http://okfn.org/ | @okfn