[OpenSpending-discuss] Anyone willing to help with getting basic data into openspending?

Lucy Chambers lucy.chambers at okfn.org
Thu Nov 3 18:41:36 UTC 2011


Hi Michal,

I can tackle a couple of these points here and now, then I'll hand over to
the technical team for deeper help.

> http://wiki.openspending.org/Data_Format
> Section Required data says about Date: "This is generally expected to be a
> single year.", but the next section Column formatting says: "Dates have to
> be in the format YYYY-MM-DD" So, is it "2010" or "2010-12-31", or both?
>
OK, this was unclear. I have changed the wording to:

"In most datasets we have seen, this is a single year, but other timespans
are possible. If your budgeting period spans across several years, choose
the year in which it begins. See further information on date
columns <http://wiki.openspending.org/Data_Format#Date_columns>"

Essentially, either option is possible.
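
As a rough illustration of accepting both forms before upload, here is a minimal Python sketch. It is a pre-processing idea only, not the loader's own code; the function name normalise_date and the choice to map a bare year to 1 January are assumptions made for this example.

```python
from datetime import date, datetime


def normalise_date(value: str) -> date:
    """Accept either 'YYYY' or 'YYYY-MM-DD' and return a date object.

    Assumption for this sketch: a bare year is mapped to 1 January
    of that year. How the real loader treats a bare year is not
    stated in this thread.
    """
    value = value.strip()
    if len(value) == 4 and value.isdigit():
        return date(int(value), 1, 1)
    # Raises ValueError if the value is not in YYYY-MM-DD form.
    return datetime.strptime(value, "%Y-%m-%d").date()


if __name__ == "__main__":
    for raw in ("2010", "2010-12-31"):
        print(raw, "->", normalise_date(raw).isoformat())
```

Running every date cell through one such function before upload keeps a CSV file internally consistent, whichever form it started in.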


> http://wiki.openspending.org/Model_Format
> (that's the part where I begin to be lost)
> Section Model format
> "The model must be a valid JSON file, which contains a JSON object with
> "dataset" at its root "
> But the example shows 'dataset', 'mapping' and 'views' at the same level.
>

@team - can you clarify?

>
> Section Dataset metadata: I have realized just now, during 3rd reading,
> that there is an array required for unique_keys. A simple example below the
> definition would be great. (It might be a trouble that I am not used to
> json-s (I generate them from php), so one overlooks something easily.)
>

@pudo / Martin -  in Michal's data, the unique_key column is an integer
sequence. I thought this was accepted, but if it isn't, we should explain
why in the docs... could we give a useful example?

@Michal, to my knowledge this is accepted. However, it may be better in the
long run to go for:

either unique_keys = ["year", "unique_id_in_year"]

or just a unique_keys = ["unique_id"] with unique ids of "2009_01",
"2009_02", etc.

That way, if you update the data later, the system will be able to tell which
records are already stored in the database.
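
To make the two options concrete, here is a small Python sketch, assuming rows have already been read from the CSV as dictionaries. The helper names (add_unique_id, find_duplicates) and the sample rows are made up for illustration; only the column names "year", "unique_id_in_year" and "unique_id" come from the suggestion above.

```python
import json

# Hypothetical sample rows, as they might come out of csv.DictReader.
rows = [
    {"year": "2009", "unique_id_in_year": "01", "amount": "100"},
    {"year": "2009", "unique_id_in_year": "02", "amount": "250"},
    {"year": "2010", "unique_id_in_year": "01", "amount": "300"},
]


def add_unique_id(rows):
    """Return copies of the rows with a combined id such as '2009_01'."""
    result = []
    for row in rows:
        new_row = dict(row)
        new_row["unique_id"] = f"{row['year']}_{row['unique_id_in_year']}"
        result.append(new_row)
    return result


def find_duplicates(rows, keys):
    """Return the key tuples that occur more than once for the given columns."""
    seen = set()
    duplicates = []
    for row in rows:
        key = tuple(row[column] for column in keys)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


if __name__ == "__main__":
    # Option 1: a composite key made of two columns.
    print(json.dumps({"unique_keys": ["year", "unique_id_in_year"]}))
    print(find_duplicates(rows, ["year", "unique_id_in_year"]))

    # Option 2: a single combined column.
    print(json.dumps({"unique_keys": ["unique_id"]}))
    print(find_duplicates(add_unique_id(rows), ["unique_id"]))
```

Either way, unique_keys is a JSON array of column names, even when it holds only one name. An empty duplicates list is a quick sign that the chosen columns really do identify each record.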

>
> Section View definitions: I do not get this at all. Paragraph 'To explain'
> is an example, however the sample code is different - includes names
> 'function', 'subfunction', which I cannot imagine what they shall mean in
> this context; breakdown and filters - no idea how to construct them from my
> data.
>

@team, could you deal with this one?


>
> http://wiki.openspending.org/Mapping_Format
> (that's the part I get really lost)
> Section Mappings
> "There exist four mandatory dimensions", but in
> http://wiki.openspending.org/Data_Format,
> section Required data, it says only two columns are required: Date and
> Amount (how can it be that 'to' and 'from' are not required in data
> format/csv file?)
>

I have now updated the Data_Format page: there is a minimum of four
columns, you are correct.
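
A quick way to check a CSV file for the four columns before uploading could look like the Python sketch below. The header names in REQUIRED ("date", "amount", "from", "to") are hypothetical and stand in for whatever your own file calls them; the function missing_columns is likewise invented for this example and is not part of OpenSpending.

```python
import csv
import io

# Hypothetical header names for the four mandatory columns.
# Replace them with the headings actually used in your file.
REQUIRED = ("date", "amount", "from", "to")


def missing_columns(csv_text, required=REQUIRED):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    present = {name.strip().lower() for name in header}
    return [name for name in required if name not in present]


if __name__ == "__main__":
    sample = "date,amount,chapter\n2010,1500,Education\n"
    print(missing_columns(sample))
```

For the sample above, the check reports that the "from" and "to" columns are still missing.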


> What information is required about any column in the db? The example
> shows 6 of them (type, description, label, datatype, default_value, column).
>

@team


> What is "default_value" for, if http://wiki.openspending.org/Data_Format,
> section Columns, says "each cell below the heading must be non-empty"?
>

The default value is simply a placeholder that is inserted when every entry
in a column would be the same. For example, if all your recipients are
'society' (say, your original dataset does not contain a 'to' field), you
can tell the system to presume the 'to' column is full of 'society'
entries. You may not need this; it depends on your data.
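
The effect can be pictured with a short Python sketch. This is only an illustration of the idea, not how the loader is implemented; the function apply_default and the sample rows are invented for the example.

```python
def apply_default(rows, column, default_value):
    """Return copies of the rows with `column` filled in where it is missing or empty."""
    filled = []
    for row in rows:
        new_row = dict(row)
        if not new_row.get(column):
            new_row[column] = default_value
        filled.append(new_row)
    return filled


if __name__ == "__main__":
    # Hypothetical rows from a dataset that has no 'to' field at all.
    rows = [
        {"from": "Ministry A", "amount": "100"},
        {"from": "Ministry B", "amount": "250"},
    ]
    for row in apply_default(rows, "to", "society"):
        print(row)
```

Every row ends up with 'to' set to 'society', which is what the default value stands in for when the source data has nothing to put there.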


> Section Types->Column types: what is the type for "year" (e.g. value
> "2010") - float or string (or date?) ? As there is no "integer" there.
> What is 'currency' for, if the currency (e.g. 'CZK') is specified in
> "dataset" part of the model.


@team


> Section Field descriptions
> What are fields? They come from nowhere, there are no fields in the first
> main example (="Overall, a mapping resembles the following") What is the
> difference between 'column' and 'fields', there are 'column's in some
> examples and 'fields' in others.
>

@team


> This is only partly about the documentation, but the process continues:
> http://etl.sandbox.openspending.org/load/preflight/aris-test-2010
> Why are there 3 URLs? Are all three required?
> http://wiki.openspending.org/Preparing_Datasets, section Upload the data and
> the model to CKAN, says "OpenSpending expects your CKAN package to reference
> (at least) two files", and the whole documentation speaks only about these
> two files (with 'mapping' required in the json file, see
> http://wiki.openspending.org/Model_Format, section Model format). What about
> the "model:mapping URL"? It is just confusing.
>

You are totally right. We are working on simplifying this process, so
hopefully it will not be like this for long.

>
> What I am really missing:
> One single simple example which would be used consistently through the
> whole documentation (e.g.,
> http://sandbox.openspending.org/dataset/openspending-example,
> I'd prefer to have two years, not one)
>
We will try and find you one!


> And for my own dataset:
> I get: These errors were found when attempting to validate your model:
>  - 'model.mapping.time' field had error 'Required'
> And the views (their definition) make no sense to me so far (how to define
> the hierarchy 'chapter'->'organization' as in
> http://test.kohovolit.sk/m/chart_3.html)
>

Thanks very much for this Michal, I know it must have taken you ages, but
it is really useful to see what is not obvious. I'll grab the team first
thing in the morning to see if I can get them to spruce up the docs and
we'll be back to you asap!

All the best,

Lucy


> Best,
> Michal
>
>
>  We have some brilliant data-wranglers on this list, could you outline
>> how far you have got and where the sticking point is?
>>
>> A quick glance at your data suggests it is in the right format and
>> pretty clean, so this shouldn't take too much effort!
>>
>> All the best,
>>
>> Lucy
>>
>>
>> On Thu, Nov 3, 2011 at 3:34 AM, M.Skop KohoVolit.eu
>> <michal.skop at kohovolit.eu>  wrote:
>>
>>> Hi,
>>>
>>> we would like to use OS for very detailed Czech budget data, but I am
>>> fighting with the simple task of getting basic trial data into OS. (I must
>>> confess that the documentation is often unclear to me; this was my 2nd
>>> attempt, with different data, and still no results.)
>>>
>>> Isn't there anybody willing to help us with the task? I believe that if I
>>> can get a simple dataset running, it will be the same for the more complex
>>> one.
>>>
>>> What I want to achieve by this trial is shown here (the same data):
>>> http://test.kohovolit.sk/m/chart_3.html
>>> One year, only a simple hierarchy 'chapter'(=group of organizations) ->
>>> 'organization', nothing more for the beginning.
>>>
>>> The data is here:
>>> http://thedatahub.org/dataset/aris-test-2010
>>> https://raw.github.com/michalskop/BudovaniStatu.cz/master/dev/os1.csv
>>> https://raw.github.com/michalskop/BudovaniStatu.cz/master/dev/os1.json
>>> (which is probably the trouble)
>>>
>>> Thanks a lot,
>>> Michal
>>>
>>> --
>>> Mgr. Michal Škop, Ph.D.
>>> KohoVolit.eu
>>> michal.skop at kohovolit.eu
>>> +420 775 187 021
>>>
>>>
>>> _______________________________________________
>>> wdmmg-discuss mailing list
>>> wdmmg-discuss at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/wdmmg-discuss
>>>
>>>
>>
>>
>
> --
> Mgr. Michal Škop, Ph.D.
> KohoVolit.eu
> michal.skop at kohovolit.eu
> +420 775 187 021
>
>


-- 
Lucy Chambers
Community Coordinator
Open Knowledge Foundation
http://okfn.org/
Skype: lucyfediachambers