[ckan-dev] Data viewer oddities

Mark Wainwright mark.wainwright at okfn.org
Tue Apr 17 08:08:14 UTC 2012


I would say this is not so much poor UX as a full blown bug :-)

However terrible or lacking the type information, and no matter what
order the points are in, when I plot __id__ against __id__ each point
should lie on the line x=y. Even if it is just using the index (in a
random order) for the x-axis, it should also be using the index for
the y-axis! The fact that it isn't doing so shows that it does know
*something* about how to order the values, in which case it should use
that info on the x-axis too.
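
To make the inconsistency concrete, here is a minimal Python sketch of the behaviour as it appears from the outside. It is not the viewer's or Flot's actual code: the function name and the sample rows are made up, and it simply assumes that x is replaced by the row index while y uses the parsed value.

```python
def points_as_plotted(rows, x_field, y_field):
    """Hypothetical model of the observed behaviour.

    x_field is deliberately ignored: the x value is the row index,
    while the y value is the numeric value of y_field.
    """
    return [(i, float(row[y_field])) for i, row in enumerate(rows)]


# Made-up rows, returned unsorted, with every value stored as a string.
rows = [{"__id__": "3"}, {"__id__": "1"}, {"__id__": "2"}]

plotted = points_as_plotted(rows, "__id__", "__id__")

# True only if every point lies on the line x=y.
on_diagonal = all(x == y for x, y in plotted)
```

Under that assumption the points for __id__ against __id__ do not lie on x=y, which matches the wiggly graph.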

But anyway, using the index should never, never happen, because it
gives extremely weird behaviour - people are entitled to assume at
minimum that if they ask for a graph of price against date, the
behaviour is as expected (which it totally isn't!)

I strongly agree that better type casting should happen at various
times, but I also suggest that

* the graphing should be consistent (e.g. between axes) about using
whatever type info it has
* if it can't interpret a field, it should give an error, rather than
silently replacing it with the index and giving unexpected results.
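
As a rough sketch of those two points (Python, purely illustrative; `to_number` and `series` are hypothetical names, not anything in the viewer or Flot), the same casting rule is applied to both axes and an uninterpretable value raises an error instead of falling back to the index:

```python
def to_number(value, field):
    """Cast a raw value to float, or fail loudly, naming the field."""
    try:
        return float(value)
    except (TypeError, ValueError):
        raise ValueError(
            f"cannot interpret field {field!r} value {value!r} as a number"
        )


def series(rows, x_field, y_field):
    """Build (x, y) points using the same casting rule for both axes."""
    points = [
        (to_number(row[x_field], x_field), to_number(row[y_field], y_field))
        for row in rows
    ]
    # Order by x so the line is drawn left to right.
    return sorted(points)
```

With this, plotting __id__ against __id__ puts every point on x=y whatever order the rows come back in. A field such as 'date' would need its own parser, which this sketch does not attempt; here it would produce an error rather than a misleading graph.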

Mark




On 16 April 2012 13:27, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> On 16 April 2012 10:14, Mark Wainwright <mark.wainwright at okfn.org> wrote:
>> There seem to be a few bugs in the data viewer, which I had a quick
>> play with when Jonathan Gray asked via Twitter: "how can I reload a
>> graph? E.g. if I want to see date on the x axis for the gold example?
>> http://thedatahub.org/dataset/gold-prices"
>>
>> (i) I believe the answer to J's question should be that it
>> automatically reloads when you choose a new variable, and I believe
>> also that it actually does this; but I suspect that Jonathan is not
>> seeing it happen because the labelling of the axes is completely
>> wrong, and doesn't change when he changes the variable from '__id__'
>> to 'date' (the shape of the graph is the same).
>>
>> But besides the axis-labelling problems, there are other problems with this.
>>
>> (ii) Try plotting 'date' against '__id__'. The graph should be a
>> straight line, but it is blank.
>>
>> (iii) Most strangely, try plotting __id__ against __id__. The result
>> is a graph that wiggles up and down in a mystifying way - it is not
>> the prices data, but nor is it the straight line one would naturally
>> expect!
>
> Have you *sorted* the data? By default the data does not come back in
> sorted form. This combined with a "pseudo-bug" regarding the handling
> of numeric data which is in string form will result in all the
> behaviour you mention:
>
> * Because the data was imported as a CSV all values were interpreted
> as *strings* when I uploaded the data
> * As such, when plotting the graph Flot will simply use the index of
> the data point as the x-axis value (because it does not know the
> data is numeric)
> * Thus your point 1 with, say, values __id__ = 500, date=.... will get
> plotted at x-axis value 1 even if you choose x-axis to be __id__ or
> date
> * Because the data is not sorted you won't get a straight line even
> when plotting against __id__
>
> The fix for this, which is definitely poor UX, would be:
>
> * Ability to configure a default sort field
> * Better casting of data upon upload (so numeric data in csv would get
> recognized as such ...) - use of mapping in elasticsearch
> * Better casting of data in flot graph
>
> Rufus



-- 
Mark Wainwright, CKAN Community Co-ordinator
Open Knowledge Foundation http://okfn.org/
Skype: m.wainwright



