[ckan-dev] Data viewer oddities

Rufus Pollock rufus.pollock at okfn.org
Mon Apr 16 12:27:33 UTC 2012


On 16 April 2012 10:14, Mark Wainwright <mark.wainwright at okfn.org> wrote:
> There seem to be a few bugs in the data viewer, which I had a quick
> play with when Jonathan Gray asked via Twitter: "how can I reload a
> graph? E.g. if I want to see date on the x axis for the gold example?
> http://thedatahub.org/dataset/gold-prices"
>
> (i) I believe the answer to J's question should be that it
> automatically reloads when you choose a new variable, and I believe
> also that it actually does this; but I suspect that Jonathan is not
> seeing it happen because the labelling of the axes is completely
> wrong, and doesn't change when he changes the variable from '__id__'
> to 'date' (the shape of the graph is the same).
>
> But besides the axis-labelling problems, there are other problems with this.
>
> (ii) Try plotting 'date' against '__id__'. The graph should be a
> straight line, but it is blank.
>
> (iii) Most strangely, try plotting __id__ against __id__. The result
> is a graph that wiggles up and down in a mystifying way - it is not
> the prices data, but nor is it the straight line one would naturally
> expect!

Have you *sorted* the data? By default the data does not come back in
sorted form. This combined with a "pseudo-bug" regarding the handling
of numeric data which is in string form will result in all the
behaviour you mention:

* Because the data was imported as a CSV all values were interpreted
as *strings* when I uploaded the data
* As such, when plotting the graph Flot will simply use the index of
the data point as the x-axis value (because it does not know the
data is numeric)
* Thus your point 1 with, say, values __id__ = 500, date=.... will get
plotted at x-axis value 1 even if you choose the x-axis to be __id__
or date
* Because the data is not sorted you won't get a straight line even
when plotting against __id__
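
To make the effect concrete, here is a rough Python sketch of the
behaviour described above. It is not the viewer's actual code: the
rows, the field values and both helper functions are made up for
illustration, and it only assumes that x falls back to the row
position when values arrive as strings.

```python
# Hypothetical rows as they might come back: unsorted, every value a string.
rows = [
    {"__id__": "3", "price": "35.2"},
    {"__id__": "1", "price": "34.7"},
    {"__id__": "2", "price": "34.9"},
]


def points_using_index(rows, y_field):
    """Mimic the described fallback: x is the row position (from 1),
    whatever field was chosen for the x-axis."""
    return [(i, float(row[y_field])) for i, row in enumerate(rows, start=1)]


def points_cast_and_sorted(rows, x_field, y_field):
    """Cast both fields to numbers, then order the points by x."""
    return sorted((float(r[x_field]), float(r[y_field])) for r in rows)


# __id__ against __id__ with the fallback: the y values jump around.
wiggly = points_using_index(rows, "__id__")

# The same plot after casting and sorting: a straight line.
straight = points_cast_and_sorted(rows, "__id__", "__id__")
```

With the fallback, `wiggly` follows the order the rows arrived in, which
is why the line goes up and down; `straight` has x equal to y at every
point.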

This is definitely poor UX. The fix would be:

* Ability to configure a default sort field
* Better casting of data upon upload (so numeric data in csv would get
recognized as such ...) - use of mapping in elasticsearch
* Better casting of data in flot graph
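
As a minimal sketch of the first two items (casting on upload plus a
default sort field), something along these lines could work. The
function names, the `sort_field` parameter and the sample CSV are all
hypothetical; this is plain Python and does not touch the
elasticsearch mapping or Flot themselves.

```python
import csv
import io


def cast_value(text):
    """Return an int or float when the text parses as a number,
    otherwise return the original string unchanged."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return text


def load_rows(csv_text, sort_field=None):
    """Parse CSV text, cast numeric-looking cells, optionally sort.

    Sorting assumes every value in sort_field ends up with the same
    type; mixed strings and numbers would raise a TypeError.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{key: cast_value(value) for key, value in record.items()}
            for record in reader]
    if sort_field is not None:
        rows.sort(key=lambda row: row[sort_field])
    return rows


# Made-up sample data, deliberately out of order.
sample = "date,price\n1950-02,2.5\n1950-01,1.5\n"
rows = load_rows(sample, sort_field="date")
```

Here the `date` column stays a string (it does not parse as a number)
while `price` becomes a float, and the rows come back ordered by date.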

Rufus



