[ckan-discuss] Wrong type mapping
David Raznick
kindly at gmail.com
Sun Jun 3 22:34:36 BST 2012
Hello,
I had a quick look at your file.
You were correct that the "," characters were throwing off the type
inference. You seem to have missed a few of them, though, at around
line 370 in your file.
For example: http://thedatahub.org/api/data/3b961dcf-f2fc-4425-8c07-159a58557bc9/369
I removed the last few commas from the file and tried to upload it
again, and it seems to have worked.
http://thedatahub.org/dataset/copa-2014/resource/ad90de8a-17c7-4576-a6a5-cc8c68b61f89
http://thedatahub.org/api/data/ad90de8a-17c7-4576-a6a5-cc8c68b61f89/_mapping
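If it helps, stray values like these can be located before uploading. The
following is only a rough sketch in Python, not anything CKAN itself runs:
the function name, the ";" delimiter and the "Etapa" column in the sample
are assumptions made for illustration. It lists every value in the chosen
columns that is not a plain number with "." as the decimal separator.

```python
import csv
import io
import re

# A plain decimal number using "." as the separator, e.g. 84.55668315
NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def find_non_numeric_rows(csv_text, columns, delimiter=","):
    """Return (row_number, column, value) for each value that is not a plain number."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    # Row 1 is the header, so the first data row is row 2.
    for row_number, row in enumerate(reader, start=2):
        for column in columns:
            value = (row.get(column) or "").strip()
            # Empty cells are skipped; anything else must match NUMERIC entirely.
            if value and not NUMERIC.fullmatch(value):
                problems.append((row_number, column, value))
    return problems


# Hypothetical two-row sample; the real file has many more columns.
sample = (
    "Etapa;Investimento-Previsto-para-a-Etapa\n"
    "A;84.55668315\n"
    "B;84,55668315\n"
)
print(find_non_numeric_rows(sample, ["Investimento-Previsto-para-a-Etapa"], delimiter=";"))
```

An empty result means every checked value looks numeric, so the column at
least has a chance of being inferred as a number.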
We have to be very strict about our inference, or Elasticsearch will
blow up on us. We fall back to text whenever there is any
uncertainty.
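To give an idea of what "strict" means here, this is a simplified sketch in
Python. It is not the actual CKAN inference code; the function name and the
exact rule (every non-empty value must be a plain decimal number) are
assumptions for illustration only.

```python
import re

# A plain decimal number using "." as the separator, e.g. 84.55668315
NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def infer_column_type(values):
    """Return "float" only if every non-empty value is a plain number, else "string"."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        # Nothing to go on, so stay with the safe choice.
        return "string"
    if all(NUMERIC.fullmatch(v) for v in non_empty):
        return "float"
    # A single doubtful value is enough to fall back to text.
    return "string"


print(infer_column_type(["84.55668315", "12"]))           # all values numeric
print(infer_column_type(["84.55668315", "84,55668315"]))  # one decimal comma left
```

Under this rule a single leftover "84,55668315" is enough to make the whole
column a string, which matches what you saw.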
Thanks
David
On Sun, Jun 3, 2012 at 9:39 PM, Alexandre Gomes <alegomes at gmail.com> wrote:
> How can I force CKAN to map a CSV data column to numeric (float or double)
> instead of string? I think this is the cause of the error below when trying
> to use a statistical facet on ElasticSearch [0].
>
> {
>
> "error" : "SearchPhaseExecutionException[Failed to execute phase [query],
> total failure; shardFailures
> {[xyJEmo2iRhuhB75ZsN6kWQ][ckan-www.ckan.net][3]:
> RemoteTransportException[[du Paris,
> Bennet][inet[/193.34.146.144:9300]][search/phase/query]]; nested:
> QueryPhaseExecutionException[[ckan-www.ckan.net][3]:
> query[filtered(ConstantScore(*:*))->FilterCacheFilterWrapper(_type:3b961dcf-f2fc-4425-8c07-159a58557bc9)],from[0],size[10]:
> Query Failed [Failed to execute main query]]; nested:
> ClassCastException[org.elasticsearch.index.field.data.strings.SingleValueStringFieldData
> cannot be cast to org.elasticsearch.index.field.data.NumericFieldData];
> }{[dHahbQPkR2SgYA8Mp5JbBQ][ckan-www.ckan.net][4]:
> QueryPhaseExecutionException[[ckan-www.ckan.net][4]:
> query[filtered(ConstantScore(*:*))->FilterCacheFilterWrapper(_type:3b961dcf-f2fc-4425-8c07-159a58557bc9)],from[0],size[10]:
> Query Failed [Failed to execute main query]]; nested:
> ClassCastException[org.elasticsearch.index.field.data.strings.SingleValueStringFieldData
> cannot be cast to org.elasticsearch.index.field.data.NumericFieldData]; }{[
>
>
> First, I uploaded a new CSV resource [1] to the World Cup 2014 dataset [2]
> and the currency fields were mapped as strings [3]:
>
> "Investimento-Previsto-para-a-Etapa" : {
> "type" : "string"
> },
> (...)
> "Investimento-Contratado-para-a-Etapa" : {
> "type" : "string"
> },
> "Investimento-Executado-para-a-Etapa" : {
> "type" : "string"
> },
>
>
> Then, suspecting that the use of a comma as the decimal separator (i.e.
> 84,55668315) could be misleading CKAN's type inference, I re-submitted the
> CSV file as a new resource [4] with the number format fixed (84,55668315
> to 84.55668315), but the wrong type mapping persisted [5].
>
> "Investimento-Previsto-para-a-Etapa" : {
> "type" : "string"
> },
> (...)
> "Investimento-Contratado-para-a-Etapa" : {
> "type" : "string"
> },
> "Investimento-Executado-para-a-Etapa" : {
> "type" : "string"
> },
>
>
> So, I tried to use the "Transform" option available at the data table column
> action button [4], using the script below
>
> function(doc) {
>   doc['Investimento-Previsto-para-a-Etapa'] =
>     parseFloat(doc['Investimento-Previsto-para-a-Etapa']);
>   return doc;
> }
>
> but, after a while waiting for the message "Updating all visible docs. This
> could take a while..." to disappear, an alert message showed up saying
> something like "We have only updated the docs in this view. Update of all
> docs not yet implemented".
>
> Any ideas on how to get those three fields mapped as numbers?
>
> thanks
>
> [0] http://thedatahub.org/api/data/3b961dcf-f2fc-4425-8c07-159a58557bc9/_search?pretty=true&source={%22query%22:{%22match_all%22:{}},%22facets%22:{%22totais%22:{%22statistical%22:{%22field%22:%22Investimento-Previsto-para-a-Etapa%22}}}}
> [1] http://thedatahub.org/dataset/copa-2014/resource/075de5b0-19ba-45fb-bfaa-603a78c47d45
> [2] http://thedatahub.org/dataset/copa-2014
> [3] http://thedatahub.org/api/data/075de5b0-19ba-45fb-bfaa-603a78c47d45/_mapping?pretty=true
> [4] http://thedatahub.org/dataset/copa-2014/resource/3b961dcf-f2fc-4425-8c07-159a58557bc9
> [5] http://thedatahub.org/api/data/3b961dcf-f2fc-4425-8c07-159a58557bc9/_mapping?pretty=true
>
> http://www.elasticsearch.org/guide/reference/mapping/core-types.html
>
>
> []s!