[openspending-dev] Error trying to use a composite classifier in unique_keys

David Cabo david.cabo at gmail.com
Mon Nov 28 21:46:22 UTC 2011


 Hi,

 I did, but I retried just now to be sure. It fails with (full trace attached):

File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/loader.py", line 173, in _entry_unique_values
    raise KeyError("Unique key %s missing from entry: %s" % (k, entry))
KeyError: u"Unique key programme.id missing from entry

 I also tried programme.name, just in case, and that raises the SHA error:

File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/util.py", line 76, in <genexpr>
    return sha1(''.join(sha1(val).hexdigest() for val in iterable)).hexdigest()
TypeError: sha1() argument 1 must be string or read-only buffer, not dict
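
 To illustrate the two errors, here is a minimal sketch of resolving dotted unique keys to plain strings before hashing. It is an assumption about how this could work, not the actual openspending.etl code: it is written for Python 3 (the traces are from Python 2), and resolve_key, the sample entry, and the "time.unparsed" key are hypothetical names used only for this example.

```python
from hashlib import sha1


def resolve_key(entry, dotted_key):
    """Follow a dotted key such as 'programme.name' into nested dicts."""
    value = entry
    for part in dotted_key.split("."):
        if not isinstance(value, dict) or part not in value:
            raise KeyError("Unique key %s missing from entry" % dotted_key)
        value = value[part]
    return value


def hash_values(values):
    """Hash each value, then hash the concatenated hex digests."""
    digests = []
    for val in values:
        # A composite classifier arrives as a dict; reject it explicitly
        # instead of letting sha1() fail on it.
        if not isinstance(val, str):
            raise TypeError("unique value must be a string, not %s"
                            % type(val).__name__)
        digests.append(sha1(val.encode("utf-8")).hexdigest())
    return sha1("".join(digests).encode("utf-8")).hexdigest()


# Hypothetical entry, shaped loosely like the one in the attached trace.
entry = {
    "time": {"unparsed": "2011"},
    "programme": {"name": "141m", "label": "Direccion y Servicios Generales"},
}
unique_keys = ["time.unparsed", "programme.name"]
entry_id = hash_values(resolve_key(entry, k) for k in unique_keys)
```

 In this sketch "programme.id" raises a KeyError because the classifier dict only carries "name" and "label", and passing the whole "programme" dict to hash_values raises a TypeError, which mirrors the two failures above.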

 cheers,

/david


On 28 November 2011 22:34, Friedrich Lindenberg <friedrich.lindenberg at okfn.org> wrote:

> Hi,
>
> have you tried using "programme.id" instead of "programme" in the
> unique_keys?
>
> - Friedrich
>
> On Mon, Nov 28, 2011 at 10:24 PM, David Cabo <david.cabo at gmail.com> wrote:
> >  Hi all,
> >  I ran into an issue while importing Spanish budget data [1] into the
> > Sandbox OpenSpending today. I was trying to set 'unique_keys' in the model
> > mapping file to ["year", "programme"], where 'programme' is a composite
> > classifier with two fields
> >       "fields" : [
> >         {"column" : "programme.id", "datatype" : "id", "name" : "name"},
> >         {"column" : "programme.label", "datatype" : "string", "name" : "label"}
> >       ]
> >  But the importer kept failing, complaining that the SHA key generation
> > function expected a string/buffer, not a dictionary. (Sorry, don't have a
> > copy of the stack trace, but could reproduce it if needed.) I've worked
> > around this by adding an extra column with a unique id for each entry, but
> > it's not ideal. Is this meant to be supported, or is it a bug?
> >  regards,
> > /david
> > [1]: http://thedatahub.org/dataset/spain-national-budget-2008-2011
-------------- next part --------------
2011-11-28 21:45:09 INFO: Validating model
2011-11-28 21:45:09 INFO: Describing dimensions
Traceback (most recent call last):
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/bin/openspendingetld", line 9, in <module>
    load_entry_point('openspending.etl==0.8dev', 'console_scripts', 'openspendingetld')()
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/command/daemon.py", line 100, in main
    run_job(*args)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/command/daemon.py", line 149, in run_job
    t(*args)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/tasks.py", line 17, in ckan_import
    importer.run(**opts)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/importer/base.py", line 96, in run
    self.process_line(line)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/importer/base.py", line 208, in process_line
    self.import_line(_line)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/importer/csv.py", line 51, in import_line
    self.loader.create_entry(**entry)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/loader.py", line 230, in create_entry
    entry_id = util.hash_values(entry_uniques)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/util.py", line 76, in hash_values
    return sha1(''.join(sha1(val).hexdigest() for val in iterable)).hexdigest()
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/util.py", line 76, in <genexpr>
    return sha1(''.join(sha1(val).hexdigest() for val in iterable)).hexdigest()
TypeError: sha1() argument 1 must be string or read-only buffer, not dict
-------------- next part --------------
2011-11-28 21:41:08 INFO: Validating model
2011-11-28 21:41:08 INFO: Describing dimensions
Traceback (most recent call last):
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/bin/openspendingetld", line 9, in <module>
    load_entry_point('openspending.etl==0.8dev', 'console_scripts', 'openspendingetld')()
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/command/daemon.py", line 100, in main
    run_job(*args)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/command/daemon.py", line 149, in run_job
    t(*args)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/tasks.py", line 17, in ckan_import
    importer.run(**opts)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/importer/base.py", line 96, in run
    self.process_line(line)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/importer/base.py", line 208, in process_line
    self.import_line(_line)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/importer/csv.py", line 51, in import_line
    self.loader.create_entry(**entry)
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/loader.py", line 229, in create_entry
    entry_uniques.extend(self._entry_unique_values(entry))
  File "/home/okfn/var/srvc/sandbox.openspending.org/pyenv/src/openspending.etl/.packageroot/openspending/etl/loader.py", line 173, in _entry_unique_values
    raise KeyError("Unique key %s missing from entry: %s" % (k, entry))
KeyError: u"Unique key programme.id missing from entry: {'programme-group': {u'name': u'141', u'views': {u'default': {u'label': u'Breakdown by programme', u'drilldown': u'programme', u'dataset': u'spain-budget', u'dimension': u'programme-group', u'cuts': {}}}, u'taxonomy': u'programme-group', u'label': u'Administracio_n general de relaciones exteriores', u'_id': ObjectId('4ed3f079ab41d83827000006'), 'ref': DBRef('classifier', ObjectId('4ed3f079ab41d83827000006'))}, 'from': {u'_id': ObjectId('4ed3d8f42cee9bea8e391dad'), 'ref': DBRef('entity', ObjectId('4ed3d8f42cee9bea8e391dad')), u'description': u'', u'name': u'ac', u'label': u'Administraci\\xf3n Central'}, 'entities': [ObjectId('4ed3d8f42cee9bea8e391dad'), ObjectId('4eca10b92cee9bea8e387cfb')], 'provenance': {'timestamp': datetime.datetime(2011, 11, 28, 21, 41, 9, 21594), 'line': 1, 'source_file': u'https://commondatastorage.googleapis.com/ckannet-storage/2011-11-28T191522/pge-2008-2011.csv', 'dataset': u'spain-budget'}, 'classifiers': [ObjectId('4ed3f079ab41d83827000006'), ObjectId('4ed3f079ab41d83827000007'), ObjectId('4ed3de55ab41d837a3000006'), ObjectId('4ed3f079ab41d83827000008')], 'currency': u'EUR', 'to': {u'_id': ObjectId('4eca10b92cee9bea8e387cfb'), 'ref': DBRef('entity', ObjectId('4eca10b92cee9bea8e387cfb')), u'description': u'', u'name': u'society', u'label': u'Society'}, 'amount': 89674.050000000003, '_csv_import_fp': u'spain-budget:https://commondatastorage.googleapis.com/ckannet-storage/2011-11-28T191522/pge-2008-2011.csv:1', 'time': {'to': {'month': '201112', 'parsed': datetime.datetime(2011, 12, 31, 0, 0), 'day': '20111231', 'year': '2011'}, 'unparsed': u'2011', 'from': {'month': '201101', 'parsed': datetime.datetime(2011, 1, 1, 0, 0), 'day': '20110101', 'year': '2011'}}, 'policy': {u'name': u'14', u'views': {u'default': {u'label': u'Breakdown by programme group', u'drilldown': u'programme-group', u'dataset': u'spain-budget', u'dimension': u'policy', u'cuts': {}}}, u'taxonomy': u'policy', u'label': 
u'POLI_TICA EXTERIOR', u'_id': ObjectId('4ed3f079ab41d83827000007'), 'ref': DBRef('classifier', ObjectId('4ed3f079ab41d83827000007'))}, 'cofog_1': {u'taxonomy': u'cofog', u'_id': ObjectId('4ed3de55ab41d837a3000006'), 'ref': DBRef('classifier', ObjectId('4ed3de55ab41d837a3000006')), u'name': u'1', u'label': u'Servicios p\\xfablicos generales'}, 'id': u'1', 'programme': {u'taxonomy': u'programme', u'_id': ObjectId('4ed3f079ab41d83827000008'), 'ref': DBRef('classifier', ObjectId('4ed3f079ab41d83827000008')), u'name': u'141m', u'label': u'Direcci\\xf3n y Servicios Generales de Asuntos Exteriores'}}"

