[ckan-dev] problem with package importer

David Read david.read at okfn.org
Fri Dec 17 13:58:46 UTC 2010


Thomas,

It's tripping up this time because the package name shouldn't have
spaces in. It looks like there is no error checking for this in the
script. (See http://ckan.net/package/new for the rules on the name
field.) This loader doesn't have tests, so nothing is guaranteed I'm
afraid.
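
As a rough sketch of one way to clean up the names before loading: this
is not code from the loader, `to_package_name` is a made-up helper name,
and it assumes the name rules are lowercase a-z, 0-9, '-' and '_' (check
the page above for the exact rules on your version).

```python
import re


def to_package_name(title):
    """Turn a free-text title into a name using only a-z, 0-9, '-' and '_'.

    Hypothetical helper for illustration; not part of ckanclient.
    """
    name = title.strip().lower()
    # Collapse every run of disallowed characters (spaces, punctuation,
    # non-ASCII letters) into a single hyphen.
    name = re.sub(r'[^a-z0-9_-]+', '-', name)
    # Drop hyphens left over at either end.
    return name.strip('-')


print(to_package_name('amt far statistik berlin-brandenburg'))
# prints: amt-far-statistik-berlin-brandenburg
```

Note that non-ASCII letters are simply replaced by hyphens here, so you
may prefer to transliterate them first.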

I can see some error handling in the loader, so without running your
code with your data I can't tell why you didn't get an exception for
the 400 error.
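
For illustration only, a check along these lines would turn the 400 into
a clear error before the Content-Type lookup fails. This is a sketch, not
the actual ckanclient code: `check_response` and its arguments are
hypothetical, and `headers` is assumed to be any mapping with a `.get`
method.

```python
def check_response(status, headers):
    """Return the Content-Type of a successful response, or raise.

    Hypothetical helper: `status` is the integer HTTP status code and
    `headers` is a dict-like object holding the response headers.
    """
    # Fail early on an error status instead of reading headers that an
    # error page may not carry.
    if status != 200:
        raise ValueError('CKAN request failed with HTTP status %s' % status)
    content_type = headers.get('Content-Type')
    if content_type is None:
        raise ValueError('Response has no Content-Type header')
    return content_type


print(check_response(200, {'Content-Type': 'application/json'}))
```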

David

On 17 December 2010 13:30, Scheel, Thomas
<thomas.scheel at fokus.fraunhofer.de> wrote:
> Hello,
>
> I was able to work out why the script cannot read the content-type.
> Here is the response from the CKAN service, captured with Wireshark:
>
> <head>
> <title>Error response</title>
> </head>
> <body>
> <h1>Error response</h1>
> <p>Error code 400.
> <p>Message: Bad request syntax ('GET /api/rest/package/amt far statistik berlin-brandenburg HTTP/1.1').
> <p>Error code explanation: 400 = Bad request syntax or unsupported method.
> </body>
>
> I think there are two problems.
> First, it seems the spreadsheetLoader doesn't convert the data from the spreadsheet into a URL-safe string (e.g. spaces to %20).
> Second, the script doesn't handle the HTTP 400 error and still tries to read the content-type.
>
> Could this be the problem?
> My ckan version is 1.3.1a and I also updated the ckan client.
>
> Thomas
>
> -----Original Message-----
> From: ckan-dev-bounces at lists.okfn.org [mailto:ckan-dev-bounces at lists.okfn.org] On Behalf Of David Read
> Sent: Friday, 17 December 2010 11:08
> To: Scheel, Thomas; CKAN Development Discussions
> Subject: Re: [ckan-dev] problem with package importer
>
> Thomas,
>
> Maybe someone else can, but I can't see why this is tripping up here.
> It's getting a package and missing the 'content-type' response header,
> although it looks like it is a 200 OK response.
>
> From the line numbers I notice you're not using the latest or
> metastable version of ckanclient - which one are you using and can you
> try the latest? Also worth checking what version of ckan you're on.
>
> I suggest that in the line before it stops in ckanclient/__init__.py
> you put a print self.last_body and self.last_headers and see where the
> content-type header is getting to. Do let us know how this goes.
>
> David
>
> On 16 December 2010 15:43, Scheel, Thomas
> <thomas.scheel at fokus.fraunhofer.de> wrote:
>> Hello *,
>>
>> First, thanks for the help with the data import from spreadsheets.
>> I made a customized import script which inherits from SimpleGoogleSpreadsheetLoader.
>> When I run the script (with --ckan-api-location=http://127.0.0.1:5000/api --ckan-api-key=xxx --google-spreadsheet-key=xxx --google-email=xxx --google-password=xxx)
>> it reads the fields of the spreadsheet and exits with:
>>
>> Reading Google spreadsheet. Please wait...
>> Working area of spreadsheet: top-left (1, 1); bottom-right (6, 10).
>> There are 10 headings: unique_id, title, tags, notes, url, resources-0-url, resource-0-format, author, maintainer, license
>> There are 3 entities: amt far statistik berlin-brandenburg, bundesgesetztracker, auslanderstatistik
>> There are 3 metadata packages with titles extracted from the spreadsheet.
>> Putting 3 packages on CKAN running at http://127.0.0.1:5000/api
>>
>> Traceback (most recent call last):
>>  File "/home/ckanuser/ckanext/bin/ckanload-aidprojdata", line 29, in <module>
>>    AidProjectsLoader().run()
>>  File "/home/ckan-user/pyenv/src/ckanclient/ckanclient/loaders/base.py", line 105, in run
>>    self.put_packages_on_ckan()
>>  File "/home/ckanuser/pyenv/src/ckanclient/ckanclient/loaders/base.py", line 121, in put_packages_on_ckan
>>    registered_package = self.ckanclient.package_entity_get(package['name'])
>>  File "/home/ckanuser/pyenv/src/ckanclient/ckanclient/__init__.py", line 322, in package_entity_get
>>    self.open_url(url)
>>  File "/home/ckanuser/pyenv/src/ckanclient/ckanclient/__init__.py", line 288, in open_url
>>    result = super(CkanClient, self).open_url(url, *args, **kwargs)
>>  File "/home/ckanuser/pyenv/src/ckanclient/ckanclient/__init__.py", line 189, in open_url
>>    content_type = self.last_headers['Content-Type']
>>  File "/usr/lib/python2.6/rfc822.py", line 388, in __getitem__
>>    return self.dict[name.lower()]
>> KeyError: 'content-type'
>>
>> Does anybody know what the problem is? To me it is very strange, because the script worked fine for several imports and I have already loaded some data into the database. Now it seems that the request doesn't arrive at the server side.
>>
>> As a Python rookie I would be very thankful for any useful hint.
>>
>> Many thanks in advance
>>
>> Thomas Scheel
>>
>> -----Original Message-----
>> From: friedrich.lindenberg at gmail.com [mailto:friedrich.lindenberg at gmail.com] On Behalf Of Friedrich Lindenberg
>> Sent: Thursday, 9 December 2010 12:07
>> To: rufus.pollock at okfn.org; CKAN Development Discussions
>> Cc: Scheel, Thomas
>> Subject: Re: [ckan-dev] problem with package importer
>>
>> Hi Thomas,
>>
>> On Thu, Dec 9, 2010 at 10:35 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
>>>> Another thing I noticed: if I download a data package via datapkg
>>>> download ckan://iso-3166-2-data . as in the example in the manual, it
>>>> works fine. But when I try to download other packages, especially from the
>>>> German CKAN instance (de.ckan.org), I always get exceptions while reading
>>>> information (datapkg info) or downloading the package. Is the German version
>>>> incompatible with datapkg, or does the download fail when reading faulty or
>>>> incomplete records?
>>>
>>> How are you trying to search offenedaten.de? At the moment (we are
>>> changing this as more CKAN instances appear) you will have to set
>>> ckan.url in your [index:ckan] section to:
>>>
>>> ckan.url = http://offenedaten.de
>>>
>>> I was then able to successfully query e.g.:
>>>
>>> $ datapkg search ckan:// statistik
>>> $ datapkg info ckan://destatis-statistik-21411
>>> $ datapkg download ckan://destatis-statistik-21411 .
>>> # there will now be a /tmp/destatis-statistik-21411 directory
>>>
>>> I note many offenedaten.de packages have no download resources in
>>> which case datapkg download will exit silently and do nothing. This
>>> should probably be made more apparent (many offenedaten packages have
>>> no download resources at the moment!).
>>
>> It is also worth noting that offenedaten/de.ckan is based on an
>> outdated and somewhat unstable version of CKAN. Daniel and I should
>> invest some time into upgrading this while keeping plugins and theming
>> intact.
>>
>> Also: why is it that datapkg will only download the first listed
>> resource? Is there any reason not to add an "--all" switch?
>>
>> - Friedrich
>>
>>
>>>
>>> One way to see this is to run in verbose mode:
>>>
>>> datapkg download --verbose ... ...
>>>
>>> Rufus
>>>
>>> _______________________________________________
>>> ckan-dev mailing list
>>> ckan-dev at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/ckan-dev
>>>



