[ckan-dev] Problems with file upload by code on Apache

Florian.Brucker at mb.karlsruhe.de Florian.Brucker at mb.karlsruhe.de
Thu Jun 2 08:42:10 UTC 2016


After some more debugging I found out that there is a problem with how CKAN
constructs URLs when it's not run directly under the server root. I've
filed an issue: https://github.com/ckan/ckan/issues/3081
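
For readers who have not run into this class of bug: the sketch below shows, in generic terms, how a URL builder produces a wrong link when the application is mounted under a path prefix that it ignores. It is not CKAN's code. The function name `build_url`, the `mount_prefix` parameter, and the `example.org` URLs are all assumptions made up for illustration.

```python
try:
    from urllib.parse import urlsplit, urlunsplit  # Python 3
except ImportError:
    from urlparse import urlsplit, urlunsplit  # Python 2


def build_url(site_url, path, mount_prefix=''):
    """Join a site URL, an optional mount prefix, and an app-relative path.

    Hypothetical helper: all names here are illustrative assumptions.
    """
    scheme, netloc, base_path, _, _ = urlsplit(site_url)
    # Drop empty segments so that '/' and '' do not produce double slashes.
    segments = [p.strip('/') for p in (base_path, mount_prefix, path)]
    joined = '/'.join(s for s in segments if s)
    return urlunsplit((scheme, netloc, '/' + joined, '', ''))


# Forgetting the prefix yields a link that the front-end server cannot route.
without_prefix = build_url('http://example.org', '/dataset/x')
with_prefix = build_url('http://example.org', '/dataset/x', '/ckan')
```

Here `without_prefix` points at the server root while `with_prefix` points below the mount point, so only one of them can resolve when the application lives under `/ckan`.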


Regards
Florian



"ckan-dev" <ckan-dev-bounces at lists.okfn.org> wrote on 31.05.2016
12:27:30:

> From: Florian.Brucker at mb.karlsruhe.de
> To: ckan-dev at lists.okfn.org,
> Date: 31.05.2016 12:27
> Subject: [ckan-dev] Problems with file upload by code on Apache
> Sent by: "ckan-dev" <ckan-dev-bounces at lists.okfn.org>
>
> Hello everybody,
>
> I'm trying to create a new resource (including an uploaded file) via
> Python code. The use case is a custom harvester, but my problem seems
> to be independent of ckanext-harvest.
>
> Here's some code that shows the problem:
>
> -----------------------------
> #!/usr/bin/env python
>
> import cgi
> import os.path
> import urllib2
>
> import paste.deploy
> from paste.registry import Registry
> import pylons
>
> from ckan.config.environment import load_environment
> import ckan.plugins.toolkit as toolkit
> from ckan.lib.cli import MockTranslator
> from ckan.model import User
>
>
> # Adapted from ckan.lib.cli.CkanCommand._load_config
> def load_config(ini_path):
>     ini_path = os.path.abspath(ini_path)
>     conf = paste.deploy.appconfig('config:' + ini_path)
>     load_environment(conf.global_conf, conf.local_conf)
>
>     registry = Registry()
>     registry.prepare()
>     registry.register(pylons.translator, MockTranslator())
>
>     registry.register(pylons.c, pylons.util.AttribSafeContextObj())
>     user = toolkit.get_action('get_site_user')({'ignore_auth': True}, {})
>     pylons.c.user = user['name']
>     pylons.c.userobj = User.get(user['name'])
>
>
> def create_resource(f, pkg_id, name):
>     upload = cgi.FieldStorage()
>     upload.filename = getattr(f, 'name', 'data')
>     upload.file = f
>     data_dict = {
>         'package_id': pkg_id,
>         'name': name,
>         'upload': upload,
>         'url': 'unused-but-required',
>     }
>     return toolkit.get_action('resource_create')({}, data_dict)
>
>
> if __name__ == '__main__':
>     import sys
>     import StringIO
>     load_config(sys.argv[1])
>
>     PKG_ID = 'bde56c8d-c9fa-47ad-8efb-9917e6751027'
>
>     fake_file = StringIO.StringIO('1,2,3')
>     fake_file.name = 'data.csv'
>
>     res_dict = create_resource(fake_file, PKG_ID, 'My Resource')
>     print(res_dict['url'])
>     try:
>         c = urllib2.urlopen(res_dict['url'])
>     except urllib2.HTTPError as e:
>         print(e)
>     else:
>         print(c.getcode())
> -----------------------------
>
> When I run that code against my development.ini (which uses paster
> serve) then it works: The resource is created, the data is uploaded,
> the URL is updated, and the file can then be downloaded from the
> updated URL:
>
> -----------------------------
> $ sudo -u www-data /usr/lib/ckan/default/bin/python upload_test.py /etc/ckan/default/development.ini
> /usr/lib/ckan/default/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py:79: SAWarning: Usage of the 'related attribute set' operation is not currently supported within the execution stage of the flush process. Results may not be consistent.  Consider using alternative event listeners or connection-level operations instead.
>   sess._flush_warning("related attribute set")
> http://172.16.16.17:5000/dataset/bde56c8d-c9fa-47ad-8efb-9917e6751027/resource/2d927d5d-05e3-487f-a61b-851e1000be64/download/data.csv
> 200
> -----------------------------
>
> However, if I run the code against my production.ini (which uses
> Apache and sets debug = false, but is otherwise equal to
> development.ini) then the final step (downloading the resource from the
> updated URL) fails with a 404:
>
> -----------------------------
> $ sudo -u www-data /usr/lib/ckan/default/bin/python upload_test.py /etc/ckan/default/production.ini
> /usr/lib/ckan/default/local/lib/python2.7/site-packages/sqlalchemy/orm/unitofwork.py:79: SAWarning: Usage of the 'related attribute set' operation is not currently supported within the execution stage of the flush process. Results may not be consistent.  Consider using alternative event listeners or connection-level operations instead.
>   sess._flush_warning("related attribute set")
> http://172.16.16.17:9000/dataset/bde56c8d-c9fa-47ad-8efb-9917e6751027/resource/827617f6-a009-4ab6-a4f2-34a983d08541/download/data.csv
> HTTP Error 404: Not Found
> -----------------------------
>
> The resource itself is successfully created and is displayed in the web
> UI (though downloading it via the web UI also fails with a 404).
>
> The resource file is there and has the proper permissions:
>
> -----------------------------
> $ ls -l /var/lib/ckan/resources/827/617/f6-a009-4ab6-a4f2-34a983d08541
> -rw-r--r-- 1 www-data www-data 5 May 31 11:44 /var/lib/ckan/resources/827/617/f6-a009-4ab6-a4f2-34a983d08541
> -----------------------------
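
The listing above suggests that the uploaded file is stored under a path derived from the resource ID: the first three characters, the next three, and then the remainder, all below a `resources` directory. A minimal sketch of that mapping follows, assuming the layout generalizes from this one example. The function name `resource_file_path` is hypothetical, and the storage root `/var/lib/ckan` is taken from the listing rather than read from the CKAN configuration.

```python
import os.path


def resource_file_path(storage_path, resource_id):
    """Return the on-disk path that the listing above implies for a resource.

    Hypothetical helper: the 3/3/rest split is inferred from a single
    example path and is not taken from CKAN's source.
    """
    return os.path.join(
        storage_path,
        'resources',
        resource_id[0:3],
        resource_id[3:6],
        resource_id[6:],
    )


path = resource_file_path('/var/lib/ckan',
                          '827617f6-a009-4ab6-a4f2-34a983d08541')
print(path)
```

Passing the result to `os.path.exists` gives a quick check that an upload actually reached the disk before looking into why the download URL fails.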
>
> Interestingly, if I edit the resource in the web UI and submit the form
> without making any changes then the download starts working! I've
> compared the output of resource_show before and after the edit, and the
> resource itself hasn't changed.
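
A comparison like this one is easy to get wrong by eye, so a small helper can make it mechanical. The sketch below assumes that the two `resource_show` results are available as plain dicts; the name `diff_dicts` and the `'<missing>'` placeholder are illustrative choices, not part of CKAN.

```python
MISSING = '<missing>'  # placeholder for a key that is absent on one side


def diff_dicts(before, after):
    """Return {key: (old, new)} for every key whose value differs."""
    changes = {}
    for key in set(before) | set(after):
        old = before.get(key, MISSING)
        new = after.get(key, MISSING)
        if old != new:
            changes[key] = (old, new)
    return changes


# Toy data standing in for two resource_show results.
before = {'name': 'My Resource', 'url_type': 'upload'}
after = {'name': 'My Resource', 'url_type': 'upload'}
print(diff_dicts(before, after))
```

An empty result means that the two dicts are equal key by key, which matches the observation above that the edit did not change the resource.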
>
> Similarly, submitting the resource's *package* edit form without any
> changes also makes the download start working.
>
> However, simulating that no-op resource edit in code, by passing the
> dict returned from resource_create to resource_update, does *not* fix
> the download problem.
>
> I didn't find anything interesting in the log files, and I'm out of
> ideas on how to debug this any further.
>
>
> Regards,
> Florian
> --
> Stadt Karlsruhe, Medienbüro
> Tel: 0721-133-1884
>
> florian.brucker at mb.karlsruhe.de