[ckan-dev] Fwd: Issues using Geo plugins

David Read david.read at hackneyworkshop.com
Tue Aug 13 10:03:40 UTC 2013


Tiago,

Great to hear that helped.

The data.gov.uk harvester is currently set to:

ckan.spatial.validator.profiles = iso19139eden,constraints-1.4,gemini2-1.3
ckan.spatial.validator.reject = true

If you're not on the dgu branch of ckanext-spatial already then you'll
need to switch to it to get these validators and the second option.
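
For illustration only, here is a minimal Python sketch of how those two
settings could be read from a CKAN ini file. It is not the ckanext-spatial
code: the function name read_validator_settings and the inline INI_TEXT are
made up for the example, and I'm assuming the profiles value is a plain
comma-separated list, as the line above suggests.

```python
from configparser import ConfigParser

# Hypothetical ini content for the example; a real CKAN config keeps
# these options in the [app:main] section of e.g. production.ini.
INI_TEXT = """
[app:main]
ckan.spatial.validator.profiles = iso19139eden,constraints-1.4,gemini2-1.3
ckan.spatial.validator.reject = true
"""


def read_validator_settings(ini_text, section="app:main"):
    """Return (list of profile names, reject flag) from ini text.

    Illustrative helper only, not part of ckanext-spatial.
    """
    # Interpolation is disabled so values such as %(here)s elsewhere in a
    # real config file would not raise errors while parsing.
    parser = ConfigParser(interpolation=None)
    parser.read_string(ini_text)

    raw = parser.get(section, "ckan.spatial.validator.profiles", fallback="")
    # Assumption: profiles are comma-separated; surrounding spaces ignored.
    profiles = [name.strip() for name in raw.split(",") if name.strip()]

    reject = parser.getboolean(
        section, "ckan.spatial.validator.reject", fallback=False
    )
    return profiles, reject


if __name__ == "__main__":
    profiles, reject = read_validator_settings(INI_TEXT)
    print(profiles, reject)
```

Each name in the resulting list has to match a validator that your
ckanext-spatial branch actually provides, otherwise you get a lookup
error like the KeyError you saw.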

However, that still gives us 48 validation errors for the
spatialni.gov.uk data. This is not uncommon: producing XML that is
valid Gemini2 appears to be tripping up a lot of publishers. UKLP makes
some effort to help, but at the end of the day it is NI's responsibility.

It sounds like the datasets you see on data.gov.uk were harvested
before we switched on the 'rejection', which was in April. Before
that, datasets were still harvested into CKAN even if they didn't
fully validate. You can get the same behaviour with this config:

ckan.spatial.validator.reject = false
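
To make the difference concrete, here is a small Python sketch of what the
reject flag means for a single harvested record. The function should_import
is hypothetical and only models the behaviour described above; the real
harvester logic lives in ckanext-spatial and does more than this.

```python
def should_import(validation_errors, reject):
    """Decide whether a harvested record goes into CKAN.

    validation_errors: list of error messages from the validators
                       (empty if the record validated cleanly).
    reject:            value of ckan.spatial.validator.reject.

    Hypothetical helper for illustration, not ckanext-spatial code.
    """
    if not validation_errors:
        # A fully valid record is imported regardless of the setting.
        return True
    # With reject = true an invalid record is dropped; with
    # reject = false it is still imported despite the errors.
    return not reject


if __name__ == "__main__":
    errors = ["example validation error"]  # made-up error text
    print(should_import(errors, reject=True))
    print(should_import(errors, reject=False))
```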

David

On 13 August 2013 10:32, Tiago Ribeiro <tribeiro.tb at gmail.com> wrote:
> Thank you all for your help. I finally have the harvester job running... but
> now I'm having another problem.
>
> I'm harvesting the catalog from an INSPIRE portal, namely this one:
> http://www.spatialni.gov.uk/geoportal/csw/discovery?Request=GetCapabilities&Service=CSW
>
> I've tried to set the validator like this "ckan.spatial.validator.profiles =
> inspire" in the config file but it doesn't work (KeyError: 'INSPIRE'), and
> using "ckan.spatial.validator.profiles = iso19139,gemini2,constraints" it
> gives validation errors when reading the items...
>
> The data.gov.uk portal has harvested from the same source, so I was
> wondering if there's any other validator profile that I should use?
>
> Thanks!
>
> Tiago
>
> 2013/8/12 David Read <david.read at hackneyworkshop.com>
>>
>> You need to create a Harvest Job for the particular Harvest Source
>> that you want to harvest from. You can do this with paster or in the
>> web interface.
>>
>> David
>>
>> On 12 August 2013 16:22, Tiago Ribeiro <tribeiro.tb at gmail.com> wrote:
>> > Thanks Ryan.
>> > I'm doing exactly what this link says:
>> > https://github.com/okfn/ckanext-harvest#running-the-harvest-jobs, but when I
>> > execute the run command in the third window it throws an exception:
>> > Exception: There are no new harvesting jobs.
>> >
>> > In one console I have:
>> > (default)ubuntu@ip-172-31-31-141:/etc/ckan/default$ paster --plugin=ckanext-harvest harvester gather_consumer --config=production.ini
>> > 2013-08-12 15:18:29,140 DEBUG [ckanext.harvest.model] Harvest tables defined in memory
>> > 2013-08-12 15:18:29,143 DEBUG [ckanext.harvest.model] Harvest tables already exist
>> > 2013-08-12 15:18:29,182 DEBUG [ckanext.harvest.queue] pika connection using {'retry_delay': 2.0, 'frame_max': 10000, 'channel_max': 0, 'locale': 'en_US', 'socket_timeout': 0.25, 'ssl': False, 'host': 'localhost', 'ssl_options': {}, 'virtual_host': '/', 'heartbeat': 0, 'credentials': <pika.credentials.PlainCredentials object at 0x40a6f10>, 'backpressure_detection': False, 'port': 5672, 'connection_attempts': 1}
>> > 2013-08-12 15:18:30,229 DEBUG [ckanext.harvest.queue] Gather queue consumer registered
>> >
>> > Then in another console:
>> > (default)ubuntu@ip-172-31-31-141:/etc/ckan/default$ paster --plugin=ckanext-harvest harvester fetch_consumer --config=production.ini
>> > 2013-08-12 15:19:16,846 DEBUG [ckanext.harvest.model] Harvest tables defined in memory
>> > 2013-08-12 15:19:16,850 DEBUG [ckanext.harvest.model] Harvest tables already exist
>> > 2013-08-12 15:19:16,892 DEBUG [ckanext.harvest.queue] pika connection using {'retry_delay': 2.0, 'frame_max': 10000, 'channel_max': 0, 'locale': 'en_US', 'socket_timeout': 0.25, 'ssl': False, 'host': 'localhost', 'ssl_options': {}, 'virtual_host': '/', 'heartbeat': 0, 'credentials': <pika.credentials.PlainCredentials object at 0x5e26c50>, 'backpressure_detection': False, 'port': 5672, 'connection_attempts': 1}
>> > 2013-08-12 15:19:17,938 DEBUG [ckanext.harvest.queue] Fetch queue consumer registered
>> >
>> > Finally, in another console:
>> > (default)ubuntu@ip-172-31-31-141:/etc/ckan/default$ paster --plugin=ckanext-harvest harvester run --config=production.ini
>> > 2013-08-12 15:19:34,221 DEBUG [ckanext.harvest.model] Harvest tables defined in memory
>> > 2013-08-12 15:19:34,225 DEBUG [ckanext.harvest.model] Harvest tables already exist
>> > 2013-08-12 15:19:34,264 INFO  [ckanext.harvest.logic.action.update] Harvest job run: {}
>> > 2013-08-12 15:19:34,284 INFO  [ckanext.harvest.logic.action.update] No new harvest jobs.
>> > Traceback (most recent call last):
>> >   File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
>> >     load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
>> >   File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
>> >     invoke(command, command_name, options, args[1:])
>> >   File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
>> >     exit_code = runner.run(args)
>> >   File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
>> >     result = self.command()
>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 114, in command
>> >     self.run_harvester()
>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 265, in run_harvester
>> >     jobs = get_action('harvest_jobs_run')(context,{})
>> >   File "/usr/lib/ckan/default/src/ckan/ckan/logic/__init__.py", line 356, in wrapped
>> >     result = _action(context, data_dict, **kw)
>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/logic/action/update.py", line 332, in harvest_jobs_run
>> >     raise Exception('There are no new harvesting jobs')
>> > Exception: There are no new harvesting jobs
>> >
>> > Am I missing something?
>> >
>> >
>> > 2013/8/12 Ryan Maine <balrogmi at msn.com>
>> >>
>> >> This is an example from the USA catalog
>> >>
>> >> http://catalog.data.gov/harvest/about/aasg-geothermal-data-csw
>> >>
>> >> There you can find more
>> >>
>> >> On 12/08/2013 15:57, "Tiago Ribeiro" <tribeiro.tb at gmail.com> wrote:
>> >>
>> >>> Thanks David, that did the trick!
>> >>>
>> >>> Does anyone have, or can anyone provide, an example of a configuration
>> >>> object for a CSW harvester?
>> >>>
>> >>> Cheers,
>> >>> Tiago
>> >>>
>> >>> 2013/8/12 David Read <david.read at hackneyworkshop.com>
>> >>>>
>> >>>> Tiago,
>> >>>>
>> >>>> The README is wrong on this one. Try this:
>> >>>>
>> >>>> ckan.harvest.mq.type = ampq
>> >>>>
>> >>>> (And yes, this is AMQP with a typo)
>> >>>>
>> >>>> David
>> >>>>
>> >>>> On 12 August 2013 10:24, Tiago Ribeiro <tribeiro.tb at gmail.com> wrote:
>> >>>> > Err... thank you for both your answers but now it's working... don't
>> >>>> > know why!
>> >>>> > Just for reference I'm using CKAN 2.2a (latest).
>> >>>> >
>> >>>> > Now I have a different problem. When I run the command line to run the
>> >>>> > Harvest jobs I get this message:
>> >>>> > Traceback (most recent call last):
>> >>>> >   File "/usr/lib/ckan/default/bin/paster", line 9, in <module>
>> >>>> >     load_entry_point('PasteScript==1.7.5', 'console_scripts', 'paster')()
>> >>>> >   File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 104, in run
>> >>>> >     invoke(command, command_name, options, args[1:])
>> >>>> >   File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 143, in invoke
>> >>>> >     exit_code = runner.run(args)
>> >>>> >   File "/usr/lib/ckan/default/local/lib/python2.7/site-packages/paste/script/command.py", line 238, in run
>> >>>> >     result = self.command()
>> >>>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/commands/harvester.py", line 126, in command
>> >>>> >     consumer = get_fetch_consumer()
>> >>>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/queue.py", line 340, in get_fetch_consumer
>> >>>> >     consumer = get_consumer('ckan.harvest.fetch','harvest_object_id')
>> >>>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/queue.py", line 174, in get_consumer
>> >>>> >     connection = get_connection()
>> >>>> >   File "/usr/lib/ckan/default/src/ckanext-harvest/ckanext/harvest/queue.py", line 39, in get_connection
>> >>>> >     raise Exception('not a valid queue type %s' % backend)
>> >>>> > Exception: not a valid queue type rabbitmq
>> >>>> >
>> >>>> > In my CKAN config file I have:
>> >>>> > ckan.plugins = stats text_preview recline_preview datastore
>> >>>> > ckan.plugins = harvest ckan_harvester csw_harvester
>> >>>> > ckan.harvest.mq.type = rabbitmq
>> >>>> >
>> >>>> > I've installed RabbitMQ as the backend.
>> >>>> >
>> >>>> > NB: I'm pretty new to CKAN, so I'm sorry if I've missed something "basic"!
>> >>>> >
>> >>>> > Thanks!
>> >>>> >
>> >>>> > Tiago
>> >>>> >
>> >>>> >
>> >>>> > 2013/8/8 Ryan Maine <balrogmi at msn.com>
>> >>>> >>
>> >>>> >> Are you sure you installed harvester extension before?
>> >>>> >>
>> >>>> >> On 08/08/2013 17:59, "Adrià Mercader" <adria.mercader at okfn.org> wrote:
>> >>>> >>
>> >>>> >>> Hi Tiago,
>> >>>> >>>
>> >>>> >>> This is bizarre, I can not reproduce it so we will need more details.
>> >>>> >>> Is there any message or stacktrace available (eg what is not found)?
>> >>>> >>> Does this happen when you navigate to http://yourinstance/harvest?
>> >>>> >>>
>> >>>> >>> What versions/branches of ckan/spatial/harvest are you using?
>> >>>> >>>
>> >>>> >>> Adrià
>> >>>> >>>
>> >>>> >>> On 7 August 2013 17:20, Tiago Ribeiro <tribeiro.tb at gmail.com> wrote:
>> >>>> >>> > Hi all,
>> >>>> >>> >
>> >>>> >>> > I'm having an issue with the spatial harvester plugin. As soon as I add
>> >>>> >>> > "csw_harvester" to ckan.plugins and navigate to the harvest domain, I
>> >>>> >>> > get a 404.
>> >>>> >>> > ckan.plugins = harvest ckan_harvester - works
>> >>>> >>> > ckan.plugins = harvest ckan_harvester csw_harvester - get a 404.
>> >>>> >>> >
>> >>>> >>> > Am I doing something wrong?
>> >>>> >>> > I have CKAN on an Ubuntu 12.04 Amazon box. I've used the "Install from
>> >>>> >>> > source" guide.
>> >>>> >>> >
>> >>>> >>> > If you could help me, it would be much appreciated. If you need more
>> >>>> >>> > details, let me know.
>> >>>> >>> >
>> >>>> >>> > Tiago Ribeiro
>> >>>> >>> >
>> >>>> >>> >
>> >>>> >>> > _______________________________________________
>> >>>> >>> > ckan-dev mailing list
>> >>>> >>> > ckan-dev at lists.okfn.org
>> >>>> >>> > http://lists.okfn.org/mailman/listinfo/ckan-dev
>> >>>> >>> > Unsubscribe: http://lists.okfn.org/mailman/options/ckan-dev
>> >>>> >>> >
>> >>>> >>>



