[ckan-dev] CSW harvester - getting started?

Adrià Mercader adria.mercader at okfn.org
Wed Feb 3 10:11:13 UTC 2016


Derek,

Your input looks fine. The configuration field is not used by the CSW
harvester, so it will be ignored. This should work:


URL: http://data.linz.govt.nz/feeds/csw?service=CSW&version=2.0.2&request=GetCapabilities
Title: LINZ Data Service
Source: CSW Server
Update Frequency: Manual
Organisation: linz
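
As a quick sanity check before saving the form, the CSW URL can be taken
apart to confirm it really is a GetCapabilities request. This is only a
sketch using the Python standard library; the helper name describe_csw_url
is made up for illustration and is not part of CKAN or ckanext-harvest.

```python
from urllib.parse import urlsplit, parse_qs

# URL taken from the harvest source form in this thread.
CSW_URL = (
    "http://data.linz.govt.nz/feeds/csw"
    "?service=CSW&version=2.0.2&request=GetCapabilities"
)


def describe_csw_url(url):
    """Split a CSW URL into endpoint and query parameters (hypothetical helper)."""
    parts = urlsplit(url)
    # parse_qs returns a list of values per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return endpoint, params


if __name__ == "__main__":
    endpoint, params = describe_csw_url(CSW_URL)
    print(endpoint)
    print(params)
```

If "service", "version" or "request" is missing from the parameters, the
URL was probably truncated when it was pasted into the form.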

Now, this will create a harvest source for you. When you click "Reharvest"
in the Admin page a new job will be created for you.

Before creating the job you need to have the two necessary processes (gather
and fetch) running for this to work. The easiest way to do that is to open
up a couple of terminals and start the processes:

paster --plugin=ckanext-harvest harvester gather_consumer --config=mysite.ini

paster --plugin=ckanext-harvest harvester fetch_consumer --config=mysite.ini


You might need to execute the "run" command to fire the jobs again:

paster --plugin=ckanext-harvest harvester run --config=mysite.ini
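
If juggling several terminals is awkward, the same paster commands can be
assembled and launched from a small script. The following is a rough sketch
and not part of ckanext-harvest: the function names are made up for
illustration, "mysite.ini" is the placeholder config name used in this
message, and it assumes paster is on the PATH of the active virtualenv.

```python
import subprocess

PLUGIN = "ckanext-harvest"


def harvester_command(subcommand, config="mysite.ini"):
    """Build the paster argument list for one harvester subcommand."""
    return [
        "paster",
        "--plugin=" + PLUGIN,
        "harvester",
        subcommand,
        "--config=" + config,
    ]


def start_consumers(config="mysite.ini"):
    """Start the gather and fetch consumers as background processes."""
    return [
        subprocess.Popen(harvester_command(name, config))
        for name in ("gather_consumer", "fetch_consumer")
    ]


def fire_jobs(config="mysite.ini"):
    """Execute the run command once and return its exit code."""
    return subprocess.call(harvester_command("run", config))
```

Calling start_consumers() and then fire_jobs() mirrors the manual steps.
The consumers keep running until they are terminated, so the returned
process objects should be kept around and stopped with terminate() when
you are done.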


Have a read of these docs to make sure you have everything set up properly:


https://github.com/ckan/ckanext-harvest#running-the-harvest-jobs


Hope this helps,

Adrià


On 3 February 2016 at 07:40, Derek Hohls <dhohls at csir.co.za> wrote:

> I appreciate that; but it looks like the output created by a harvesting
> process; I need to see what the inputs are that generate the output and
> what the steps are to make it "go". Right now I still only have the same
> display as listed in my original post.
>
> Thanks
> Derek
>
> >>> John Jediny - XAAB <john.jediny at gsa.gov> 02/02/16 6:18 PM >>>
> Here is an example harvesting a geonode/pyCSW endpoint:
> http://dev.openei.org/datasets/harvest/nepa-node
>
> here is the endpoint harvested:
> http://dev.openei.org/datasets/harvest/about/nepa-node
>
> On Tue, Feb 2, 2016 at 5:10 AM, Derek Hohls <dhohls at csir.co.za> wrote:
>
>> Hi all,
>>
>> Can anyone provide a simple link to an actual working example of using
>> the CSW harvester? There are detailed instructions on what all the
>> parameters mean on their git site, but this has not helped me to get an
>> instance working.
>>
>> I have tried through the CKAN web interface (http://localhost/harvest/)
>> to set up a test example:
>>
>> URL:
>> http://data.linz.govt.nz/feeds/csw?service=CSW&version=2.0.2&request=GetCapabilities
>> Title: LINZ Data Service
>> Source: CSW Server
>> Update Frequency: Manual
>> Configuration: { "remote_groups": "create", "remote_orgs": "create",
>> "user":"derek", "private_datasets": false}
>> Organisation: linz
>>
>> The URL above is one which works and can be tested from a browser. (I
>> have also tried with the Configuration setting empty.)
>>
>> I have tried the Reharvest as well, but the only output that ever shows
>> on the "Jobs" tab is:
>>
>> Harvest Jobs
>>     Job: dfa53fbc-4c11-421c-9c7d-b1e768c2785d Running
>>     Started: Not yet ? Finished: Not yet
>>
>> I am sure all of this is obvious to experienced users, but it does not
>> seem so simple for beginners.
>>
>> Any advice, insights or suggestions would be appreciated.
>>
>> Thanks,
>> Derek
>>
>>
>> --
>> This message is subject to the CSIR's copyright terms and conditions,
>> e-mail legal notice, and implemented Open Document Format (ODF) standard.
>> The full disclaimer details can be found at
>> http://www.csir.co.za/disclaimer.html.
>>
>>
>> This message has been scanned for viruses and dangerous content by
>> *MailScanner* <http://www.mailscanner.info/>,
>> and is believed to be clean.
>>
>>
>> Please consider the environment before printing this email.
>>
>>
>> _______________________________________________
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org
>> https://lists.okfn.org/mailman/listinfo/ckan-dev
>> Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
>>
>>
>
>
> --
> Chief Data Engineer
> 202-341-0191
> @Data.gov
> @Office of Citizen Science and Innovative Technologies/18F
> <http://www.gsa.gov/portal/category/25729>
> General Services Administration
>
> Work in the Open... ideate, innovate, iterate...
> @github <https://github.com/JJediny> | @projectopendata
> <https://github.com/project-open-data>
>