[ckan-dev] CSW harvester - getting started?

Derek Hohls dhohls at csir.co.za
Sun Feb 7 15:55:01 UTC 2016


Thanks, Adrià.

I missed this mail, which would have saved me much frustration; thanks for the very clear explanation!

Derek


>>> Adrià Mercader <adria.mercader at okfn.org> 02/03/16 12:14 PM >>>
Derek,


Your input looks fine. The configuration field is not used by the CSW harvester, so it will be ignored. This should work:


URL: http://data.linz.govt.nz
Title: LINZ Data Service
Source: CSW Server
Update Frequency: Manual
Organisation: linz


Now, this will create a harvest source for you. When you click "Reharvest" on the Admin page, a new job will be created.


Before creating the job you need to have the two necessary processes (gather and fetch) running for this to work. The easiest way to do that is to open up a couple of terminals and start the processes:

paster --plugin=ckanext-harvest harvester gather_consumer --config=mysite.ini

paster --plugin=ckanext-harvest harvester fetch_consumer --config=mysite.ini


You might need to run the "run" command to fire the jobs again:

paster --plugin=ckanext-harvest harvester run --config=mysite.ini
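
The three commands above could also be assembled from a short Python script. This is only a sketch: the helper names harvester_command and start_harvest are made up for illustration and are not part of ckanext-harvest, and mysite.ini is simply the config file name used in the commands above. As written, the script only prints the commands (a dry run) and does not start anything.

```python
import shlex
import subprocess

# Subcommands taken from the paster commands quoted above.
CONSUMERS = ("gather_consumer", "fetch_consumer")


def harvester_command(subcommand, config="mysite.ini"):
    """Return the paster argument list for one harvester subcommand."""
    return [
        "paster",
        "--plugin=ckanext-harvest",
        "harvester",
        subcommand,
        "--config={}".format(config),
    ]


def start_harvest(config="mysite.ini"):
    """Hypothetical helper: start both consumers, then issue "run"."""
    consumers = [
        subprocess.Popen(harvester_command(name, config)) for name in CONSUMERS
    ]
    # Both consumers are started before "run" is issued, as described above.
    subprocess.check_call(harvester_command("run", config))
    return consumers


# Dry run: print the commands instead of executing them.
for name in CONSUMERS + ("run",):
    print(" ".join(shlex.quote(arg) for arg in harvester_command(name)))
```

Calling start_harvest("mysite.ini") would launch the two consumers as background processes and then issue "run" once; it is deliberately not called here.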

Have a read of these docs to make sure you have everything set up properly:


https://github.com/ckan/ckanext-harvest#running-the-harvest-jobs



Hope this helps,


Adrià







On 3 February 2016 at 07:40, Derek Hohls <dhohls at csir.co.za> wrote:
I appreciate that, but it looks like the output created by a harvesting process.
I need to see what inputs generate that output and what steps are needed
to make it "go". Right now I still only have the same display as listed in
my original post.

Thanks
Derek

>>> John Jediny - XAAB <john.jediny at gsa.gov> 02/02/16 6:18 PM >>>
Here is an example harvesting a geonode/pyCSW endpoint:
http://dev.openei.org/datasets/harvest/nepa-node

Here is the endpoint harvested:
http://dev.openei.org/datasets/harvest/about/nepa-node



On Tue, Feb 2, 2016 at 5:10 AM, Derek Hohls <dhohls at csir.co.za> wrote:
Hi all,

Can anyone provide a simple link to an actual working example of using the CSW harvester? There are detailed instructions on what all the parameters mean on their git site, but this has not helped me to get an instance working.

I have tried through the CKAN web interface (http://localhost/harvest/) to set up a test example:

URL: http://data.linz.govt.nz/feeds/csw?service=CSW&version=2.0.2&request=GetCapabilities
Title: LINZ Data Service
Source: CSW Server
Update Frequency: Manual
Configuration: { "remote_groups": "create", "remote_orgs": "create", "user":"derek", "private_datasets": false}
Organisation: linz

The URL above is one which works and can be tested from a browser. (I have also tried with the Configuration setting empty.)
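
To see what that URL is made of, it can be split into the bare endpoint and the CSW request parameters. This is a minimal sketch using only the Python standard library; the function name split_csw_url is made up for illustration, and the sketch makes no claim about which form the harvester itself expects.

```python
from urllib.parse import parse_qsl, urlsplit, urlunsplit

# The URL entered in the harvest source form above.
URL = (
    "http://data.linz.govt.nz/feeds/csw"
    "?service=CSW&version=2.0.2&request=GetCapabilities"
)


def split_csw_url(url):
    """Return (endpoint without query string, dict of query parameters)."""
    parts = urlsplit(url)
    endpoint = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    params = dict(parse_qsl(parts.query))
    return endpoint, params


endpoint, params = split_csw_url(URL)
print(endpoint)  # http://data.linz.govt.nz/feeds/csw
print(params)    # {'service': 'CSW', 'version': '2.0.2', 'request': 'GetCapabilities'}
```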

I have tried the Reharvest as well, but the only output that ever shows on the "Jobs" tab is:

Harvest Jobs
    Job: dfa53fbc-4c11-421c-9c7d-b1e768c2785d Running
    Started: Not yet
    Finished: Not yet

I am sure all of this is obvious to experienced users, but it does not seem so simple for beginners.

Any advice, insights or suggestions would be appreciated.

Thanks,
Derek

 
--  
This message is subject to the CSIR's copyright terms and conditions, e-mail legal notice, and implemented Open Document Format (ODF) standard. 
The full disclaimer details can be found at http://www.csir.co.za/disclaimer.html.  
This message has been scanned for viruses and dangerous content by MailScanner,  
and is believed to be clean. 
 
Please consider the environment before printing this email. 

 
 
_______________________________________________
 ckan-dev mailing list
 ckan-dev at lists.okfn.org
 https://lists.okfn.org/mailman/listinfo/ckan-dev
 Unsubscribe: https://lists.okfn.org/mailman/options/ckan-dev
 




-- 
Chief Data Engineer

202-341-0191

@Data.gov
@Office of Citizen Science and Innovative Technologies/18F
General Services Administration


Work in the Open... ideate, innovate, iterate...


@github | @projectopendata










 
  
