[ckan-changes] commit/ckanext-harvest: amercader: [merge] from new-forms, as forms refactoring has been merged in core

Fri May 20 12:50:34 UTC 2011

1 new changeset in ckanext-harvest:

http://bitbucket.org/okfn/ckanext-harvest/changeset/b2bac9d50c30/
changeset:   r98:b2bac9d50c30
user:        amercader
date:        2011-05-20 14:50:15
summary:     [merge] from new-forms, as forms refactoring has been merged in core
affected #:  28 files (19.4 KB)

--- a/README.rst	Tue May 17 17:26:42 2011 +0100
+++ b/README.rst	Fri May 20 13:50:15 2011 +0100
@@ -9,7 +9,7 @@
 ============
 
 The harvest extension uses Message Queuing to handle the different gather
-stages. 
+stages.
 
 You will need to install the RabbitMQ server::
 
@@ -23,12 +23,12 @@
 Configuration
 =============
 
-Run the following command (in the ckanext-harvest directory) to create 
+Run the following command (in the ckanext-harvest directory) to create
 the necessary tables in the database::
 
     paster harvester initdb --config=../ckan/development.ini
 
-The extension needs a user with sysadmin privileges to perform the 
+The extension needs a user with sysadmin privileges to perform the
 harvesting jobs. You can create such a user running these two commands in
 the ckan directory::
 
@@ -36,16 +36,6 @@
 
     paster sysadmin add harvest
 
-The user's API key must be defined in the CKAN
-configuration file (.ini) in the [app:main] section::
-
-    ckan.harvest.api_key = 4e1dac58-f642-4e54-bbc4-3ea262271fe2
-
-The API URL used can be also defined in the ini file (it defaults to 
-http://localhost:5000/)::
-
-    ckan.api_url = <api_url>
-
 Tests
 =====
 
@@ -63,25 +53,25 @@
 Command line interface
 ======================
 
-The following operations can be run from the command line using the 
+The following operations can be run from the command line using the
 ``paster harvester`` command::
 
       harvester initdb
         - Creates the necessary tables in the database
 
-      harvester source {url} {type} [{active}] [{user-id}] [{publisher-id}] 
+      harvester source {url} {type} [{active}] [{user-id}] [{publisher-id}]
         - create new harvest source
 
       harvester rmsource {id}
         - remove (inactivate) a harvester source
 
-      harvester sources [all]        
+      harvester sources [all]
         - lists harvest sources
           If 'all' is defined, it also shows the Inactive sources
 
       harvester job {source-id}
         - create new harvest job
-  
+
       harvester jobs
         - lists harvest jobs
 
@@ -93,13 +83,19 @@
 
       harvester fetch_consumer
         - starts the consumer for the fetching queue
-       
+
 The commands should be run from the ckanext-harvest directory and expect
-a development.ini file to be present. Most of the time you will specify 
+a development.ini file to be present. Most of the time you will specify
 the config explicitly though::
 
         paster harvester sources --config=../ckan/development.ini
 
+The CKAN harvester
+==================
+
+TODO
+
+
 The harvesting interface
 ========================
 
@@ -107,18 +103,18 @@
 operations. The harvesting process takes place on three stages:
 
 1. The **gather** stage compiles all the resource identifiers that need to
-   be fetched in the next stage (e.g. in a CSW server, it will perform a 
+   be fetched in the next stage (e.g. in a CSW server, it will perform a
    `GetRecords` operation).
 
 2. The **fetch** stage gets the contents of the remote objects and stores
-   them in the database (e.g. in a CSW server, it will perform n 
+   them in the database (e.g. in a CSW server, it will perform n
    `GetRecordById` operations).
 
 3. The **import** stage performs any necessary actions on the fetched
    resource (generally creating a CKAN package, but it can be anything the
    extension needs).
 
-Plugins willing to implement the harvesting interface must provide the 
+Plugins willing to implement the harvesting interface must provide the
 following methods::
 
     from ckan.plugins.core import SingletonPlugin, implements
@@ -130,17 +126,32 @@
     '''
     implements(IHarvester)
 
-    def get_type(self):
+    def info(self):
         '''
-        Plugins must provide this method, which will return a string with the
-        Harvester type implemented by the plugin (e.g ``CSW``,``INSPIRE``, etc).
-        This will ensure that they only receive Harvest Jobs and Objects
-        relevant to them.
+        Harvesting implementations must provide this method, which will return a
+        dictionary containing different descriptors of the harvester. The
+        returned dictionary should contain:
 
-        returns: A string with the harvester type
+        * name: machine-readable name. This will be the value stored in the
+          database, and the one used by ckanext-harvest to call the appropriate
+          harvester.
+        * title: human-readable name. This will appear in the form's select box
+          in the WUI.
+        * description: a small description of what the harvester does. This will
+          appear on the form as a guidance to the user.
+
+        A complete example may be::
+
+            {
+                'name': 'csw',
+                'title': 'CSW Server',
+                'description': "A server that implements OGC's Catalog Service "
+                               "for the Web (CSW) standard"
+            }
+
+        returns: A dictionary with the harvester descriptors
         '''
 
-
     def gather_stage(self, harvest_job):
         '''
         The gather stage will recieve a HarvestJob object and will be
@@ -176,7 +187,7 @@
         '''
         The import stage will receive a HarvestObject object and will be
         responsible for:
-            - performing any necessary action with the fetched object (e.g 
+            - performing any necessary action with the fetched object (e.g
               create a CKAN package).
               Note: if this stage creates or updates a package, a reference
               to the package should be added to the HarvestObject.
@@ -200,7 +211,7 @@
 
 The harvesting extension uses two different queues, one that handles the
 gathering and another one that handles the fetching and importing. To start
-the consumers run the following command from the ckanext-harvest directory 
+the consumers run the following command from the ckanext-harvest directory
 (make sure you have your python environment activated)::
 
       paster harvester gather_consumer --config=../ckan/development.ini
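
As a rough illustration of the gather, fetch and import stages described in the README hunks above, here is a self-contained toy harvester. It does not import CKAN: `ToyHarvester`, `FAKE_REMOTE`, `run_all_stages` and the arguments and return values of each stage are assumptions made for this sketch, not the extension's actual interface.

```python
# Toy stand-in for a remote catalogue: identifier -> raw content (hypothetical data).
FAKE_REMOTE = {
    'rec-1': '{"title": "Rivers"}',
    'rec-2': '{"title": "Roads"}',
}


class ToyHarvester(object):
    """Illustrative only: mimics the shape of a harvester implementation."""

    def info(self):
        # Descriptor dictionary with the three keys named in the docstring.
        return {
            'name': 'toy',
            'title': 'Toy harvester',
            'description': 'Harvests an in-memory dictionary',
        }

    def gather_stage(self, source):
        # Gather: collect the identifiers to fetch in the next stage.
        return sorted(source.keys())

    def fetch_stage(self, source, identifier):
        # Fetch: retrieve the remote content for one identifier.
        return source[identifier]

    def import_stage(self, identifier, content):
        # Import: act on the fetched content (here, build a small dict).
        return {'id': identifier, 'raw': content}


def run_all_stages(harvester, source):
    # Drive gather -> fetch -> import in one loop. The README describes
    # queues and consumers handling these stages; this sketch skips them.
    imported = []
    for identifier in harvester.gather_stage(source):
        content = harvester.fetch_stage(source, identifier)
        imported.append(harvester.import_stage(identifier, content))
    return imported


packages = run_all_stages(ToyHarvester(), FAKE_REMOTE)
```

The sketch keeps everything in memory so that the order of the stages is easy to follow.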


--- a/ckanext/harvest/commands/harvester.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/commands/harvester.py	Fri May 20 13:50:15 2011 +0100
@@ -14,19 +14,19 @@
       harvester initdb
         - Creates the necessary tables in the database
 
-      harvester source {url} {type} [{active}] [{user-id}] [{publisher-id}] 
+      harvester source {url} {type} [{active}] [{user-id}] [{publisher-id}]
         - create new harvest source
 
       harvester rmsource {id}
         - remove (inactivate) a harvester source
 
-      harvester sources [all]        
+      harvester sources [all]
         - lists harvest sources
           If 'all' is defined, it also shows the Inactive sources
 
       harvester job {source-id}
         - create new harvest job
-  
+
       harvester jobs
         - lists harvest jobs
 
@@ -66,7 +66,7 @@
             sys.exit(1)
         cmd = self.args[0]
         if cmd == 'source':
-            self.create_harvest_source()            
+            self.create_harvest_source()
         elif cmd == "rmsource":
             self.remove_harvest_source()
         elif cmd == 'sources':
@@ -96,7 +96,7 @@
 
     def _load_config(self):
         super(Harvester, self)._load_config()
-    
+
     def initdb(self):
         from ckanext.harvest.model import setup as db_setup
         db_setup()
@@ -128,23 +128,29 @@
             publisher_id = unicode(self.args[5])
         else:
             publisher_id = u''
-        
-        source = create_harvest_source({
-                'url':url,
-                'type':type,
-                'active':active,
-                'user_id':user_id, 
-                'publisher_id':publisher_id})
+        try:
+            source = create_harvest_source({
+                    'url':url,
+                    'type':type,
+                    'active':active,
+                    'user_id':user_id,
+                    'publisher_id':publisher_id})
 
-        print 'Created new harvest source:'
-        self.print_harvest_source(source)
+            print 'Created new harvest source:'
+            self.print_harvest_source(source)
 
-        sources = get_harvest_sources()
-        self.print_there_are('harvest source', sources)
-        
-        # Create a Harvest Job for the new Source
-        create_harvest_job(source['id'])
-        print 'A new Harvest Job for this source has also been created'
+            sources = get_harvest_sources()
+            self.print_there_are('harvest source', sources)
+
+            # Create a Harvest Job for the new Source
+            create_harvest_job(source['id'])
+            print 'A new Harvest Job for this source has also been created'
+
+        except ValidationError,e:
+            print 'An error occurred:'
+            print str(e.error_dict)
+            raise
+
 
     def remove_harvest_source(self):
         if len(self.args) >= 2:
@@ -155,7 +161,7 @@
 
         remove_harvest_source(source_id)
         print 'Removed harvest source: %s' % source_id
-    
+
     def list_harvest_sources(self):
         if len(self.args) >= 2 and self.args[1] == 'all':
             sources = get_harvest_sources()
@@ -185,7 +191,7 @@
         jobs = get_harvest_jobs()
         self.print_harvest_jobs(jobs)
         self.print_there_are(what='harvest job', sequence=jobs)
-    
+
     def run_harvester(self):
         try:
             jobs = run_harvest_jobs()
@@ -211,7 +217,7 @@
         print 'Source id: %s' % source['id']
         print '      url: %s' % source['url']
         print '     type: %s' % source['type']
-        print '   active: %s' % source['active'] 
+        print '   active: %s' % source['active']
         print '     user: %s' % source['user_id']
         print 'publisher: %s' % source['publisher_id']
         print '     jobs: %s' % len(source['jobs'])
@@ -234,7 +240,7 @@
         if (len(job['gather_errors']) > 0):
             for error in job['gather_errors']:
                 print '               %s' % error['message']
-        
+
         print ''
 
     def print_there_are(self, what, sequence, condition=''):


--- a/ckanext/harvest/controllers/view.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/controllers/view.py	Fri May 20 13:50:15 2011 +0100
@@ -1,21 +1,21 @@
-import urllib2
-
 from pylons.i18n import _
 
 import ckan.lib.helpers as h, json
 from ckan.lib.base import BaseController, c, g, request, \
                           response, session, render, config, abort, redirect
 
-from ckan.model import Package
-
-from ckanext.harvest.lib import *
+from ckan.lib.navl.dictization_functions import DataError
+from ckan.logic import NotFound, ValidationError
+from ckanext.harvest.logic.schema import harvest_source_form_schema
+from ckanext.harvest.lib import create_harvest_source, edit_harvest_source, \
+                                get_harvest_source, get_harvest_sources, \
+                                create_harvest_job, get_registered_harvesters_info
+
+import logging
+log = logging.getLogger(__name__)
 
 class ViewController(BaseController):
 
-    api_url = config.get('ckan.api_url', 'http://localhost:5000').rstrip('/')+'/api/2/rest'
-    form_api_url = config.get('ckan.api_url', 'http://localhost:5000').rstrip('/')+'/api/2/form'
-    api_key = config.get('ckan.harvest.api_key')
-
     def __before__(self, action, **env):
         super(ViewController, self).__before__(action, **env)
         # All calls to this controller must be with a sysadmin key
@@ -24,138 +24,122 @@
             status = 401
             abort(status, response_msg)
 
-    def _do_request(self,url,data = None):
-
-        http_request = urllib2.Request(
-            url = url,
-            headers = {'Authorization' : self.api_key}
-        )
-
-        if data:
-            http_request.add_data(data)
-
-        try:
-            return urllib2.urlopen(http_request)
-        except urllib2.HTTPError as e:
-            raise
-
     def index(self):
         # Request all harvest sources
         c.sources = get_harvest_sources()
 
-        return render('ckanext/harvest/index.html')
+        return render('index.html')
 
-    def create(self):
+    def new(self,data = None,errors = None, error_summary = None):
 
-        # This is the DGU form API, so we don't use self.api_url
-        form_url = self.form_api_url + '/harvestsource/create'
-        if request.method == 'GET':
+        if ('save' in request.params) and not data:
+            return self._save_new()
+
+        data = data or {}
+        errors = errors or {}
+        error_summary = error_summary or {}
+        #TODO: Use new description interface to build the types select and descriptions
+        vars = {'data': data, 'errors': errors, 'error_summary': error_summary, 'harvesters': get_registered_harvesters_info()}
+
+        c.form = render('source/new_source_form.html', extra_vars=vars)
+        return render('source/new.html')
+
+    def _save_new(self):
+        try:
+            data_dict = dict(request.params)
+            self._check_data_dict(data_dict)
+
+            source = create_harvest_source(data_dict)
+
+            # Create a harvest job for the new source
+            create_harvest_job(source['id'])
+
+            h.flash_success(_('New harvest source added successfully. '
+                    'A new harvest job for the source has also been created.'))
+            redirect(h.url_for('harvest'))
+        except DataError,e:
+            abort(400, _('Integrity Error'))
+        except ValidationError,e:
+            errors = e.error_dict
+            error_summary = e.error_summary if hasattr(e,'error_summary') else None
+            return self.new(data_dict, errors, error_summary)
+
+    def edit(self, id, data = None,errors = None, error_summary = None):
+
+        if ('save' in request.params) and not data:
+            return self._save_edit(id)
+
+        if not data:
             try:
-                # Request the fields
-                c.form = self._do_request(form_url).read()
-                c.mode = 'create'
-            except urllib2.HTTPError as e:
-                msg = 'An error occurred: [%s %s]' % (str(e.getcode()),e.msg)
-                h.flash_error(msg)
-            return render('ckanext/harvest/create.html')
-        if request.method == 'POST':
-            # Build an object like the one expected by the DGU form API
-            data = {
-                'form_data':
-                    {'HarvestSource--url': request.POST['HarvestSource--url'],
-                     'HarvestSource--description': request.POST['HarvestSource--description'],
-                     'HarvestSource--type': request.POST['HarvestSource--type'],
-                    },
-                'user_id':'',
-                'publisher_id':''
-            }
-            data = json.dumps(data)
-            try:
-                rq = self._do_request(form_url,data)
+                old_data = get_harvest_source(id)
+            except NotFound:
+                abort(404, _('Harvest Source not found'))
 
-                h.flash_success('Harvesting source added successfully')
-                redirect(h.url_for('harvest'))
+        data = data or old_data
+        errors = errors or {}
+        error_summary = error_summary or {}
+        #TODO: Use new description interface to build the types select and descriptions
+        vars = {'data': data, 'errors': errors, 'error_summary': error_summary, 'harvesters': get_registered_harvesters_info()}
+
+        c.form = render('source/new_source_form.html', extra_vars=vars)
+        return render('source/edit.html')
 
-            except urllib2.HTTPError as e:
-                msg = 'An error occurred: [%s %s]' % (str(e.getcode()),e.msg)
-                # The form API returns just a 500, so we are not exactly sure of what
-                # happened, but most probably it was a duplicate entry
-                if e.getcode() == 500:
-                    msg = msg + ' Does the source already exist?'
-                elif e.getcode() == 400:
-                    err_msg = e.read()
-                    if '<form' in c.form:
-                        c.form = err_msg
-                        c.mode = 'create'
-                        return render('ckanext/harvest/create.html')
-                    else:
-                        msg = err_msg
+    def _save_edit(self,id):
+        try:
+            data_dict = dict(request.params)
+            self._check_data_dict(data_dict)
 
-                h.flash_error(msg)
-                redirect(h.url_for('harvest'))
+            source = edit_harvest_source(id,data_dict)
 
-    def show(self,id):
+            h.flash_success(_('Harvest source edited successfully.'))
+            redirect(h.url_for('harvest'))
+        except DataError,e:
+            abort(400, _('Integrity Error'))
+        except NotFound, e:
+            abort(404, _('Harvest Source not found'))
+        except ValidationError,e:
+            errors = e.error_dict
+            error_summary = e.error_summary if hasattr(e,'error_summary') else None
+            return self.edit(id,data_dict, errors, error_summary)
+
+    def _check_data_dict(self, data_dict):
+        '''Check if the return data is correct'''
+        surplus_keys_schema = ['id','publisher_id','user_id','active','save']
+
+        schema_keys = harvest_source_form_schema().keys()
+        keys_in_schema = set(schema_keys) - set(surplus_keys_schema)
+
+        if keys_in_schema - set(data_dict.keys()):
+            log.info(_('Incorrect form fields posted'))
+            raise DataError(data_dict)
+
+    def read(self,id):
         try:
             c.source = get_harvest_source(id)
 
-            return render('ckanext/harvest/show.html')
-        except:
-            abort(404,'Harvest source not found')
+            return render('source/read.html')
+        except NotFound:
+            abort(404,_('Harvest source not found'))
 
 
     def delete(self,id):
         try:
             delete_harvest_source(id)
-            h.flash_success('Harvesting source deleted successfully')
-        except Exception as e:
+
+            h.flash_success(_('Harvesting source deleted successfully'))
+            redirect(h.url_for('harvest'))
+        except NotFound:
+            abort(404,_('Harvest source not found'))
+
+
+    def create_harvesting_job(self,id):
+        try:
+            create_harvest_job(id)
+            h.flash_success(_('Refresh requested, harvesting will take place within 15 minutes.'))
+        except NotFound:
+            abort(404,_('Harvest source not found'))
+        except Exception, e:
             msg = 'An error occurred: [%s]' % e.message
             h.flash_error(msg)
 
         redirect(h.url_for('harvest'))
-
-    def edit(self,id):
-
-        form_url = self.form_api_url + '/harvestsource/edit/%s' % id
-        if request.method == 'GET':
-            # Request the fields
-            c.form = self._do_request(form_url).read()
-            c.mode = 'edit'
-
-            return render('ckanext/harvest/create.html')
-        if request.method == 'POST':
-            # Build an object like the one expected by the DGU form API
-            data = {
-                'form_data':
-                    {'HarvestSource-%s-url' % id: request.POST['HarvestSource-%s-url' % id] ,
-                     'HarvestSource-%s-type' % id: request.POST['HarvestSource-%s-type' % id],
-                     'HarvestSource-%s-description' % id: request.POST['HarvestSource-%s-description' % id]},
-                'user_id':'',
-                'publisher_id':''
-            }
-            data = json.dumps(data)
-            try:
-                r = self._do_request(form_url,data)
-
-                h.flash_success('Harvesting source edited successfully')
-
-                redirect(h.url_for('harvest'))
-            except urllib2.HTTPError as e:
-                if e.getcode() == 400:
-                    c.form = e.read()
-                    c.mode = 'edit'
-                    return render('ckanext/harvest/create.html')
-                else:
-                    msg = 'An error occurred: [%s %s]' % (str(e.getcode()),e.msg)
-                    h.flash_error(msg)
-                    redirect(h.url_for('harvest'))
-
-    def create_harvesting_job(self,id):
-        try:
-            create_harvest_job(id)
-            h.flash_success('Refresh requested, harvesting will take place within 15 minutes.')
-        except Exception as e:
-            msg = 'An error occurred: [%s]' % e.message
-            h.flash_error(msg)
-
-        redirect(h.url_for('harvest'))
-
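
The field check added in `_check_data_dict` comes down to a set difference. A minimal standalone sketch of that logic follows, assuming a stand-in `DataError` class and a hypothetical list of schema keys; the real keys come from `harvest_source_form_schema()`, which is not shown in this changeset excerpt.

```python
class DataError(Exception):
    """Stand-in for ckan.lib.navl.dictization_functions.DataError."""


# Keys that the schema defines but the form is not expected to post.
SURPLUS_KEYS = ['id', 'publisher_id', 'user_id', 'active', 'save']


def check_data_dict(data_dict, schema_keys):
    # Fields the form must send = schema keys minus the surplus ones.
    required = set(schema_keys) - set(SURPLUS_KEYS)
    missing = required - set(data_dict.keys())
    if missing:
        raise DataError('Missing form fields: %s' % ', '.join(sorted(missing)))


# Hypothetical schema keys, chosen only for the example.
EXAMPLE_SCHEMA_KEYS = ['id', 'url', 'type', 'description', 'active']

# A complete submission passes silently.
check_data_dict({'url': 'http://example.com', 'type': 'ckan',
                 'description': '', 'save': 'Save'}, EXAMPLE_SCHEMA_KEYS)

# An incomplete submission is rejected.
try:
    check_data_dict({'url': 'http://example.com'}, EXAMPLE_SCHEMA_KEYS)
    rejected = False
except DataError as e:
    rejected = True
    message = str(e)
```

Note that extra keys in the posted data are tolerated; only missing ones trigger the error, which matches the direction of the set difference in the controller.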


--- a/ckanext/harvest/harvesters.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/harvesters.py	Fri May 20 13:50:15 2011 +0100
@@ -78,8 +78,12 @@
         err.save()
         log.error(message)
 
-    def get_type(self):
-        return 'CKAN'
+    def info(self):
+        return {
+            'name': 'ckan',
+            'title': 'CKAN',
+            'description': 'Harvests remote CKAN instances'
+        }
 
     def gather_stage(self,harvest_job):
         log.debug('In CKANHarvester gather_stage (%s)' % harvest_job.source.url)
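
With `get_type()` replaced by `info()`, callers match a harvest source against `info()['name']`. The sketch below shows that lookup and the filtering of harvesters that do not report a name, using plain classes instead of CKAN's plugin machinery; `CkanLike`, `Nameless`, `registered_info` and `find_harvester` are hypothetical names for illustration.

```python
class CkanLike(object):
    def info(self):
        return {'name': 'ckan', 'title': 'CKAN',
                'description': 'Harvests remote CKAN instances'}


class Nameless(object):
    def info(self):
        # Deliberately incomplete: no machine-readable name.
        return {'title': 'Broken'}


def registered_info(harvesters):
    # Keep only descriptors that carry the machine-readable name.
    result = []
    for harvester in harvesters:
        info = harvester.info()
        if info and 'name' in info:
            result.append(info)
    return result


def find_harvester(harvesters, source_type):
    # Return the first harvester whose name matches the source type.
    for harvester in harvesters:
        info = harvester.info()
        if info and info.get('name') == source_type:
            return harvester
    return None


plugins = [CkanLike(), Nameless()]
titles = [info['title'] for info in registered_info(plugins)]
match = find_harvester(plugins, 'ckan')
```

Because the name is what gets stored with the source, renaming a harvester would presumably orphan existing sources of that type.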


--- a/ckanext/harvest/interfaces.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/interfaces.py	Fri May 20 13:50:15 2011 +0100
@@ -6,17 +6,32 @@
 
     '''
 
-    def get_type(self):
+    def info(self):
         '''
-        Plugins must provide this method, which will return a string with the
-        Harvester type implemented by the plugin (e.g ``CSW``,``INSPIRE``, etc).
-        This will ensure that they only receive Harvest Jobs and Objects
-        relevant to them.
+        Harvesting implementations must provide this method, which will return a
+        dictionary containing different descriptors of the harvester. The
+        returned dictionary should contain:
 
-        returns: A string with the harvester type
+        * name: machine-readable name. This will be the value stored in the
+          database, and the one used by ckanext-harvest to call the appropriate
+          harvester.
+        * title: human-readable name. This will appear in the form's select box
+          in the WUI.
+        * description: a small description of what the harvester does. This will
+          appear on the form as a guidance to the user.
+
+        A complete example may be::
+
+            {
+                'name': 'csw',
+                'title': 'CSW Server',
+                'description': "A server that implements OGC's Catalog Service "
+                               "for the Web (CSW) standard"
+            }
+
+        returns: A dictionary with the harvester descriptors
         '''
 
-
     def gather_stage(self, harvest_job):
         '''
         The gather stage will recieve a HarvestJob object and will be
@@ -55,7 +70,7 @@
         '''
         The import stage will receive a HarvestObject object and will be
         responsible for:
-            - performing any necessary action with the fetched object (e.g 
+            - performing any necessary action with the fetched object (e.g
               create a CKAN package).
               Note: if this stage creates or updates a package, a reference
               to the package should be added to the HarvestObject.


--- a/ckanext/harvest/lib/__init__.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/lib/__init__.py	Fri May 20 13:50:15 2011 +0100
@@ -1,14 +1,22 @@
 import urlparse
+import re
+
 from sqlalchemy import distinct,func
 from ckan.model import Session, repo
 from ckan.model import Package
+from ckan.lib.navl.dictization_functions import validate
+from ckan.logic import NotFound, ValidationError
+
+from ckanext.harvest.logic.schema import harvest_source_form_schema
+
 from ckan.plugins import PluginImplementations
 from ckanext.harvest.model import HarvestSource, HarvestJob, HarvestObject, \
                                   HarvestGatherError, HarvestObjectError
 from ckanext.harvest.queue import get_gather_publisher
 from ckanext.harvest.interfaces import IHarvester
 
-log = __import__("logging").getLogger(__name__)
+import logging
+log = logging.getLogger('ckanext')
 
 
 def _get_source_status(source):
@@ -40,8 +48,8 @@
     if  last_job:
         #TODO: Should we encode the dates as strings?
         out['last_harvest_request'] = str(last_job.gather_finished)
-        
-       
+
+
         #Get HarvestObjects from last job whit links to packages
         last_objects = [obj for obj in last_job.objects if obj.package is not None]
 
@@ -68,8 +76,8 @@
         # We have the gathering errors in last_job.gather_errors, so let's also
         # get also the object errors.
         object_errors = Session.query(HarvestObjectError).join(HarvestObject) \
-                            .filter(HarvestObject.job==last_job).all()        
-        
+                            .filter(HarvestObject.job==last_job).all()
+
         out['last_harvest_statistics']['errors'] = len(last_job.gather_errors) \
                                             + len(object_errors)
         for gather_error in last_job.gather_errors:
@@ -79,7 +87,7 @@
             msg = 'GUID %s: %s' % (object_error.object.guid,object_error.message)
             out['last_harvest_errors'].append(msg)
 
-        
+
 
         # Overall statistics
         packages = Session.query(distinct(HarvestObject.package_id),Package.name) \
@@ -112,7 +120,7 @@
 
     for job in source.jobs:
         out['jobs'].append(job.as_dict())
-    
+
     out['status'] = _get_source_status(source)
 
 
@@ -171,7 +179,7 @@
             netloc = ':'.join(parts)
     else:
         netloc = o.netloc
-    
+
     # Remove trailing slash
     path = o.path.rstrip('/')
 
@@ -183,61 +191,83 @@
 
     return check_url
 
-def get_harvest_source(id,default=Exception,attr=None):
-    source = HarvestSource.get(id,default=default,attr=attr)
-    if source:
-        return _source_as_dict(source)
-    else:
-        return default
+def _prettify(field_name):
+    field_name = re.sub(r'(?<!\w)[Uu]rl(?!\w)', 'URL', field_name.replace('_', ' ').capitalize())
+    return field_name.replace('_', ' ')
 
-def get_harvest_sources(**kwds):
-    sources = HarvestSource.filter(**kwds).all()
-    return [_source_as_dict(source) for source in sources]
+def _error_summary(error_dict):
+    error_summary = {}
+    for key, error in error_dict.iteritems():
+        error_summary[_prettify(key)] = error[0]
+    return error_summary
 
-def create_harvest_source(source_dict):
-    if not 'url' in source_dict or not source_dict['url'] or \
-        not 'type' in source_dict or not source_dict['type']:
-        raise Exception('Missing mandatory properties: url, type')
+def get_harvest_source(id,attr=None):
+    source = HarvestSource.get(id,attr=attr)
 
-    # Check if source already exists
-    existing_source = _url_exists(source_dict['url'])
-    if existing_source:
-        raise Exception('There already is an active Harvest Source for this URL: %s' % source_dict['url'])
-
-    source = HarvestSource()
-    source.url = source_dict['url']
-    source.type = source_dict['type']
-    opt = ['active','description','user_id','publisher_id']
-    for o in opt:
-        if o in source_dict and source_dict[o] is not None:
-            source.__setattr__(o,source_dict[o])
-
-    source.save()
-
+    if not source:
+        raise NotFound
 
     return _source_as_dict(source)
 
-def edit_harvest_source(source_id,source_dict):
-    try:
-        source = HarvestSource.get(source_id)
-    except:
-        raise Exception('Source %s does not exist' % source_id)
-    fields = ['url','type','active','description','user_id','publisher_id']
-    for f in fields:
-        if f in source_dict and source_dict[f] is not None and source_dict[f] != '':
-            source.__setattr__(f,source_dict[f])
+def get_harvest_sources(**kwds):
+    sources = HarvestSource.filter(**kwds) \
+                .order_by(HarvestSource.created.desc()) \
+                .all()
+    return [_source_as_dict(source) for source in sources]
+
+def create_harvest_source(data_dict):
+
+    schema = harvest_source_form_schema()
+    data, errors = validate(data_dict, schema)
+
+    if errors:
+        Session.rollback()
+        raise ValidationError(errors,_error_summary(errors))
+
+    source = HarvestSource()
+    source.url = data['url']
+    source.type = data['type']
+
+    opt = ['active','description','user_id','publisher_id']
+    for o in opt:
+        if o in data and data[o] is not None:
+            source.__setattr__(o,data[o])
 
     source.save()
 
     return _source_as_dict(source)
 
+def edit_harvest_source(source_id,data_dict):
+    schema = harvest_source_form_schema()
+
+    source = HarvestSource.get(source_id)
+    if not source:
+        raise NotFound('Harvest source %s does not exist' % source_id)
+
+    # Add source id to the dict, as some validators will need it
+    data_dict["id"] = source.id
+
+    data, errors = validate(data_dict, schema)
+    if errors:
+        Session.rollback()
+        raise ValidationError(errors,_error_summary(errors))
+
+    fields = ['url','type','active','description','user_id','publisher_id']
+    for f in fields:
+        if f in data_dict and data_dict[f] is not None and data_dict[f] != '':
+            source.__setattr__(f,data_dict[f])
+
+    source.save()
+
+    return _source_as_dict(source)
+
 
 def remove_harvest_source(source_id):
-    try:
-        source = HarvestSource.get(source_id)
-    except:
-        raise Exception('Source %s does not exist' % source_id)
-    
+
+    source = HarvestSource.get(source_id)
+    if not source:
+        raise NotFound('Harvest source %s does not exist' % source_id)
+
     # Don't actually delete the record, just flag it as inactive
     source.active = False
     source.save()
@@ -251,12 +281,12 @@
 
     return True
 
-def get_harvest_job(id,default=Exception,attr=None):
-    job = HarvestJob.get(id,default=default,attr=attr)
-    if job:
-        return _job_as_dict(job)
-    else:
-        return default
+def get_harvest_job(id,attr=None):
+    job = HarvestJob.get(id,attr=attr)
+    if not job:
+        raise NotFound
+
+    return _job_as_dict(job)
 
 def get_harvest_jobs(**kwds):
     jobs = HarvestJob.filter(**kwds).all()
@@ -264,11 +294,9 @@
 
 def create_harvest_job(source_id):
     # Check if source exists
-    try:
-        #We'll need the actual HarvestSource
-        source = HarvestSource.get(source_id)
-    except:
-        raise Exception('Source %s does not exist' % source_id)
+    source = HarvestSource.get(source_id)
+    if not source:
+        raise NotFound('Harvest source %s does not exist' % source_id)
 
     # Check if the source is active
     if not source.active:
@@ -291,7 +319,7 @@
     jobs = get_harvest_jobs(status=u'New')
     if len(jobs) == 0:
         raise Exception('There are no new harvesting jobs')
-    
+
     # Send each job to the gather queue
     publisher = get_gather_publisher()
     sent_jobs = []
@@ -304,12 +332,12 @@
     publisher.close()
     return sent_jobs
 
-def get_harvest_object(id,default=Exception,attr=None):
-    obj = HarvestObject.get(id,default=default,attr=attr)
-    if obj:
-        return _object_as_dict(obj)
-    else:
-        return default
+def get_harvest_object(id,attr=None):
+    obj = HarvestObject.get(id,attr=attr)
+    if not obj:
+        raise NotFound
+
+    return _object_as_dict(obj)
 
 def get_harvest_objects(**kwds):
     objects = HarvestObject.filter(**kwds).all()
@@ -317,10 +345,10 @@
 
 def import_last_objects(source_id=None):
     if source_id:
-        try:
-            source = HarvestSource.get(source_id)
-        except:
-            raise Exception('Source %s does not exist' % source_id)
+        source = HarvestSource.get(source_id)
+        if not source:
+            raise NotFound('Harvest source %s does not exist' % source_id)
+
         last_objects = Session.query(HarvestObject) \
                 .join(HarvestJob) \
                 .filter(HarvestJob.source==source) \
@@ -344,10 +372,22 @@
         if obj.guid != last_obj_guid:
             imported_objects.append(obj)
             for harvester in PluginImplementations(IHarvester):
-                if harvester.get_type() == obj.job.source.type:
+                if harvester.info()['name'] == obj.job.source.type:
                     if hasattr(harvester,'force_import'):
                         harvester.force_import = True
                     harvester.import_stage(obj)
         last_obj_guid = obj.guid
 
     return imported_objects
+
+def get_registered_harvesters_info():
+    # TODO: Use new description interface when implemented
+    available_harvesters = []
+    for harvester in PluginImplementations(IHarvester):
+        info = harvester.info()
+        if not info or 'name' not in info:
+            log.error('Harvester %r does not provide the harvester name in the info response' % str(harvester))
+            continue
+        available_harvesters.append(info)
+
+    return available_harvesters
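
Note on the hunk above: harvesters are now matched through the 'name' key of the dictionary returned by info(), not through get_type(). The following is a minimal, self-contained sketch of that lookup, not the extension's actual code. The two harvester classes, the function names collect_harvesters_info and find_harvester, and the plain list standing in for PluginImplementations(IHarvester) are all assumptions made for illustration.

```python
import logging

log = logging.getLogger(__name__)


class CswHarvester(object):
    """Hypothetical harvester that describes itself correctly."""

    def info(self):
        return {'name': 'csw', 'title': 'CSW Server', 'description': 'A CSW endpoint'}


class BrokenHarvester(object):
    """Hypothetical harvester whose info() lacks the required 'name' key."""

    def info(self):
        return {'title': 'No name here'}


def collect_harvesters_info(harvesters):
    """Return the info dicts of every harvester that declares a name."""
    available = []
    for harvester in harvesters:
        info = harvester.info()
        if not info or 'name' not in info:
            # Skip harvesters that cannot be matched against a source type.
            log.error('Harvester %r does not provide a name in its info response', harvester)
            continue
        available.append(info)
    return available


def find_harvester(harvesters, source_type):
    """Return the first harvester whose declared name equals source_type, or None."""
    for harvester in harvesters:
        info = harvester.info() or {}
        if info.get('name') == source_type:
            return harvester
    return None
```

Using info().get('name') in the lookup, as sketched here, avoids a KeyError when a plugin omits the key; the direct info()['name'] access in the diff would raise in that case.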


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/logic/__init__.py	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,7 @@
+try:
+    import pkg_resources
+    pkg_resources.declare_namespace(__name__)
+except ImportError:
+    import pkgutil
+    __path__ = pkgutil.extend_path(__path__, __name__)
+


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/logic/schema.py	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,33 @@
+from ckan.lib.navl.validators import (ignore_missing,
+                                      not_empty,
+                                      empty,
+                                      ignore,
+                                      not_missing
+                                     )
+
+from ckanext.harvest.logic.validators import harvest_source_id_exists, \
+                                            harvest_source_url_validator, \
+                                            harvest_source_type_exists
+
+def default_harvest_source_schema():
+
+    schema = {
+        'id': [ignore_missing, unicode, harvest_source_id_exists],
+        'url': [not_empty, unicode, harvest_source_url_validator],
+        'type': [not_empty, unicode, harvest_source_type_exists],
+        'description': [ignore_missing],
+        'active': [ignore_missing],
+        'user_id': [ignore_missing],
+        'publisher_id': [ignore_missing],
+        #'config'
+    }
+
+    return schema
+
+
+def harvest_source_form_schema():
+
+    schema = default_harvest_source_schema()
+    schema['save'] = [ignore]
+
+    return schema


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/logic/validators.py	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,79 @@
+import urlparse
+
+from ckan.lib.navl.dictization_functions import Invalid, missing
+from ckan.model import Session
+from ckan.plugins import PluginImplementations
+
+from ckanext.harvest.model import HarvestSource
+from ckanext.harvest.interfaces import IHarvester
+ 
+
+#TODO: use context?
+
+def harvest_source_id_exists(value, context):
+    
+    result = HarvestSource.get(value,None)
+
+    if not result:
+        raise Invalid('Harvest Source with id %r does not exist.' % str(value))
+    return value
+
+def _normalize_url(url):
+    o = urlparse.urlparse(url)
+
+    # Normalize port
+    if ':' in o.netloc:
+        parts = o.netloc.split(':')
+        if (o.scheme == 'http' and parts[1] == '80') or \
+           (o.scheme == 'https' and parts[1] == '443'):
+            netloc = parts[0]
+        else:
+            netloc = ':'.join(parts)
+    else:
+        netloc = o.netloc
+    
+    # Remove trailing slash
+    path = o.path.rstrip('/')
+
+    check_url = urlparse.urlunparse((
+            o.scheme,
+            netloc,
+            path,
+            None,None,None))
+
+    return check_url
+
+def harvest_source_url_validator(key,data,errors,context):
+    new_url = _normalize_url(data[key])
+    source_id = data.get(('id',),'')
+    if source_id:
+        # When editing a source we need to avoid its own URL
+        existing_sources = Session.query(HarvestSource.url,HarvestSource.active) \
+                       .filter(HarvestSource.id!=source_id).all()
+    else:
+        existing_sources = Session.query(HarvestSource.url,HarvestSource.active).all()
+
+    for url,active in existing_sources:
+        url = _normalize_url(url)
+        if url == new_url and active == True:
+            raise Invalid('There already is an active Harvest Source for this URL: %s' % data[key])
+
+    return data[key] 
+
+def harvest_source_type_exists(value,context):
+    #TODO: use new description interface
+
+    # Get all the registered harvester types
+    available_types = []
+    for harvester in PluginImplementations(IHarvester):
+        info = harvester.info()
+        if not info or 'name' not in info:
+            log.error('Harvester %r does not provide the harvester name in the info response' % str(harvester))
+            continue
+        available_types.append(info['name'])
+
+
+    if not value in available_types:
+        raise Invalid('Unknown harvester type: %s. Have you registered a harvester for this type?' % value)
+    
+    return value
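
Note on the URL validator above: two source URLs are treated as the same when they match after dropping the default port, any trailing slash, and the params, query and fragment parts. The sketch below restates that comparison for Python 3 (urllib.parse in place of the Python 2 urlparse module used in the diff). The names normalize_url, is_duplicate and DEFAULT_PORTS are illustrative and do not appear in the changeset, and the list of (url, active) pairs stands in for the database query.

```python
from urllib.parse import urlparse, urlunparse

# Default ports that can be dropped without changing what the URL points to.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def normalize_url(url):
    """Return the URL reduced to scheme, host[:port] and path."""
    parts = urlparse(url)

    netloc = parts.netloc
    # Drop the port only when it is the default one for the scheme.
    if parts.port is not None and DEFAULT_PORTS.get(parts.scheme) == parts.port:
        netloc = netloc.rsplit(':', 1)[0]

    # Remove trailing slashes; params, query and fragment are discarded.
    path = parts.path.rstrip('/')

    return urlunparse((parts.scheme, netloc, path, '', '', ''))


def is_duplicate(new_url, existing_sources):
    """Check new_url against an iterable of (url, active) pairs.

    Only active sources count as duplicates, as in the validator above.
    """
    target = normalize_url(new_url)
    return any(active and normalize_url(url) == target
               for url, active in existing_sources)
```

Under these assumptions, 'http://example.com:80/csw/' and 'http://example.com/csw' normalize to the same string, while a non-default port such as 8080 is kept.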


--- a/ckanext/harvest/model/__init__.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/model/__init__.py	Fri May 20 13:50:15 2011 +0100
@@ -31,7 +31,7 @@
     key_attr = 'id'
 
     @classmethod
-    def get(self, key, default=Exception, attr=None):
+    def get(self, key, default=None, attr=None):
         '''Finds a single entity in the register.'''
         if attr == None:
             attr = self.key_attr
@@ -39,10 +39,8 @@
         o = self.filter(**kwds).first()
         if o:
             return o
-        if default != Exception:
+        else:
             return default
-        else:
-            raise Exception('%s not found: %s' % (self.__name__, key))
 
     @classmethod
     def filter(self, **kwds):
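
Note on the hunk above: get() no longer raises when nothing matches; it returns the default (None unless the caller passes something else), and the callers in this changeset raise NotFound themselves. A minimal sketch of that contract follows, assuming a hypothetical in-memory Register class in place of the SQLAlchemy-backed model and a locally defined NotFound in place of the exception used in the diff.

```python
class NotFound(Exception):
    """Hypothetical stand-in for the NotFound exception raised in the diff."""


class Register(object):
    # Hypothetical in-memory store replacing the database-backed model.
    _items = {}

    @classmethod
    def get(cls, key, default=None):
        # A missing key yields `default`; the lookup itself never raises.
        return cls._items.get(key, default)


def get_source(source_id):
    """Caller-side check: turn a missing record into an explicit NotFound."""
    source = Register.get(source_id)
    if not source:
        raise NotFound('Harvest source %s does not exist' % source_id)
    return source
```

Keeping the lookup quiet and raising in the caller removes the bare try/except blocks seen in the removed lines, which would also have hidden unrelated errors such as database failures.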


--- a/ckanext/harvest/plugin.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/plugin.py	Fri May 20 13:50:15 2011 +0100
@@ -22,35 +22,17 @@
         pass
 
     def before_map(self, map):
-        map.connect('harvest', '/harvest',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            action='index')
-            
-        map.connect('harvest_create_form', '/harvest/create',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            conditions=dict(method=['GET']),
-            action='create')
 
-        map.connect('harvest_create', '/harvest/create',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            conditions=dict(method=['POST']),
-            action='create')
+        controller = 'ckanext.harvest.controllers.view:ViewController'
+        map.connect('harvest', '/harvest',controller=controller,action='index')
 
-        map.connect('harvest_show', '/harvest/:id',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            action='show')
+        map.connect('/harvest/new', controller=controller, action='new')
+        map.connect('/harvest/edit/:id', controller=controller, action='edit') 
+        map.connect('/harvest/delete/:id',controller=controller, action='delete')
+        map.connect('/harvest/:id', controller=controller, action='read')
 
-        map.connect('harvest_edit', '/harvest/:id/edit',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            action='edit')
-
-        map.connect('harvest_delete', '/harvest/:id/delete',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            action='delete')
-
-        map.connect('harvesting_job_create', '/harvest/:id/refresh',
-            controller='ckanext.harvest.controllers.view:ViewController',
-            action='create_harvesting_job')
+        map.connect('harvesting_job_create', '/harvest/refresh/:id',controller=controller, 
+                action='create_harvesting_job')
       
         return map
 


Binary file ckanext/harvest/public/ckanext/harvest/images/icons/source_delete.png has changed


Binary file ckanext/harvest/public/ckanext/harvest/images/icons/source_edit.png has changed


Binary file ckanext/harvest/public/ckanext/harvest/images/icons/source_new.png has changed


Binary file ckanext/harvest/public/ckanext/harvest/images/icons/source_refresh.png has changed


Binary file ckanext/harvest/public/ckanext/harvest/images/icons/source_view.png has changed


--- a/ckanext/harvest/public/ckanext/harvest/style.css	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/public/ckanext/harvest/style.css	Fri May 20 13:50:15 2011 +0100
@@ -1,4 +1,15 @@
 /* Harvest styles */
 #new-harvest-source {
+	background: transparent url("images/icons/source_new.png") no-repeat 0px 0px;
+    padding-left: 20px;
+    margin-bottom: 10px;
     font-weight: bold;
 }
+
+#harvest-sources th.action{
+    font-style: italic;
+}
+
+.harvester-title{
+    font-weight: bold;
+}


--- a/ckanext/harvest/queue.py	Tue May 17 17:26:42 2011 +0100
+++ b/ckanext/harvest/queue.py	Fri May 20 13:50:15 2011 +0100
@@ -11,7 +11,7 @@
 from ckanext.harvest.model import HarvestJob, HarvestObject,HarvestGatherError
 from ckanext.harvest.interfaces import IHarvester
 
-log = logging.getLogger(__name__)
+log = logging.getLogger('ckanext')
 
 __all__ = ['get_gather_publisher', 'get_gather_consumer', \
            'get_fetch_publisher', 'get_fetch_consumer']
@@ -77,7 +77,7 @@
             # matches
             harvester_found = False
             for harvester in PluginImplementations(IHarvester):
-                if harvester.get_type() == job.source.type:
+                if harvester.info()['name'] == job.source.type:
                     harvester_found = True
                     # Get a list of harvest object ids from the plugin
                     job.gather_started = datetime.datetime.now()
@@ -123,7 +123,7 @@
             # the Harvester interface, only if the source type
             # matches
             for harvester in PluginImplementations(IHarvester):
-                if harvester.get_type() == obj.source.type:
+                if harvester.info()['name'] == obj.source.type:
 
                     # See if the plugin can fetch the harvest object
                     obj.fetch_started = datetime.datetime.now()


--- a/ckanext/harvest/templates/ckanext/harvest/create.html	Tue May 17 17:26:42 2011 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,28 +0,0 @@
-<?python
-    if c.mode == 'create':
-        title = 'Add harvesting source'
-    else:
-        title = 'Edit harvesting source'
-?>
-<html xmlns:py="http://genshi.edgewall.org/"
-  xmlns:i18n="http://genshi.edgewall.org/i18n"
-  xmlns:xi="http://www.w3.org/2001/XInclude"
-  py:strip="">
-
-  <py:def function="page_title">${title}</py:def>
-
-  <py:def function="optional_head">
-    <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
-  </py:def>
-  
-<div py:match="content">
-    <div class="harvest-content">
-        <h1>${title}</h1>
-        <form action="${c.mode}" method="POST">
-        ${Markup(c.form)}
-            <input id="save" name="save" value="Save" type="submit" />
-        </form>
-    </div>
-</div>
-<xi:include href="../../layout.html" />
-</html>


--- a/ckanext/harvest/templates/ckanext/harvest/index.html	Tue May 17 17:26:42 2011 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,63 +0,0 @@
-<html xmlns:py="http://genshi.edgewall.org/"
-  xmlns:i18n="http://genshi.edgewall.org/i18n"
-  xmlns:xi="http://www.w3.org/2001/XInclude"
-  py:strip="">
-
-  <py:def function="page_title">Harvesting Sources</py:def>
-
-  <py:def function="optional_head">
-    <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
-  </py:def>
-
-<div py:match="content">
-  <div class="harvest-content">
-  <h1>Harvesting Sources</h1>
-    <a id="new-harvest-source" href="harvest/create">Add a harvesting source</a>
-    <py:choose>
-        <py:when test="c.sources">
-
-
-       <table id="harvest-sources">
-        <tr>
-            <th></th>
-            <th></th>
-            <th></th>
-            <th>URL</th>
-            <th>Type</th>
-            <th>Active</th>
-            <th>Statistics</th>
-            <th>Next Harvest</th>
-            <th>Created</th>
-        </tr>
-
-        <tr py:for="source in c.sources">
-            <td>${h.link_to('view', 'harvest/' + source.id)}</td>
-            <td>${h.link_to('edit', 'harvest/' + source.id + '/edit')}</td>
-            <td>${h.link_to('refresh', 'harvest/' + source.id + '/refresh')}</td>
-            <td>${source.url}</td>
-            <td>${source.type}</td>
-            <td>${source.active}</td>
-            <py:choose>
-                <py:when test="'msg' in source.status">
-                    <td>${source.status.msg}</td>
-                    <td>${source.status.msg}</td>
-                </py:when>
-                <py:otherwise>
-                    <td>${source.status.overall_statistics.added} pkgs ${source.status.overall_statistics.errors} errors</td>
-                    <td>${source.status.next_harvest}</td>
-                </py:otherwise>
-            </py:choose>
-
-            <td>${source.created}</td>
-         </tr>
-    </table>
-   </py:when>
-    <py:otherwise>
-        <div id="no-harvest-sources">No harvest sources defined yet.</div>
-    </py:otherwise>
-  </py:choose>
- 
-  </div>
-</div>
-<xi:include href="../../layout.html" />
-</html>


--- a/ckanext/harvest/templates/ckanext/harvest/show.html	Tue May 17 17:26:42 2011 +0100
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,93 +0,0 @@
-<html xmlns:py="http://genshi.edgewall.org/"
-  xmlns:i18n="http://genshi.edgewall.org/i18n"
-  xmlns:xi="http://www.w3.org/2001/XInclude"
-  py:strip="">
-
-  <py:def function="page_title">Harvest Source Details</py:def>
-
-  <py:def function="optional_head">
-    <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
-  </py:def>
-  
-<div py:match="content">
-  <div class="harvest-content">
-  <py:if test="c.source">
-  <h1>Harvest Source Details</h1>
-    <table id="harvest-source-details">
-        <tr>
-            <th>ID</th>
-            <td>${c.source.id}</td>
-        </tr>
-        <tr>
-            <th>URL</th>
-            <td>${c.source.url}</td>
-        </tr>
-        <tr>
-            <th>Type</th>
-            <td>${c.source.type}</td>
-        </tr>
-        <tr>
-            <th>Active</th>
-            <td>${c.source.active}</td>
-        </tr>
-        <tr>
-            <th>Description</th>
-            <td>${c.source.description}</td>
-        </tr>
-        <tr>
-            <th>User</th>
-            <td>${c.source.user_id}</td>
-        </tr>
-        <tr>
-            <th>Publisher</th>
-            <td>${c.source.publisher_id}</td>
-        </tr>
-        <tr>
-            <th>Created</th>
-            <td>${c.source.created}</td>
-        </tr>
-        <tr>
-            <th>Total jobs</th>
-            <td>${len(c.source.jobs)}</td>
-        </tr>
-        <tr>
-            <th>Status</th>
-            <td>
-                Last Harvest Errors: ${c.source.status.last_harvest_statistics.errors}<br/> 
-                <py:choose>
-                    <py:when test="len(c.source.status.last_harvest_errors)>0">
-                        <ul>
-                        <li py:for="error in c.source.status.last_harvest_errors">${error}</li>
-                        </ul>
-                    </py:when>
-                </py:choose>
-                Last Harvest Added: ${c.source.status.last_harvest_statistics.added}<br/>
-                Last Harvest Updated: ${c.source.status.last_harvest_statistics.updated}<br/>
-                Last Harvest: ${c.source.status.last_harvest_request} <br/>
-                Next Harvest: ${c.source.status.next_harvest}
-            </td>
-        </tr>
-        <tr>
-            <th>Total Errors</th>
-            <td>${c.source.status.overall_statistics.errors}</td>
-        </tr>
-        <tr>
-            <th>Total Packages</th>
-            <td>${c.source.status.overall_statistics.added}</td>
-        </tr>
-        <tr>
-            <th>Packages</th>
-            <td>
-                <div>There could be a 10 minutes delay before these packages (or changes to them) appear on 
-                the site or on search results.</div>
-                <div py:for="package in c.source.status.packages">
-                    <a href="/package/${package}">${package}</a>
-                </div>
-            </td>
-        </tr>
-    </table>
-    </py:if>
-  </div>
-</div>
-<xi:include href="../../layout.html" />
-</html>


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/templates/index.html	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,63 @@
+<html xmlns:py="http://genshi.edgewall.org/"
+  xmlns:i18n="http://genshi.edgewall.org/i18n"
+  xmlns:xi="http://www.w3.org/2001/XInclude"
+  py:strip="">
+
+  <py:def function="page_title">Harvesting Sources</py:def>
+
+  <py:def function="optional_head">
+    <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
+  </py:def>
+
+<div py:match="content">
+  <div class="harvest-content">
+  <h1>Harvesting Sources</h1>
+    <div id="new-harvest-source"><a href="harvest/new">Add a harvesting source</a></div>
+    <py:choose>
+        <py:when test="c.sources">
+
+
+       <table id="harvest-sources">
+        <tr>
+            <th class="action">View</th>
+            <th class="action">Edit</th>
+            <th class="action">Refresh</th>
+            <th>URL</th>
+            <th>Type</th>
+            <th>Active</th>
+            <th>Statistics</th>
+            <th>Next Harvest</th>
+            <th>Created</th>
+        </tr>
+
+        <tr py:for="source in c.sources">
+            <td><a href="harvest/${source.id}"><img src="ckanext/harvest/images/icons/source_view.png" alt="View" title="View" /></a></td>
+            <td><a href="harvest/edit/${source.id}"><img src="ckanext/harvest/images/icons/source_edit.png" alt="Edit" title="Edit" /></a></td>
+            <td><a href="harvest/refresh/${source.id}"><img src="ckanext/harvest/images/icons/source_refresh.png" alt="Refresh" title="Refresh" /></a></td>
+            <td>${source.url}</td>
+            <td>${source.type}</td>
+            <td>${source.active}</td>
+            <py:choose>
+                <py:when test="'msg' in source.status">
+                    <td>${source.status.msg}</td>
+                    <td>${source.status.msg}</td>
+                </py:when>
+                <py:otherwise>
+                    <td>${source.status.overall_statistics.added} pkgs ${source.status.overall_statistics.errors} errors</td>
+                    <td>${source.status.next_harvest}</td>
+                </py:otherwise>
+            </py:choose>
+
+            <td>${source.created}</td>
+         </tr>
+    </table>
+   </py:when>
+    <py:otherwise>
+        <div id="no-harvest-sources">No harvest sources defined yet.</div>
+    </py:otherwise>
+  </py:choose>
+ 
+  </div>
+</div>
+<xi:include href="layout.html" />
+</html>


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/templates/source/edit.html	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,24 @@
+<html xmlns:py="http://genshi.edgewall.org/"
+  xmlns:i18n="http://genshi.edgewall.org/i18n"
+  xmlns:xi="http://www.w3.org/2001/XInclude"
+  py:strip="">
+
+    <py:def function="page_title">Edit - Harvest Source</py:def>
+
+    <py:def function="body_class">hide-sidebar</py:def>
+    <py:def function="optional_head">
+        <link rel="stylesheet" href="${g.site_url}/css/forms.css" type="text/css" media="screen, print" />
+        <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
+    </py:def>
+
+    <div py:match="content">
+        <div class="harvest-content">
+            <h2>Edit harvest source </h2>
+
+
+            ${h.literal(c.form)}
+
+        </div>
+    </div>
+    <xi:include href="../layout.html" />
+    </html>


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/templates/source/new.html	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,23 @@
+<html xmlns:py="http://genshi.edgewall.org/"
+  xmlns:i18n="http://genshi.edgewall.org/i18n"
+  xmlns:xi="http://www.w3.org/2001/XInclude"
+  py:strip="">
+
+    <py:def function="page_title">New - Harvest Source</py:def>
+
+    <py:def function="body_class">hide-sidebar</py:def>
+    <py:def function="optional_head">
+        <link rel="stylesheet" href="${g.site_url}/css/forms.css" type="text/css" media="screen, print" />
+        <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
+    </py:def>
+
+    <div py:match="content">
+        <div class="harvest-content">
+            <h2>New harvest source </h2>
+
+            ${h.literal(c.form)}
+
+        </div>
+    </div>
+    <xi:include href="../layout.html" />
+    </html>


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/templates/source/new_source_form.html	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,45 @@
+<form id="source-new" class="ckan" method="post"
+  py:attrs="{'class':'has-errors'} if errors else {}"
+  xmlns:i18n="http://genshi.edgewall.org/i18n"
+  xmlns:py="http://genshi.edgewall.org/"
+  xmlns:xi="http://www.w3.org/2001/XInclude">
+
+  <div class="error-explanation" py:if="error_summary">
+<h2>Errors in form</h2>
+<p>The form contains invalid entries:</p>
+<ul>
+  <li py:for="key, error in error_summary.items()">${"%s: %s" % (key, error)}</li>
+</ul>
+</div>
+
+    <fieldset> 
+    <legend>Details</legend> 
+        <dl> 
+            <dt><label class="field_req" for="url">URL for source of metadata *</label></dt> 
+            <dd><input id="url" name="url" size="80" type="text" value="${data.get('url', '')}" /></dd>
+            <dd class="field_error" py:if="errors.get('url', '')">${errors.get('url', '')}</dd>
+            <dd class="instructions basic">This should include the <tt>http://</tt> part of the URL</dd> 
+            <dt><label class="field_req" for="type">Source Type *</label></dt> 
+            <dd> 
+                <select id="type" name="type">
+                    <py:for each="harvester in harvesters">
+                     <option value="${harvester.name}" py:attrs="{'selected': 'selected' if data.get('type', '') == harvester.name else None}" >${harvester.title}</option>
+                    </py:for>
+                </select>
+            </dd>
+            <dd class="field_error" py:if="errors.get('type', '')">${errors.get('type', '')}</dd>
+            <dd class="instructions basic">Which type of source does the URL above represent?
+                <ul>
+                    <py:for each="harvester in harvesters">
+                    <li><span class="harvester-title">${harvester.title}</span>: ${harvester.description}</li>
+                    </py:for>
+                </ul>         
+            </dd> 
+            <dt><label class="field_opt" for="description">Description</label></dt> 
+            <dd><textarea id="description" name="description" cols="30" rows="2" style="height:75px">${data.get('description', '')}</textarea></dd> 
+            <dd class="instructions basic">You can add your own notes here about what the URL above represents to remind you later.</dd> 
+        </dl> 
+    </fieldset> 
+    <input id="save" name="save" value="Save" type="submit" /> or <a href="/harvest">Return to the harvest sources list</a> 
+
+</form>


--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/ckanext/harvest/templates/source/read.html	Fri May 20 13:50:15 2011 +0100
@@ -0,0 +1,93 @@
+<html xmlns:py="http://genshi.edgewall.org/"
+  xmlns:i18n="http://genshi.edgewall.org/i18n"
+  xmlns:xi="http://www.w3.org/2001/XInclude"
+  py:strip="">
+
+  <py:def function="page_title">Harvest Source Details</py:def>
+
+  <py:def function="optional_head">
+    <link type="text/css" rel="stylesheet" media="all" href="/ckanext/harvest/style.css" />
+  </py:def>
+  
+<div py:match="content">
+  <div class="harvest-content">
+  <py:if test="c.source">
+  <h1>Harvest Source Details</h1>
+    <table id="harvest-source-details">
+        <tr>
+            <th>ID</th>
+            <td>${c.source.id}</td>
+        </tr>
+        <tr>
+            <th>URL</th>
+            <td>${c.source.url}</td>
+        </tr>
+        <tr>
+            <th>Type</th>
+            <td>${c.source.type}</td>
+        </tr>
+        <tr>
+            <th>Active</th>
+            <td>${c.source.active}</td>
+        </tr>
+        <tr>
+            <th>Description</th>
+            <td>${c.source.description}</td>
+        </tr>
+        <tr>
+            <th>User</th>
+            <td>${c.source.user_id}</td>
+        </tr>
+        <tr>
+            <th>Publisher</th>
+            <td>${c.source.publisher_id}</td>
+        </tr>
+        <tr>
+            <th>Created</th>
+            <td>${c.source.created}</td>
+        </tr>
+        <tr>
+            <th>Total jobs</th>
+            <td>${len(c.source.jobs)}</td>
+        </tr>
+        <tr>
+            <th>Status</th>
+            <td>
+                Last Harvest Errors: ${c.source.status.last_harvest_statistics.errors}<br/> 
+                <py:choose>
+                    <py:when test="len(c.source.status.last_harvest_errors)>0">
+                        <ul>
+                        <li py:for="error in c.source.status.last_harvest_errors">${error}</li>
+                        </ul>
+                    </py:when>
+                </py:choose>
+                Last Harvest Added: ${c.source.status.last_harvest_statistics.added}<br/>
+                Last Harvest Updated: ${c.source.status.last_harvest_statistics.updated}<br/>
+                Last Harvest: ${c.source.status.last_harvest_request} <br/>
+                Next Harvest: ${c.source.status.next_harvest}
+            </td>
+        </tr>
+        <tr>
+            <th>Total Errors</th>
+            <td>${c.source.status.overall_statistics.errors}</td>
+        </tr>
+        <tr>
+            <th>Total Packages</th>
+            <td>${c.source.status.overall_statistics.added}</td>
+        </tr>
+        <tr>
+            <th>Packages</th>
+            <td>
+                <div>There could be a 10 minutes delay before these packages (or changes to them) appear on 
+                the site or on search results.</div>
+                <div py:for="package in c.source.status.packages">
+                    <a href="/package/${package}">${package}</a>
+                </div>
+            </td>
+        </tr>
+    </table>
+    </py:if>
+  </div>
+</div>
+<xi:include href="../layout.html" />
+</html>


--- a/pip-requirements.txt	Tue May 17 17:26:42 2011 +0100
+++ b/pip-requirements.txt	Fri May 20 13:50:15 2011 +0100
@@ -3,8 +3,3 @@
 # to suit the packaging system.
 
 carrot==0.10.1
-
-# These are other dependencies to bear in mind:
-
-# -e hg+https://bitbucket.org/okfn/ckanext-dgu@default#egg=ckanext-dgu
-# -e hg+https://bitbucket.org/okfn/ckanext-csw@default#egg=ckanext-csw


--- a/setup.py	Tue May 17 17:26:42 2011 +0100
+++ b/setup.py	Fri May 20 13:50:15 2011 +0100
@@ -6,7 +6,7 @@
 setup(
 	name='ckanext-harvest',
 	version=version,
-	description="CSW harvesting plugin for CKAN",
+	description="Harvesting interface plugin for CKAN",
 	long_description="""\
 	""",
 	classifiers=[], # Get strings from http://pypi.python.org/pypi?%3Aaction=list_classifiers

Repository URL: https://bitbucket.org/okfn/ckanext-harvest/

--

This is a commit notification from bitbucket.org. You are receiving
this because you have the service enabled with this email address as
the recipient.