[ckan-dev] harvesting and ckan geo extensions

William Waites ww at styx.org
Wed Apr 6 10:14:57 UTC 2011


Sorry, really spaces - odd that it seemed to work in the test environment.
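
To illustrate why the commas should not have worked, here is a minimal sketch. It assumes the ckan.plugins value is simply split on whitespace, which is an assumption and not something checked against the CKAN source; parse_plugins is a made-up name, not a CKAN function.

```python
def parse_plugins(value):
    """Split a plugin setting on whitespace (assumed behaviour, see above)."""
    return value.split()


# Space-separated: three separate plugin names.
spaces = parse_plugins("cswserver dgu_form_api harvest")

# Comma-separated: one long string that matches no plugin name.
commas = parse_plugins("cswserver,dgu_form_api,harvest")

print(spaces)
print(commas)
```

Under that assumption the comma form yields a single name, so none of the three plugins would be found.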

Re API URL: I'll dig out where I saw that later, when I'm at a computer.

Re relative URLs: this is actually an error. But if you need it to avoid fines...


Re other improvements: I'm looking at this from the point of view of a demo to the EC, so priorities are slightly different. I would rather build on Adrià's work, but if you are now saying wait a month then we'll have to think of something new to do. No problem, I'm good at thinking of things to do.

Anyway, these were comments on how the system could be improved, not diktats of what must be done now.
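
As an illustration of the single "envelope" extra suggested in the quoted thread below, here is a rough sketch of turning four corner values into one WKT string. The function name, the "envelope" key and the sample coordinates are all made up for the example, and it assumes plain lon/lat corners that do not cross the antimeridian.

```python
def bbox_to_wkt(west, south, east, north):
    """Return a closed WKT POLYGON ring for a lon/lat bounding box."""
    if west > east or south > north:
        raise ValueError("bounding box corners are out of order")
    # Walk the four corners and repeat the first one to close the ring.
    corners = [
        (west, south),
        (east, south),
        (east, north),
        (west, north),
        (west, south),
    ]
    ring = ", ".join("%s %s" % (x, y) for x, y in corners)
    return "POLYGON((%s))" % ring


# One extra instead of four; the values are illustrative only.
extras = {"envelope": bbox_to_wkt(-8.0, 49.0, 2.0, 61.0)}
print(extras["envelope"])
```

Anything carrying that one extra could then be indexed without knowing about four separate corner fields.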


James Gardner <james at 3aims.com> wrote:

>Hi all,
>
>On 06/04/11 10:21, David Read wrote:
>> 2011/4/6 Adrià Mercader<amercadero at gmail.com>:
>>> Hi William and others,
>>>
>>> On 5 April 2011 at 23:14, William Waites<ww at styx.org> wrote:
>>>> So far so good, I added some tweaks to the documentation for how to
>>>> configure the plugins, and reminders to actually install the package
>>>> and put the requirements in the setup.py so that they get installed
>>>> comme il faut.
>>> Aren't plugins supposed to be separated by spaces in the ini file, not commas?
>>>     ckan.plugins = cswserver dgu_form_api harvest
>>> instead of
>>>     ckan.plugins = cswserver,dgu_form_api,harvest
>
>Yes, we use spaces Will.
>
>>>> In the instructions for the API URL, I may be wrong but I believe that
>>>> the convention for ckanclient is to put http://example.org/api not
>>>> just http://example.org/ if I am not wrong about this it should
>>>> probably be changed for consistency's sake.
>>> Not exactly sure what you mean. Do you mean the default value in ckan.api_url?
>>> In any case, this will probably get removed in the current refactoring
>>> as we won't need
>>> to query the api
>> Hmm the docs have an example:
>>
>> ckan.api_url=http://scotdata.ckan.net/api
>>
>> So maybe Will is talking about a different setting than ckan.api_url?
>
>We want relative URLs here to support DGU temporarily. Please, please, 
>please don't go changing things like this at this stage. Once the code 
>is stable and deployed for DGU we can look at improvements for wider 
>use, but that certainly isn't our main concern 4 weeks before a major 
>deadline.
>
>>>> I notice the harvester is adding resources with relative URLs, this
>>>> seems to be a pre-existing bug not least because it prevents the
>>>> package from being edited because those fields fail validation.
>>>> The authentication arrangement in view.py should probably be slackened
>>>> a bit, since I don't think we need admin privileges to be able to just
>>>> look at a map.
>>> Agreed. I think it's made this way in the current context of DGU, but
>>> when this gets
>>> moved to a new ckanext-georelatedstuff I don't think it will be necessary
>> I don't know much about harvesting and its direction, but I put the
>> sysadmin authz requirement on all the harvest view interface and
>> matching API calls purely because it was easy, it works for DGU right
>> now, and it prompts someone to properly plan what authz we do need.
>> This might well be a protection object for a harvest source or doc,
>> along the lines of everything else. So I think rather than make a
>> piecemeal change to remove completely authz on one particular call,
>> someone should argue the whole harvesting/geo/csw thing together, in
>> light of where it's going.
>
>Again, Adria has implemented exactly what I've asked which is what we 
>need to support at the moment, sysadmin privileges are correct. We can 
>look at changing after the deadline.
>
>>>> Also if there is a viewable resource, probably we
>>>> should have a smaller map without controls on the main package page,
>>>> though I understand why you didn't do this straight away as it is a
>>>> more invasive template change.
>>> That would be nice, but it's a little bit trickier. When dealing with
>>> arbitrary WMS servers it is very difficult to get a representative
>>> snapshot of the
>>> maps behind it. In most cases, the user will need to zoom in or out to
>>> actually see the maps in context.
>>>
>
>Out of scope for the time-being, let's revisit in mid-May.
>
>>>> On the treatment of the SRS in the extras field. I don't really know
>>>> why we are putting a big blob of XML in there instead of just using
>>>> the well known string identifier in. I think this might be tripping up
>>>> the indexing of some datasets, particularly as the UK very often uses
>>>> its own national grid system. There are no particular tests for
>>>> this, I'll write some once we get some consensus about if we are going
>>>> to actually put the SRID in the SRID field or keep the XML blob.
>>> I modified that parser a while ago to store the SRID, not the XML. If
>>> it's not doing it, it's a bug:
>>> https://bitbucket.org/okfn/ckanext-harvest/src/1dd85319a6bf/ckanext/harvest/model/__init__.py#cl-359
>>>
>>>
>>>> On the treatment of the bounding box, I mention this here because I
>>>> know that Friedrich and I had discussed this a while back. Probably
>>>> having a separate extra for each of the coordinates of the corners, or
>>>> 4 extras in all is not as good as having just one BBOX extra. Better
>>>> still might be to have an "envelope" extra with WKT in it.
>>> +1 To have just one extra field. Currently it just uses the existing
>>> code used to parse GEMINI records
>
>Fine with me, but not exactly a high priority. The bounding box is 
>always lat/long and the WMSs we need to support are all ETRS 89.
>
>>>> Back to the cosmetic front, it probably would be a good idea to put in
>>>> a base layer of vmap0 or something to aid in orientation.
>>> It will definitely help. vmap0 is not the nicest base map around, but
>>> it's the only one I know in WGS84 (4326) which has a global coverage.
>
>Please, this isn't a priority. Yes, we should look at all these 
>improvements, but not until everything we need to deliver for DGU is 
>perfect. There's a risk that "improvements" could actually interfere 
>since DGU is not exactly a standard case, but it's the one we need 
>working first.
>
>>>> I guess the geo search and handling of envelope/bbox extras should
>>>> really be in a ckanext-geo and not in harvesting since it has nothing
>>>> to do with harvesting really, nor with CSW or DGU. That way anything
>>>> with that extra would get indexed and displayed.
>>> We are indeed planning to move the GEMINI stuff to ckanext-inspire and
>>> the spatial search and wms preview to ckanext-geo
>>>
>>>
>>>> All in all, very promising.
>>>>
>>> Please bear in mind that it's just a preliminary version :)
>+1
>>> I think that after the refactoring, when all things are where they are
>>> supposed to be we will be able to polish all of these details
>
>Exactly, all in good time ;) sorry to put a damper on things but I want 
>us to focus on the correct things first, which at the moment is the 
>harvesting re-factor to support pluggable harvesting backends and the queue.
>
>Cheers,
>
>James
>

