[ckan-discuss] Tag_count per group / packagecount in grouplist

Friedrich Lindenberg friedrich at pudo.org
Wed Oct 13 17:22:42 BST 2010


Hey all, 

On Oct 13, 2010, at 11:12 AM, John Bywater wrote:
>>> Is it possible to get a tag-list, with count, for a specific group, using the API?
>>> For example, we get the packagelist for Delft, using $locator/api/2/search/package?groups=gemeente_delft&all_fields=1 
>>> To show a taglist alongside this result set, we've only come up with a pretty inefficient solution: using the received JSON, we count the tags and build the list by examining each dataset. For the catalog as a whole we can use $locator/api/2/tag_counts. Is there some way to use this method for a specific group as well, e.g. $locator/api/2/search/package?groups=gemeente_delft&tag_counts=1 ?
>>> 
>>> We encounter the same issue when building a grouplist. Showing the grouplist itself is no problem at all, but we want to show a packagecount per group as well. The only way to achieve this seems to be to request a JSON file per group in the background and extract the packagecount. When you paginate per 30 items, this means 30 requests to the CKAN API. This seems very inefficient to us; perhaps you know a better solution?
>> As far as I'm aware, there is no simple way to gather the counts you're looking for at the moment. It seems to me that what you're trying to build is very similar to faceted browsing. The numbers you're asking about would then relate first to the entire system and later to the result set of a specific search query (i.e. a group:foo query). We're planning to have a first go at putting this into the main CKAN this weekend (there's already a prototype implementation on iati.ckan.net), and while I have not considered this before, it would make perfect sense to include that data in the REST API (v.2).
>> So if there's no protest from anyone, I'll try to implement this both in the WUI and in the REST API in a way that is non-intrusive to the current API. To find out how we can best help you:
>> - Is it still OK for you to have this available in production in 2-3 weeks?
>> - Would you be willing to help us spec this out as a "test customer"?
> 
> To me, it looks like a list comprehension, and the trouble is that there are lots of calls (in this case, 30 calls) to the server. I wonder whether 30 calls would still be a problem if each took less than 5 ms.

(5ms in-app + 50ms wire) * 30 = about 1.65s, so yes, I think. It's OK for an internal call, but not as an API thing to be called on page generation (and, since this is a navigation thing, caching and off-line generation would seem tricky).
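
A quick sketch of that estimate, assuming the 30 requests are issued one after another and that the 5ms and 50ms figures are per request (the function name is just for illustration):

```python
def sequential_latency_ms(calls, in_app_ms, wire_ms):
    """Total time for `calls` requests made one after another."""
    return calls * (in_app_ms + wire_ms)


# 30 groups per page, 5 ms in-app and 50 ms on the wire per request
total_ms = sequential_latency_ms(calls=30, in_app_ms=5, wire_ms=50)
print(total_ms)  # 1650
```

Issuing the requests in parallel would shorten the wall-clock time, but it still costs 30 round trips per page view.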

> But in any case, it might be nice to pass a list comprehension to the server, and have the results returned in one call. So I wondered whether a "List Comprehension API" would make sense?

I don't get this - are you proposing to allow the client to pass actual code to the server (or a sufficiently large set of parameters for this to make sense)? That would not make things very RESTish, I think - but I'm eager to hear more about what you are proposing! 

> That's not to argue against faceted search. Indeed, let's establish requirements before shaping up a solution. :-)

As an intro to the kind of data that facets expose, let me point you at http://wiki.apache.org/solr/SimpleFacetParameters#Facet_Fields - this to me looks very much like the thing Martin is asking about - but I might be rorschaching there ;-) 
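
To make that concrete, here is a rough sketch of the kind of data a facet field exposes: for a given result set, each value of a field is mapped to the number of matching packages. This is only an illustration, not how CKAN or Solr implements it. The package dictionaries are made-up sample data, `facet_counts` is a hypothetical helper, and I'm assuming `tags` and `groups` are lists of strings on each package.

```python
from collections import Counter


def facet_counts(packages, field):
    """Count how many packages carry each value of `field`."""
    counts = Counter()
    for package in packages:
        # set() so a value repeated within one package is counted once
        counts.update(set(package.get(field, [])))
    return counts


# Made-up result set, standing in for the packages matched by a query
packages = [
    {"name": "a", "tags": ["geo", "transport"], "groups": ["gemeente_delft"]},
    {"name": "b", "tags": ["geo"], "groups": ["gemeente_delft"]},
    {"name": "c", "tags": ["budget"], "groups": ["other_group"]},
]

tag_facet = facet_counts(packages, "tags")
group_facet = facet_counts(packages, "groups")
print(tag_facet["geo"], group_facet["gemeente_delft"])  # 2 2
```

Computed on the server, both the tag counts for one group and the package counts per group would come back with a single query instead of one request per group.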

Friedrich 

