[ckan-dev] facets usage

David Read david.read at hackneyworkshop.com
Fri Apr 29 09:18:08 UTC 2016


I thought I'd just share some usage data on our search facets.
Filtering by file format and publisher/organization were popular as
expected, but we were surprised to see that the most used one was
'theme', which is not a standard CKAN feature, and maybe should be.

We automatically categorize datasets into a theme, which we found much
more consistent than asking users to do it. We used some supervised
learning techniques, based on picking out words and phrases. The
set-up was manual, and needs only occasional tweaking
when datasets are added with topics that don't get categorized or get
mis-categorized. I can point you to our code, but it would take a fair
amount of work to make it user friendly. I think its main value is as
an example. No doubt implementing a version for core CKAN would be an
interesting project for someone.
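
To give a flavour of the word-and-phrase idea, here is a minimal sketch.
It is not the data.gov.uk code and it is not supervised learning: it is
a plain hand-written keyword scorer, and the theme names, keyword lists
and the categorize() function are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical theme -> keyword/phrase lists (illustrative only,
# not the data.gov.uk configuration).
THEME_KEYWORDS = {
    "Health": ["hospital", "nhs", "patient", "mental health"],
    "Transport": ["road", "rail", "bus", "traffic"],
    "Environment": ["flood", "air quality", "river", "waste"],
}


def categorize(text, min_score=1):
    """Return the best-scoring theme for a dataset's text, or None."""
    text = text.lower()
    scores = Counter()
    for theme, phrases in THEME_KEYWORDS.items():
        for phrase in phrases:
            # Match whole words/phrases so "bus" does not hit "business".
            pattern = r"\b" + re.escape(phrase) + r"\b"
            scores[theme] += len(re.findall(pattern, text))
    theme, score = scores.most_common(1)[0]
    # Leave the dataset uncategorized when nothing matched.
    return theme if score >= min_score else None
```

Datasets that come back as None are the ones that would prompt the
occasional tweak to the keyword lists.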

Dave

https://data.gov.uk/data/search

Last month's unique pageviews (listed in the order the filters are offered to the user):
All searches 95464
Filter: unpublished 8270
Filter: NII 1778
Filter: API 1133
Filter: License 4578
Filter: theme 19219
Filter: format 13836
Filter: publisher 11375
Filter: schema 345
Filter: code list 68
Filter: openness score 1648
Filter: broken links 1021
Filter: UKLP 3421
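
For anyone wanting to compare the filters, a small sketch that ranks
the figures above and expresses each as a percentage of all searches.
The facet_shares() helper is just an illustrative name. Treat the
percentages as rough: one search can use several filters, so they do
not add up to 100%.

```python
ALL_SEARCHES = 95464

# Unique pageviews per filter, copied from the list above.
FACET_PAGEVIEWS = {
    "unpublished": 8270,
    "NII": 1778,
    "API": 1133,
    "License": 4578,
    "theme": 19219,
    "format": 13836,
    "publisher": 11375,
    "schema": 345,
    "code list": 68,
    "openness score": 1648,
    "broken links": 1021,
    "UKLP": 3421,
}


def facet_shares(pageviews, total):
    """Return (name, count, percent of total), most used first."""
    ranked = sorted(pageviews.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, count, 100.0 * count / total) for name, count in ranked]


for name, count, pct in facet_shares(FACET_PAGEVIEWS, ALL_SEARCHES):
    print(f"{name:<15} {count:>6} {pct:5.1f}%")
```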


