[ckan-dev] Master's thesis and personalization of CKAN

Sean Hammond sean.hammond at okfn.org
Tue May 15 09:28:12 UTC 2012


Hi Sven,

> Hello everyone,
> 
> I'd like to introduce myself to the CKAN developer team. I am Sven
> and study computer science (specialization in the field of data and
> web engineering) at the Chemnitz University of Technology. I decided
> to write my Master's thesis under supervision of Dr. Sören Auer who
> introduced me to CKAN.
> 
> The topic of the thesis is "A concept to personalize dataset
> repositories using the example of CKAN". Thus, I'd like to add
> functionality to enable the users to follow their personal interests
> more easily and faster; regarding data repositories to follow the
> data which they are primarily interested in.
>
> One step towards that is the already implemented extension to follow
> selected datasets (Irina told us that Darwin is in charge of it).
> Maybe that's the first point for me to hook into.
> 
> Here a short summary of the thoughts we'd like to implement:
> 
> active parts for the users; their efforts:
> - to follow a group of datasets
> - to follow a set of datasets defined by search terms/tags/other
> meta-data like LODStats
> - to follow people << maybe friends
> [- to follow a dataset as the base functionality]
> 
> - to configure whether or not to be informed and if so which media
> shall be used for which kind of changes/"follows"/etc.
> 
> passive parts for the users; their benefits:
> - being notified by email in case of changes
> - being informed by an activity stream on the homepage right after
> logging in to CKAN (bringing the activity stream to the fore in
> order to provide an on-the-glance view of the users' latest news)
> 
> Currently, I am not aware of which particular feature has been
> already implemented or not.

Great! This sounds really exciting. I'll try to provide some info about
what we've implemented so far in CKAN to get you started.

CKAN currently has "activity streams" on user pages that show
everything the user has been doing on the site. Example:

http://thedatahub.org/user/rufuspollock

In the code, we have also implemented activity streams for datasets,
groups, tags and one for all the activity on the entire site, although
we currently only use the user activity streams on thedatahub. We've
implemented a framework that makes it easy to add new activity streams,
add new types of activity, or add activity streams into pages. (If you'd
like to know more about the implementation, just ask.)

In the current development version of CKAN (1.8a) we have implemented
"Follow" buttons, a "Followers" count and "Followers" pages for users
and datasets. If you go to the page of a user/dataset and click Follow,
you will be following that user/dataset, its follower count will
increase by 1, and your name will appear on its followers page. You can
click Unfollow to stop following something. Behind the scenes, there is
an API for getting the number or a list of followers for a user/dataset,
asking whether you are following a user/dataset, and
following/unfollowing a user/dataset. The code for this is currently in
branch feature-2304-follow but it'll be merged into master very soon.
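
As a rough illustration of the operations listed above, here is a
minimal in-memory sketch in Python. This is not CKAN's actual code or
API: the FollowerStore class and all of its method names are made up
for this example, and a real implementation would store the
relationships in the database rather than in a dict.

```python
from collections import defaultdict


class FollowerStore:
    """Hypothetical in-memory stand-in for the follower operations.

    Not CKAN's real API; all names here are assumptions for illustration.
    """

    def __init__(self):
        # Maps the id of a followed user/dataset to the set of follower ids.
        self._followers = defaultdict(set)

    def follow(self, follower_id, object_id):
        # Following twice has no extra effect because followers are a set.
        self._followers[object_id].add(follower_id)

    def unfollow(self, follower_id, object_id):
        # discard() does nothing if the user was not following the object.
        self._followers[object_id].discard(follower_id)

    def is_following(self, follower_id, object_id):
        return follower_id in self._followers.get(object_id, set())

    def follower_count(self, object_id):
        return len(self._followers.get(object_id, set()))

    def follower_list(self, object_id):
        # Sorted so that the followers page has a stable order.
        return sorted(self._followers.get(object_id, set()))
```

Keeping followers in a set per followed object means that clicking
Follow twice cannot inflate the follower count.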

Next we want to show a combined activity stream with all the activities
from all the users and datasets that you're following, and show this to
you when you log in. This new activity stream should be quite easy to add
on top of the activity streams and followers stuff that we've already
implemented. Let me know if you'd be interested in implementing this
feature. I think it would be a nice, fairly small task as an introduction
to CKAN development, and I could give you some guidance on how to
implement it and review your code for you.
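
To give an idea of what the combined stream involves, here is a small
Python sketch under some assumptions: each user/dataset already has its
own list of activities, an activity is just a (timestamp, description)
tuple, and the function name and the "streams" dict are hypothetical.
CKAN's real activity objects and queries look different; this only
shows the merging idea.

```python
from datetime import datetime


def combined_activity_stream(followed_ids, streams, limit=20):
    """Merge the activities of everything a user follows, newest first.

    `streams` is a hypothetical dict mapping a user/dataset id to a list
    of (timestamp, description) tuples.
    """
    activities = []
    for object_id in followed_ids:
        # Objects without any recorded activity contribute nothing.
        activities.extend(streams.get(object_id, []))
    # Sort on the timestamp only, most recent activity first.
    activities.sort(key=lambda activity: activity[0], reverse=True)
    return activities[:limit]
```

For example, calling it with the ids of one followed user and one
followed dataset returns their activities interleaved by time, and
anything from users/datasets you don't follow is left out.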

Here are tickets with more information about the Follow button and the
combined activity stream (user stories, technical analysis, links to
code implemented so far):

http://trac.ckan.org/ticket/2304
http://trac.ckan.org/ticket/2305

We have not yet implemented any kind of email notification, but that of
course could be added on top of the new activity stream, and we have
done some analysis on how it could be implemented. Here's the ticket for
it with more info:

http://trac.ckan.org/ticket/1635

We haven't implemented any support for following sets of users/datasets
defined by search terms, and as far as I know this isn't something we've
considered.

If you're looking for more ideas for activity streams-related features
that could be added to CKAN, we have a bunch of tickets with the keyword
"activity_streams":

http://trac.ckan.org/query?status=accepted&status=assigned&status=new&status=reopened&order=priority&col=id&col=summary&col=status&col=type&col=priority&col=milestone&col=component&keywords=~activity_streams
