[ckan-dev] key value store, caching and redis
Rufus Pollock
rufus.pollock at okfn.org
Sun Jan 30 16:00:22 UTC 2011
On 30 January 2011 11:53, David Raznick <kindly at gmail.com> wrote:
>> Rufus said
>> It would be worth you elaborating on the attractions of option (3) versus
>> (2).
>
> I thought I stated my reasons.
Yes, sorry :) What I meant was that you'd put forward a great summary of
pros/cons, but for me it wasn't clear from the pros/cons of the individual
options why (3) was better than (2). (E.g. the ability to store the last 10
packages a user saw could be done in (2).) Reading again I think the key
points (in your mind) for redis over sql k/v are:
1. Built for the task (and more flexible)
2. Very fast (so can use it for more)
The only downside is introducing another moving part to our system
(which, you might argue, is not true because we'll need redis for
caching anyway).
I'll respond some below on specific points but to summarize: you have
me convinced :)
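For reference, the "last 10 packages a user saw" example could look roughly like this on the redis side. This is only a sketch: it assumes `r` is a redis-py client (e.g. `redis.Redis()`), and the key layout and function names are invented for illustration, not existing CKAN code.

```python
# Sketch only: the "recent:<user>" key layout and these helpers are hypothetical.
# `r` is assumed to be a redis-py client such as redis.Redis().

def record_view(r, user_id, package_id, keep=10):
    """Push a viewed package onto the user's list and cap the list length."""
    key = "recent:%s" % user_id
    # Newest item goes to the head of the list.
    r.lpush(key, package_id)
    # Keep only the first `keep` entries; older ones are dropped.
    # Note: a package viewed twice will appear twice in the list.
    r.ltrim(key, 0, keep - 1)


def recent_views(r, user_id):
    """Return the user's recently viewed packages, newest first."""
    return r.lrange("recent:%s" % user_id, 0, -1)
```

The same thing is possible in a sql k/v table, but there it means reading, editing and rewriting a serialized list on every view.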
I'm going to start trying out a watch plugin with redis as soon as I
have a spare moment to see how it goes. (NB: This will only be
deployed against testing or ckan.net to start with and will be an
orthogonal system so if it doesn't work it will be very easy to 'back
out'.)
Rufus (more minor comments below)
>> Atomic operations easy i.e counters, queues
>> Plugins could do many more things, without the need to manage own
>> database. i.e there could be a caching plugin, a pubsub plugin, a plugin
>> that stored the last 10 packages a user viewed.
>
> However, it's best to go through use cases as you said.
>
> * caching plugin (not ticketed but needed)
>
> This obviously can only be done through something like redis/memcached
As opposed to the cache system provided by Pylons out of the box (atm we
use a straightforward dbm on disk but can switch to memcached). I think
the key point for redis versus the current caching would be speed and
flexibility ...
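As a rough illustration of what a redis-backed cache could look like, here is a minimal read-through sketch. It assumes `r` is a redis-py client (e.g. `redis.Redis()`); the `cached` helper, its parameters and the 10 minute default are made up for the example and are not part of Pylons or CKAN.

```python
import json

# Sketch only: `cached` is a hypothetical helper, not an existing CKAN/Pylons API.
# `r` is assumed to be a redis-py client such as redis.Redis().

def cached(r, key, compute, ttl_seconds=600):
    """Return the cached value for `key`, computing and storing it on a miss."""
    raw = r.get(key)
    if raw is not None:
        # Cache hit: values are stored as JSON strings.
        return json.loads(raw)
    value = compute()
    # `ex` sets an expiry in seconds so stale entries disappear on their own.
    r.set(key, json.dumps(value), ex=ttl_seconds)
    return value
```

Something like `cached(r, "package:annakarenina", render_package)` would then hit redis first and only fall back to the (hypothetical) `render_package` callable on a miss.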
> * Watch/follow a package extension: http://ckan.org/ticket/936
>
> This is something that would be bad for a table k/v store and correct
> for redis.
>
> Every time someone follows something the package counter has to change.
> This precludes us from caching the package view form as we are not sure if
> the counter has changed yet. So every view we will be hitting the
> database. I would say that the possibility of losing 10mins worth of
> watchers in a disaster (if the server goes down) is worth the benefit.
>
> Also the kv data structure does not suit this need well, as we will have to
> effectively emulate a many to many (from users to packages).
>
> * Download stats for resources: http://ckan.org/ticket/937
>
> Same reason as above. We do not want to be hitting the database for every
> resource view.
> Also, counters in this key value table are horrible as the values are
> strings and to make it atomic we would need to manipulate the json in sql.
Convinced on both scores (and I did think these were natural for k/v
and especially redis). My only question is whether we would want to flush
this information somewhere more permanent than redis or whether redis'
durability is now 'good enough'.
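To make the counter and many-to-many points concrete, here is a rough sketch of how watchers and download counts might be kept in redis. It assumes `r` is a redis-py client (e.g. `redis.Redis()`); the key names and function names are invented for illustration and are not existing CKAN code.

```python
# Sketch only: key names ("watchers:<pkg>", "watching:<user>", "downloads:<res>")
# and these helpers are hypothetical.
# `r` is assumed to be a redis-py client such as redis.Redis().

def follow(r, user_id, package_id):
    """Record that a user watches a package; return the new watcher count."""
    # SADD is idempotent, so following twice does not double count.
    r.sadd("watchers:%s" % package_id, user_id)
    # Second set gives the reverse lookup (packages watched by a user).
    r.sadd("watching:%s" % user_id, package_id)
    return r.scard("watchers:%s" % package_id)


def unfollow(r, user_id, package_id):
    """Remove a watch; return the new watcher count."""
    r.srem("watchers:%s" % package_id, user_id)
    r.srem("watching:%s" % user_id, package_id)
    return r.scard("watchers:%s" % package_id)


def record_download(r, resource_id):
    """Atomically increment and return the download counter for a resource."""
    return r.incr("downloads:%s" % resource_id)
```

INCR and SADD are each a single atomic command on the redis side, so there is no read-modify-write of a JSON string as there would be with the k/v table. The two set updates in `follow` are not atomic as a pair; a pipeline/MULTI could be used if that matters.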
> Config options in WUI extension: http://ckan.org/ticket/277
>
> * I do not like this idea at all. If we were to do this I would not want
> it as a plugin and I think it deserves its own table as it being mashed in
> with the other data will make it even harder to work out the current config
> settings for the sysadmin.
I agree. Please add this to the ticket.
> * Apps extension (not yet ticketed -- allow users to register app ideas
> ...)
>
> There are many better platforms than ckan to register app ideas. To do
> this properly we would want some form of voting and comments.
Maybe but people always want to see these integrated and that can be
hard with external systems. Anyway, let's ticket the requirement and
then add suggestions for solutions there: <http://ckan.org/ticket/941>
> * A simpler queueing system
>
> I think redis is perfect for this, however this is out of scope of this
> conversation.
Yes :)
[...]