[ckan-dev] key value store, caching and redis

David Raznick kindly at gmail.com
Sun Jan 30 21:17:48 UTC 2011


On Sun, Jan 30, 2011 at 4:00 PM, Rufus Pollock <rufus.pollock at okfn.org> wrote:

> On 30 January 2011 11:53, David Raznick <kindly at gmail.com> wrote:
> >> Rufus said
> >> It would be worth you elaborating on the attractions of option (3)
> >> versus (2).
> >
> > I thought I stated my reasons.
>
> Yes, sorry :) What I meant was you'd put forward a great summary of
> pro/cons but for me it wasn't clear from the pro/cons of individual
> options why (3) was better than (2). (E.g. ability to store last 10
> packages user saw could be done in (2)). Reading again I think the key
> point (in your mind) for redis over sql k/v is:
>
> 1. Built for the task (and more flexible)
> 2. Very fast (so can use it for more)
>
> The only downside is introducing another moving part to our system
> (which, you might argue, is not true because we'll need redis for
> caching anyway).
>
> I'll respond some below on specific points but to summarize: you have
> me convinced :)
>
> I'm going to start trying out a watch plugin with redis as soon as I
> have a spare moment to see how it goes. (NB: This will only be
> deployed against testing or ckan.net to start with and will be an
> orthogonal system so if it doesn't work it will be very easy to 'back
> out'.)
>
> Rufus (more minor comments below)
>
> >>      Atomic operations easy i.e counters, queues
> >>      Plugins could do many more things, without the need to manage own
> >> database. i.e there could be a caching plugin, a pubsub plugin, a plugin
> >> that stored the last 10 packages a user viewed.
> >
> > However, it's best to go through use cases as you said.
> >
> > * caching plugin  (not ticketed but needed)
> >
> >     This obviously can only be done through something like
> > redis/memcached
>
> As opposed to cache system provided by pylons out of the box (atm we
> use straightforward dbm on disk but can switch to memcached). I think
> key point for redis versus current caching would be speed and
> flexibility ...
>
> > *  Watch/follow a package extension: http://ckan.org/ticket/936
> >
> >     This is something that would be bad for a table k/v store and correct
> > for redis.
> >
> >     Every time someone follows something the package counter has to
> > change. This precludes us from caching the package view form as we are
> > not sure if the counter has changed yet.  So every view we will be
> > hitting the database.  I would say that the possibility of losing 10
> > mins' worth of watchers in a disaster (if the server goes down) is
> > worth the benefit.
> >
> > Also the kv data structure does not suit this need well, as we will
> > have to effectively emulate a many to many (from users to packages).
> >
> >  * Download stats for resources: http://ckan.org/ticket/937
> >
> >    Same reason as above. We do not want to be hitting the database for
> > every resource view.
> > Also, counters in this key value table are horrible as the values are
> > strings and to make it atomic we would need to manipulate the json in
> > sql.
>
> Convinced on both scores (and I did think these were natural for k/v
> and especially redis). My only question is whether we would want to
> flush this information somewhere more permanent than redis or whether
> redis' durability is now 'good enough'.
>

We would actually likely use the 'append only file' option.  This logs every
write and fsyncs the log to disk every second (it can also be set to fsync
on every write).  This is designed for durability. We probably do not do
enough writes to make this a performance bottleneck, so that would mean
losing at most a second's worth of data. The other option is to do
snapshotting every so often, based on elapsed time and on how many writes
have happened.
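
For reference, the two durability options above map onto a handful of
redis.conf directives. This is only a sketch of the relevant settings, not a
tested configuration for ckan; the snapshot thresholds in particular are
illustrative values, not a recommendation.

```
# Append only file: log every write, fsync the log once per second.
appendonly yes
appendfsync everysec

# Snapshotting: dump to disk if at least 1 key changed in 900 seconds,
# or at least 1000 keys changed in 60 seconds (illustrative thresholds).
save 900 1
save 60 1000
```

With "appendfsync always" each write is fsynced before it is acknowledged,
which is slower but narrows the window for losing data.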

I have not heard any reports of it having inconsistent data, and it has a
tool for recovering from AOF corruption, which can happen.

Its methodologies are clearly stated and I trust it enough to not worry
about it flagrantly losing data.  We should do backups of course too...
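
To make the watch/follow and download-stats cases discussed above more
concrete, here is a rough sketch of how they might look against redis. The
function names and key layout ("package:<id>:followers" and so on) are
assumptions made up for illustration and are not part of ckan; `client` is
assumed to be a redis-py style client exposing sadd, srem, scard and incr.

```python
# Sketch only: key names and function names are illustrative assumptions,
# not part of CKAN. `client` is expected to be a redis-py style client.


def follow(client, user_id, package_id):
    """Record that a user follows a package; return the follower count."""
    # Sets give the many-to-many mapping directly: one set per package
    # (its followers) and one set per user (what they follow).
    # Each SADD is atomic and idempotent, so following twice does not
    # double count. The two calls together are not one transaction.
    client.sadd("package:%s:followers" % package_id, user_id)
    client.sadd("user:%s:following" % user_id, package_id)
    return client.scard("package:%s:followers" % package_id)


def unfollow(client, user_id, package_id):
    """Remove a follow relationship; return the follower count."""
    client.srem("package:%s:followers" % package_id, user_id)
    client.srem("user:%s:following" % user_id, package_id)
    return client.scard("package:%s:followers" % package_id)


def record_download(client, resource_id):
    """Atomically increment a resource's download counter."""
    # INCR is a single atomic operation on the server, so there is no
    # read-modify-write of a JSON string as there would be in a SQL k/v table.
    return client.incr("resource:%s:downloads" % resource_id)
```

With redis-py and a server running locally, the client would be something
like `redis.Redis()`, and the follower count for the package view would then
be a single SCARD call instead of a database query.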


>
> > Config options in WUI extension: http://ckan.org/ticket/277
> >
> >  *  I do not like this idea at all.  If we were to do this I would not
> > want it as a plugin and I think it deserves its own table as it being
> > mashed in with the other data will make it even harder to work out the
> > current config settings for the sysadmin.
>
> I agree. Please add this to the ticket.
>
> >   * Apps extension (not yet ticketed -- allow users to register app ideas
> > ...)
> >
> >    There are many better platforms than ckan to register app ideas. To do
> > this properly we would want some form of voting and comments.
>
> Maybe but people always want to see these integrated and that can be
> hard with external systems. Anyway, let's ticket the requirement and
> then add suggestions for solutions there: <http://ckan.org/ticket/941>
>
> >  * A simpler queueing system
> >
> >    I think redis is perfect for this, however this is out of scope of
> > this conversation.
>
> Yes :)
>
> [...]
>