[ckan-dev] key value store, caching and redis

David Raznick kindly at gmail.com
Sun Jan 30 11:53:40 UTC 2011


>
> Rufus said
> It would be worth you elaborating on the attractions of option (3) versus
> (2).
>

I thought I stated my reasons.

>      Atomic operations easy i.e counters, queues
>      Plugins could do many more things, without the need to manage own
> database. i.e there could be a caching plugin, a pubsub plugin, a plugin
> that stored the last 10 packages a user viewed.

However, it's best to go through the use cases, as you said.

* caching plugin  (not ticketed but needed)

    This obviously can only be done through something like redis/memcached.

*  Watch/follow a package extension: http://ckan.org/ticket/936

    This is something that would be bad for a table k/v store and a good fit
for redis.

    Every time someone follows something the package counter has to change.
This precludes us from caching the package view form, as we cannot be sure
whether the counter has changed yet.  So on every view we would be hitting the
database.   I would say that the possibility of losing 10 minutes' worth of
watchers in a disaster (if the server goes down) is worth the benefit.

Also, the k/v data structure does not suit this need well, as we would have to
effectively emulate a many-to-many relation (from users to packages).
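
To illustrate the many-to-many point: in redis this maps onto one set per
package and one per user, and the watcher count is just the size of the set.
Below is a minimal sketch, not a proposed implementation. FakeRedis is a
hypothetical in-memory stand-in so the example runs without a server, and the
key names are made up. SADD, SREM and SCARD are real redis commands, and
redis-py exposes them as methods with the same (lowercase) names.

```python
from collections import defaultdict


class FakeRedis:
    """Hypothetical in-memory stand-in for a redis client (sets only)."""

    def __init__(self):
        self._sets = defaultdict(set)

    def sadd(self, key, member):
        # Like SADD: returns 1 if the member was newly added, else 0.
        if member in self._sets[key]:
            return 0
        self._sets[key].add(member)
        return 1

    def srem(self, key, member):
        # Like SREM: returns 1 if the member was removed, else 0.
        if member in self._sets[key]:
            self._sets[key].remove(member)
            return 1
        return 0

    def scard(self, key):
        # Like SCARD: number of members in the set.
        return len(self._sets[key])


def follow(r, user, package):
    # One set per direction covers both sides of the many-to-many.
    r.sadd("package:%s:followers" % package, user)
    r.sadd("user:%s:following" % user, package)


def unfollow(r, user, package):
    r.srem("package:%s:followers" % package, user)
    r.srem("user:%s:following" % user, package)


def follower_count(r, package):
    # The counter shown on the package page is the set's cardinality.
    return r.scard("package:%s:followers" % package)
```

Following the same package twice does not inflate the count, since set
membership is idempotent.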

 * Download stats for resources: http://ckan.org/ticket/937

   Same reason as above. We do not want to be hitting the database for every
resource view.
Also, counters in this key value table are horrible, as the values are
strings, and to make an increment atomic we would need to manipulate the JSON
in SQL.
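
A rough sketch of what that looks like in practice. The table layout below
(a `kv` table with a text key and a JSON string value) is my assumption of
what such a store would look like, not an agreed schema, and sqlite is used
only so the example is self-contained.

```python
import json
import sqlite3

# Assumed schema: one row per key, value stored as a JSON string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
conn.execute(
    "INSERT INTO kv VALUES (?, ?)",
    ("resource:1", json.dumps({"downloads": 0})),
)
conn.commit()


def increment_downloads(conn, key):
    # Read, decode, modify, encode, write: all for a single +1.
    # As written this is NOT safe with concurrent writers: two requests
    # can read the same value and one increment is lost. Fixing that
    # needs row locking (e.g. SELECT ... FOR UPDATE in PostgreSQL) or
    # editing the JSON inside the SQL statement itself.
    row = conn.execute(
        "SELECT value FROM kv WHERE key = ?", (key,)
    ).fetchone()
    data = json.loads(row[0])
    data["downloads"] += 1
    with conn:  # commits the UPDATE on success
        conn.execute(
            "UPDATE kv SET value = ? WHERE key = ?",
            (json.dumps(data), key),
        )
    return data["downloads"]
```

In redis the equivalent is a single INCR on the key, which is atomic on the
server side.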

 * Config options in WUI extension: http://ckan.org/ticket/277

    I do not like this idea at all.  If we were to do this I would not want
it as a plugin, and I think it deserves its own table, as being mashed in
with the other data would make it even harder for the sysadmin to work out
the current config settings.

  * Apps extension (not yet ticketed -- allow users to register app ideas
...)

   There are many better platforms than ckan to register app ideas. To do
this properly we would want some form of voting and comments.

 * A simpler queueing system

   I think redis is perfect for this, however this is out of scope of this
conversation.

> Friedrich said
>In this, I'd like to make an argument for mongo, while I'm sure Will
>would want to speak for Virtuoso. Shall we go there?

I would actually like to go there.  However, I still think you would need a
good solution to caching/atomic ops, regardless of whether you chose either
of the above two for your valuable data.

> Seb said
>As a general point I am no fan of SQL databases

Funnily enough, I am a big fan of SQL databases.  I just do not like them
abused.   I like the way the schema gives you an implicit model of your
data, that it's got rock solid durability and that they can be queried easily
with a well established standard.  I think this is very important for
valuable data.

The two questions for me are:

1. Will this increase complexity of the system or simplify it?

For me it simplifies it.  Redis is no harder to set up than, say, memcached.
It's *much* easier than something like rabbitmq.

2.  Do we need a new solution to caching or storing semi-valuable data in a
fast way?

I think we do.  I do not see this as a new database, I see it as memcached
with some persistence.

David