[ckan-dev] key value store, caching and redis

Seb Bacon seb.bacon at gmail.com
Sun Jan 30 11:13:22 UTC 2011


My preference would also be for a good old-fashioned,
make-the-SQL-table-you-need-as-you-go approach, for the reasons Rufus
gives, specifically:

 - I don't see that differing schemas will be an issue as plugins
should be loosely coupled to the core -- when this isn't possible, it
suggests to me that our API needs refactoring
 - I like the general principle of keeping our software dependency set
small, although this obviously shouldn't and couldn't become an
absolute rule.
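
To make the first point concrete, here is a minimal sketch of a plugin
creating the table it needs as it goes. It assumes Python's built-in
sqlite3 as a stand-in for the real CKAN database; the table name
`watch_ext_follow` and the helper functions are hypothetical and not part
of any CKAN API. The only link back to the core schema is a package id.

```python
import sqlite3


def ensure_watch_table(conn):
    """Create the plugin's own table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS watch_ext_follow (
            user_name  TEXT NOT NULL,
            package_id TEXT NOT NULL,  -- the only pointer into the core schema
            PRIMARY KEY (user_name, package_id)
        )
        """
    )


def watch(conn, user_name, package_id):
    """Record that a user watches a package; repeated calls are harmless."""
    ensure_watch_table(conn)
    conn.execute(
        "INSERT OR IGNORE INTO watch_ext_follow (user_name, package_id) "
        "VALUES (?, ?)",
        (user_name, package_id),
    )


def watchers(conn, package_id):
    """Return the names of users watching a package, sorted by name."""
    ensure_watch_table(conn)
    rows = conn.execute(
        "SELECT user_name FROM watch_ext_follow "
        "WHERE package_id = ? ORDER BY user_name",
        (package_id,),
    )
    return [row[0] for row in rows]
```

Because the plugin owns the whole table, it also owns any later change
to that table, which is the migration cost raised against this option.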

As a general point I am no fan of SQL databases as I think in our
webby world we rarely need ACID compliance, and normalised table
spaces rarely map well onto real world problem domains, even with a
decent ORM.  But as we are using a SQL database anyway, I don't see a
compelling reason to stop doing so.

Having said that, like Rufus I am open to persuasion :)

Seb

On 29 January 2011 23:12, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> On 28 January 2011 23:42, David Raznick <kindly at gmail.com> wrote:
>> Hello
>>
>> There is a need for ckan plugins to have a place to store things.
>> http://ckan.org/ticket/934
>
> Useful to talk about some simple use cases:
>
> Watch/follow a package extension: http://ckan.org/ticket/936
> Download stats for resources: http://ckan.org/ticket/937
> Config options in WUI extension: http://ckan.org/ticket/277
> Apps extension (not yet ticketed -- allow users to register app ideas ...)
>
> To summarize my views:
>
> * 80/20 solution is very attractive at this early stage (there will
> always be a version 2 if this turns out to be useful!)
> * for simple cases the key/value setup probably buys us most of the
> mileage we need. I would vote for the SQL option since it uses existing
> tech, but could be persuaded.
> * for complex cases option (1) (own tables) could be used though worth
> seeing what one could do with (2) (sql k/v)
>
>> There are 3 ideas on how to achieve this:
>>
>>  1.  Let plugins make their own sql tables.
>>
>> Pros.
>>      Greatest flexibility.   Give power over completely to the plugin.
>> Cons
>>      Migration issues.  Different instances will have different schemas.
>> This will definitely need a lot of manual work on every db upgrade.
>>      There is nothing to stop a plugin from doing this already in its own
>> database/key value store of its choosing.  It will have to handle its own db
>> upgrades.
>
> Since the connection of a plugin into the core schema should be pretty
> limited (e.g. just pointing to package ids), I do not think differing
> schemas will be an issue. For a complex extension I think this is quite an
> attractive option.
>
>>  2.  Make a key value table.  This table will have essentially 3 columns.
>> Namespace, key, value.  The value being a serialised json object.  The
>> namespace will denote what plugin owns that particular row.
>
> Probably, as Friedrich says: namespace, obj_id, key, value
>
>> Pros
>>      Flexible enough for most needs.
>>      Simple to make.
>> Cons
>>      Serializing json in dbs is not great practice.
>>      Data would be messy to handle
>
> But the point here is that we would not be doing any complex querying on
> the value field, so I am not sure data messiness is a problem.
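
The four-column layout discussed for option (2) could be sketched roughly
as follows. This is only an illustration under stated assumptions:
Python's built-in sqlite3 stands in for the real database, and the table
name `plugin_kv` and the functions `init_store`, `kv_set` and `kv_get` are
hypothetical names, not anything that exists in CKAN. Values are stored as
serialised JSON and only ever looked up by (namespace, obj_id, key).

```python
import json
import sqlite3


def init_store(conn):
    """Create the shared key/value table used by all plugins."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS plugin_kv (
            namespace TEXT NOT NULL,  -- which plugin owns the row
            obj_id    TEXT NOT NULL,  -- e.g. a package id
            key       TEXT NOT NULL,
            value     TEXT NOT NULL,  -- serialised JSON
            PRIMARY KEY (namespace, obj_id, key)
        )
        """
    )


def kv_set(conn, namespace, obj_id, key, value):
    """Store a JSON-serialisable value, overwriting any existing one."""
    conn.execute(
        "INSERT OR REPLACE INTO plugin_kv (namespace, obj_id, key, value) "
        "VALUES (?, ?, ?, ?)",
        (namespace, obj_id, key, json.dumps(value)),
    )


def kv_get(conn, namespace, obj_id, key, default=None):
    """Return the decoded value, or `default` if the key is absent."""
    row = conn.execute(
        "SELECT value FROM plugin_kv "
        "WHERE namespace = ? AND obj_id = ? AND key = ?",
        (namespace, obj_id, key),
    ).fetchone()
    return default if row is None else json.loads(row[0])
```

A download-stats plugin might then call
`kv_set(conn, "stats", package_id, "downloads", {"count": 3})` and read it
back with `kv_get`. Note that nothing here queries inside the JSON value.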
>
>> 3.  Use redis as key value store.  Keys can have their own namespace above.
>> This can be optional as a config setting (but obviously needed if a plugin
>> required it)
>>
>> Pros
>>      Simple.
>>      Data store suited to task.
>>      Everything it did would be fast.
>>      Atomic operations are easy, e.g. counters, queues.
>>      Plugins could do many more things without needing to manage their own
>> database, e.g. there could be a caching plugin, a pubsub plugin, or a plugin
>> that stored the last 10 packages a user viewed.
>> Cons
>>     All stored in memory
>>     Another daemon process to run (even though Ubuntu has an up-to-date
>> version in its repositories).
>>     If used for caching and persistent data at the same time, we will have to
>> deal with durability/speed compromises; see
>> http://redis.io/topics/persistence
>>
>> I personally would go with redis.  I think we need to rethink how we do
>> caching at the moment, and this could be the way to do it.
>
> The real question is: is the cost of introducing a new component (in
> install complexity and dev complexity) worth the benefits that redis
> brings over doing key value in the database?
>
> While it is true that if we use redis for caching we are already requiring
> it, and so it is not another sysadmin requirement, we still have dev
> complexity (extension authors need to know about redis), and it seems the
> caching versus 'real db' requirements on redis are somewhat different (as
> you mention).
>
> It would be worth you elaborating on the attractions of option (3) versus (2).
>
> Rufus