[ckan-dev] moderated edits c(r)ep. http://trac.ckan.org/ticket/1129

Rufus Pollock rufus.pollock at okfn.org
Mon May 9 08:35:54 UTC 2011


On 9 May 2011 00:44, David Raznick <kindly at gmail.com> wrote:
>> I'm personally of the view that this fits very naturally with the
>> proposed new 'changeset' vdm model. In that model it is natural and
>> relatively easy to have changesets that have been 'created' but not
>> 'applied' to the 'working copy' (i.e. continuity objects).
>
> I do not see how this would make any difference in this case.   See below
> for my explanation of this.
>>
>> This is more along the lines of your option 1.
>>
>> You don't seem to favour this :-)
>
>
> I am on the fence to be honest.  Proposal 1 is quick and dirty and will do
> mostly what we need if we are prepared to throw away all our 'pending'
> changes when we change our schema and if we do not care about looking at
> historical changes.   I do think that proposal 1 should be separate from vdm
> entirely as they do different things.

Why does it prevent us looking at historical changes -- I don't understand ...

> I will try and explain where I am coming from with an example.  Say we have
> a simplified package dict with a many2many relationship with resources.
>
> {'name': u'anna2',
> 'id': u'afafaff',
> 'resources': [{'id': u'fafafaff',
>                'url': u'http://www.annakarenina.com/'},
>               {'id': u'fafafafa',
>                'url': u'http://www.annakarenina.com/index.json'}]
> }
>
> Say somebody changes a resource but leaves the package intact, so the dict
> becomes:
>
> {'name': u'anna2',
> 'id': u'afafaff',
> 'resources': [{'id': u'fafafaff',
>                'url': u'http://differenturl'},
>               {'id': u'fafafafa',
>                'url': u'http://www.annakarenina.com/index.json'}]
> }
>
> Both the new and old vdm will store changes to *just* the resource. There is
> no getting round that unless the resource has a way of signalling the
> package to make a whole new package dict to store.  This is my proposal 1,
> to manually do this signalling selectively in the logic layer.  vdm cannot
> guess what to signal, nor should it.
> In my opinion our revisioning system (vdm) should not store package dicts
> like that, as they are fragile to change.  It should only store changes to
> individual tables like we currently do.

Right, that's agreed :-) (I wasn't proposing any different ...)
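
A minimal sketch of the point agreed above, assuming hypothetical in-memory
revision tables. The names package_revision, resource_revision,
package_resource, latest and package_dict_at are illustrative only and are
not vdm's API. Only the edited resource gets a new row, and the package dict
is assembled on demand for whichever revision is asked for.

```python
# Hypothetical per-table revision store; names are illustrative, not vdm's API.
# Each row records the revision at which that version of the object was written.
package_revision = [
    {'id': 'afafaff', 'name': 'anna2', 'revision': 1},
]
resource_revision = [
    {'id': 'fafafaff', 'url': 'http://www.annakarenina.com/', 'revision': 1},
    {'id': 'fafafafa', 'url': 'http://www.annakarenina.com/index.json', 'revision': 1},
    # The edit touches only this resource, so only one new row is stored.
    {'id': 'fafafaff', 'url': 'http://differenturl', 'revision': 2},
]
# many2many link table between packages and resources
package_resource = [('afafaff', 'fafafaff'), ('afafaff', 'fafafafa')]


def latest(rows, object_id, revision):
    """Return the newest row for object_id written at or before revision."""
    candidates = [r for r in rows
                  if r['id'] == object_id and r['revision'] <= revision]
    return max(candidates, key=lambda r: r['revision'])


def package_dict_at(package_id, revision):
    """Assemble a package dict as it looked at the given revision."""
    pkg = latest(package_revision, package_id, revision)
    resources = [
        latest(resource_revision, res_id, revision)
        for pkg_id, res_id in package_resource
        if pkg_id == package_id
    ]
    return {
        'name': pkg['name'],
        'id': pkg['id'],
        'resources': [{'id': r['id'], 'url': r['url']} for r in resources],
    }
```

Calling package_dict_at('afafaff', 1) and package_dict_at('afafaff', 2) gives
the two package dicts from the example, even though no package dict was ever
stored.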

> To prove this, say also at one point you try and look at the data the other
> way round, and you consider resources the primary object. You want the
> resource dict to look like.
>
> {'id': u'fafafaff',
>  'url': u'http://differenturl',
> 'packages': [{'name': u'anna2',
>              'id': u'afafaff'}
>              ]
> }
> If we have not pre-emptively stored the resource dict like this, then
> reproducing it will be very hard if all we have is the above package
> dicts.  It will be a lot easier to reproduce this if we make sure we store
> each table separately.
>
> Proposal 2 gives us a way of producing these dicts historically for any way
> we decide to look at the data.  The new changeset model in vdm makes that
> hard, as we will need to join on keys contained in the
> change_object_dicts.  It is much nicer and faster if the data is in an
> indexable table like it currently is.

No that's not true. The new changeset model has a changeobject table
which has as a column (and hence indexable) a 'primary key' for that
object (to be absolutely correct, it is a munge of: object_type (e.g.
package) and original primary key for that object -- we could split
those out in our implementation if we wished). As such looking up
changeobjects is no different from looking up in revision tables as we
would currently do.
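
As a rough sketch of what I mean, using sqlite3 from the standard library.
The column names, the ':' separator in the key, and the helpers object_key
and history are assumptions made for illustration; the actual vdm schema may
well differ.

```python
import json
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE changeobject (
        changeset_id INTEGER NOT NULL,
        object_key   TEXT NOT NULL,   -- assumed munge: '<object_type>:<primary key>'
        data         TEXT NOT NULL    -- serialised dict of the object's fields
    )
""")
# The key is an ordinary column, so it can be indexed like any other.
conn.execute("CREATE INDEX ix_changeobject_key ON changeobject (object_key)")


def object_key(object_type, primary_key):
    """Build the (assumed) munged key from object type and primary key."""
    return '%s:%s' % (object_type, primary_key)


conn.executemany(
    "INSERT INTO changeobject VALUES (?, ?, ?)",
    [
        (1, object_key('package', 'afafaff'),
         json.dumps({'name': 'anna2'})),
        (1, object_key('resource', 'fafafaff'),
         json.dumps({'url': 'http://www.annakarenina.com/'})),
        (2, object_key('resource', 'fafafaff'),
         json.dumps({'url': 'http://differenturl'})),
    ],
)


def history(object_type, primary_key):
    """Return (changeset_id, fields) pairs for one object, oldest first."""
    cur = conn.execute(
        "SELECT changeset_id, data FROM changeobject "
        "WHERE object_key = ? ORDER BY changeset_id",
        (object_key(object_type, primary_key),),
    )
    return [(changeset_id, json.loads(data)) for changeset_id, data in cur]
```

history('resource', 'fafafaff') then walks the changes to that one resource
through the indexed column, much as a query against a resource revision table
would.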

> If you have a new proposal, other than the two offered, that uses the
> changeset model then please add it to the c(r)ep :)

Will do but basically I'm just trying to clarify the pros and cons
around 1 vs 2 and their relationship to vdm :-)

Rufus



