[ckan-dev] moderated edits c(r)ep. http://trac.ckan.org/ticket/1129

Rufus Pollock rufus.pollock at okfn.org
Thu May 12 14:57:52 UTC 2011


On 9 May 2011 10:19, David Raznick <kindly at gmail.com> wrote:

There is more discussion of vdm in a thread on okfn-help from last
autumn that may be useful :-)

<http://lists.okfn.org/pipermail/okfn-help/2010-October/000884.html>

My strong feeling is that we should resolve #1077 as a prerequisite
for this. I've just done a *big* revision of #1077 and created tickets
detailing proposed changes to vdm. I think you may now agree with the
conclusion reached there :-)

> On Mon, May 9, 2011 at 9:35 AM, Rufus Pollock <rufus.pollock at okfn.org>
> wrote:
[...]
>> Why does it prevent us looking at historical changes -- I don't understand
>> ...
>
> It prevents us from looking at historical changes in the dictized format. We
> could still look at historical change in vdm of course.

Yes, you're absolutely right that it means we can't load into our
current structure and so could only do things like 'diffs'. That said,
you *could* do this if you had migrated the old changeobject dicts
when you upgraded (just as you will have to do in the alternative
model).

[...]

>> > Proposal 2 gives us a way of producing these dicts historically for any
>> > way
>> > we decide to look at the data.  The new changeset model in vdm makes
>> > that
>> > hard as we will need to join on keys contained in the
>> > change_object_dicts, it is much nicer and faster if the data is in an
>> > indexable table like it currently is.
>>
>> No that's not true. The new changeset model has a changeobject table
>> which has as a column (and hence indexable) a 'primary key' for that
>> object (to be absolutely correct it is a munge of: object_type (e.g.
>> package) and original primary key for that object -- we could split
>> those out in our implementation if we wished). As such looking up
>> changeobjects is no different from looking up in revision tables as we
>> would currently do.
>
> The table does not have an index on the foreign keys contained in the
> change_object_dict though.  In my tests of the different query strategies I
> had to add an index on the package_id foreign key in the
> package_extras_revision table or the query took much longer.

Yes, that's definitely true.
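
To make the indexing point concrete, here is a minimal sketch using
Python's sqlite3 module. The table is a cut-down, assumed version of
package_extras_revision (not the real CKAN schema), and the index name
idx_per_package_id is made up for illustration. It only shows that a
lookup by package_id switches from a full scan to an index search once
the index exists; it says nothing about actual timings in CKAN.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the detail strings of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
# Assumed, simplified columns; the real revision table has more.
conn.execute(
    "CREATE TABLE package_extras_revision ("
    " id TEXT, revision_id TEXT, package_id TEXT, key TEXT, value TEXT)"
)
lookup = "SELECT key, value FROM package_extras_revision WHERE package_id = ?"

# Without an index on the foreign key the planner has to scan the table.
plan_before = query_plan(conn, lookup, ("pkg-1",))

# Hypothetical index name; this is the kind of index David describes adding.
conn.execute(
    "CREATE INDEX idx_per_package_id ON package_extras_revision (package_id)"
)
plan_after = query_plan(conn, lookup, ("pkg-1",))

print(plan_before)
print(plan_after)
```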

> The changeset model would work with the following provisos.
>
> * No foreign key can change.

Why? (not disagreeing, just not sure I understand)

> * We are willing to do an extra join for each relationship we have, i.e. the
> join will firstly be to continuity object and then back to the change_object
> table.
> * No continuity object can get deleted.

This I don't understand so again please clarify. I'd assume the join
would be lazy rather than an FK relationship (I think we *need* to
remove FKs from revision objects / changeobjects to continuity
whatever route we go down).

> These I admit are not too unreasonable.  However, the querying due to the
> extra join is more cumbersome.  It would most likely be slower too, due to
> the extra join and because you would always be joining back to the *very
> large* change_object table.

Yes, though one that was indexed.
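
For reference, a rough sketch of the kind of query being discussed,
again with Python's sqlite3. Everything here is an assumption made for
illustration: the table layouts, the object_key column holding the
munged "object_type:primary_key" value, the JSON-encoded data column
and the function name package_history_for_extra. It is not the vdm
implementation, just the shape of the join: from the extra to its
continuity object (package) and then back to the indexed change_object
table.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Continuity tables (assumed, heavily simplified).
CREATE TABLE package (id TEXT PRIMARY KEY);
CREATE TABLE package_extra (id TEXT PRIMARY KEY, package_id TEXT, key TEXT);
-- One row per object per changeset; object_key is the munged
-- "object_type:primary_key" value and is indexed.
CREATE TABLE change_object (
    changeset_id INTEGER,
    object_key TEXT,
    data TEXT
);
CREATE INDEX idx_change_object_key ON change_object (object_key);
""")
conn.execute("INSERT INTO package VALUES ('p1')")
conn.execute("INSERT INTO package_extra VALUES ('e1', 'p1', 'licence')")
conn.executemany("INSERT INTO change_object VALUES (?, ?, ?)", [
    (1, "package:p1", json.dumps({"title": "First title"})),
    (2, "package:p1", json.dumps({"title": "Second title"})),
    (2, "package_extra:e1", json.dumps({"key": "licence", "value": "odc-by"})),
])


def package_history_for_extra(conn, extra_id):
    """History of the package an extra belongs to, via the continuity object."""
    rows = conn.execute("""
        SELECT co.changeset_id, co.data
        FROM package_extra AS e
        JOIN package AS p ON p.id = e.package_id                      -- continuity
        JOIN change_object AS co ON co.object_key = 'package:' || p.id -- and back
        WHERE e.id = ?
        ORDER BY co.changeset_id
    """, (extra_id,)).fetchall()
    return [(changeset_id, json.loads(data)) for changeset_id, data in rows]


history = package_history_for_extra(conn, "e1")
print(history)
```

The second join is the extra step compared with joining revision tables
directly; in this sketch it goes through the index on object_key.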

Rufus



