[ckan-dev] moderated edits c(r)ep. http://trac.ckan.org/ticket/1129

David Raznick kindly at gmail.com
Fri May 13 12:59:38 UTC 2011


On Fri, May 13, 2011 at 11:02 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:

> On 13 May 2011 01:32, David Raznick <kindly at gmail.com> wrote:
> > Firstly, the plan on pending changes should be simpler than what was
> > outlined in that thread.   4 states seem adequate.
>
> Slightly messed up: should have sent start of that part of the thread:
>
> <http://lists.okfn.org/pipermail/okfn-help/2010-October/000871.html>
>
> Also this wasn't meant to be prescriptive (ie. let's do it like this)
> but here's some things we thought about before and issues we came up
> with.
>
> > * 'active'  (current approved)
> > * 'pending-approval'  (changes that have not been looked at by the
> > moderator)
> > * 'completed-approval' (changes that have been looked at by the
> moderator)
> > * 'deleted'
> >
> > Only the moderator/sysadmin can delete.  The moderators can go through the
> > pending changes and apply them as they see fit.  When they apply them, the
> > revision gets marked as completed-approval and new active rows are made.
>
> New active row where?
>
The new active row will be in the continuity table and in the revision table.
It will stay in the continuity table until someone makes another edit.
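
To make the flow concrete, here is a minimal in-memory sketch of the four
states and of where the new active row ends up.  The names (Store, Revision,
edit, approve) are made up for illustration and are not CKAN code.  It also
assumes older active rows are simply left in the revision list, which we have
not decided.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The four states proposed in this thread.
STATES = ("active", "pending-approval", "completed-approval", "deleted")


@dataclass
class Revision:
    revision_id: int
    object_id: str
    data: dict
    state: str


@dataclass
class Store:
    # Stand-in for a revision table: every change ever made.
    revisions: List[Revision] = field(default_factory=list)
    # Stand-in for the continuity table: latest change per object,
    # regardless of its state.
    continuity: Dict[str, Revision] = field(default_factory=dict)

    def _add(self, object_id: str, data: dict, state: str) -> Revision:
        assert state in STATES
        rev = Revision(len(self.revisions) + 1, object_id, dict(data), state)
        self.revisions.append(rev)
        self.continuity[object_id] = rev
        return rev

    def edit(self, object_id: str, data: dict) -> Revision:
        """A community edit: goes in as pending-approval."""
        return self._add(object_id, data, "pending-approval")

    def approve(self, pending: Revision) -> Revision:
        """Moderator applies a pending change: mark it as looked at and
        create a new active row carrying the same data."""
        if pending.state != "pending-approval":
            raise ValueError("only pending revisions can be approved")
        pending.state = "completed-approval"
        return self._add(pending.object_id, pending.data, "active")
```

After approve() the continuity entry is the new active row; the next edit()
replaces it with a pending one, which is the behaviour described above.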


>
> > The continuity object just holds the latest change independent of the state
> > above.  This may make it harder to query the database for 'active'
> > revisions, but *most* of the time we would hope that the community edits
> > would be accepted, so it seems that keeping the latest edit in the continuity
> > table makes sense.
>
> Why do this rather than having current active revision? It would seem
> more natural for continuity to act like the continuity?
>

I think it precisely acts like the continuity should act.  It is just the
object that *all* the changes go through regardless of state.


> To parse that I think that means you agree with ticket 1137 but want
> limited version. The question I still think we need to clarify is why
> keep state in main continuity objects?


Personally I would be happy to remove continuity objects entirely and just
have the revision tables.


> If it were removed we would
> have a very simple model where revisioning had no impact on your
> normal tables. Of course, one answer is that we do need it for pending
> stuff if we go with option (b).
>

> >> The changeset model would work with the following provisos.
> >>
> >> * No foreign key can change.
> >
> >> Why? (not disagreeing just not sure i understand)
> >
> > No foreign key can change if you want a way to query the revision history
> > easily.  You will have to join on a column in the change_object table.  See
> > queries below.
>
> I still don't understand here. Which FKs (all of them) and do you mean
> FK or the PK pointed to by a FK or ....? I don't know whether this is
> important but I don't understand :)
>

Given the example below, the resource_revision table has 2 foreign keys.
These foreign keys would be in the change_object table.  How do we get out
all the resources at a particular time that are related to a package?
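
For comparison, here is a rough sketch of that question answered against a
per-table revision table, where the foreign key is an ordinary column.  The
schema is cut down and assumed for illustration (a package_id column directly
on resource_revision, ISO date strings as the cached timestamp, no state
column); it is not the real CKAN schema, and it uses sqlite only to stay
self-contained.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE resource_revision ("
        " id TEXT, package_id TEXT, url TEXT, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO resource_revision VALUES (?, ?, ?, ?)",
        [
            ("r1", "pkg1", "http://a.example/v1", "2011-05-01"),
            ("r1", "pkg1", "http://a.example/v2", "2011-05-10"),
            ("r2", "pkg1", "http://b.example/v1", "2011-05-05"),
            ("r3", "pkg2", "http://c.example/v1", "2011-05-02"),
        ],
    )
    return conn


def resources_at(conn: sqlite3.Connection, package_id: str, timestamp: str):
    """Resources of one package as they stood at `timestamp`.

    For each resource id, take its newest row at or before `timestamp`,
    then keep it only if that row points at the requested package.
    """
    query = """
        SELECT r.id, r.url
        FROM resource_revision AS r
        WHERE r.package_id = :package_id
          AND r.timestamp = (
              SELECT MAX(r2.timestamp)
              FROM resource_revision AS r2
              WHERE r2.id = r.id AND r2.timestamp <= :timestamp
          )
        ORDER BY r.id
    """
    params = {"package_id": package_id, "timestamp": timestamp}
    return conn.execute(query, params).fetchall()
```

The filter on package_id is a plain column comparison here.  With the foreign
keys held in the change_object table, the same filter has to go through that
table instead, which is the difficulty I am pointing at.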

> I *really* think you want the *active* item 'cached' on the continuity
> not the latest edit -- most operations are going to be read operations
> ...
>

I understand that, and I put that in the disadvantages, along with why it was
necessary to do some benchmarking.  The most common reads should be cached in
other ways anyway.


>
> > ***** All the unapproved edits. *****
> >
> > We will need a list of all the unapproved edits.  To do this we need the
> > distinct revisions affecting all the tables.  We will get out these
> > revisions like we do currently, i.e.
> > QUERY 3
> >    select revision_id, timestamp from package_revision
> >    where state = 'pending-approval' and id = 'fdsfs'
> >    union
> >    select revision_id, timestamp from package_resource_revision
> >    where state = 'pending-approval' and package_id = 'fdsfs'
> >    union
> >    select revision_id, timestamp from package_resource_revision
> >    join resource_revision
> >      on package_resource_revision.resource_id = resource_revision.id
> >    where resource_revision.state = 'pending-approval'
> >      and package_id = 'fdsfs'
> >
> > This sorted by timestamp gives us our moderator queue.  For each item in the
> > queue we will need to get out the package like so.
> > QUERY 4
> >    select distinct on (id) id, timestamp, name, ...
> >    from package_revision
> >    where id = 'fdsfs' and timestamp <= timestamp_got_above
> >    order by id, timestamp desc
> >
> > and to get out the resources for each timestamp we can do the same QUERY 2
> > above.
> >
> > ***** A point in history/time or a particular revision. *****
> >
> > This is essentially covered by QUERY 4 and QUERY 2 above at a particular
> > time.
>
> An immediate question I have here is why aren't we joining on the
> revision table.

I stated why in my original email.

"I will also assume the timestamp is cached on all the revision tables for
the sake of not writing the join to the revision table each time"


> People should approve revisions surely (with all
> changes associated to that revision)  rather than individual parts of
> a revision. A Revision is the atomic change to the system -- you don't
> pick bits of it.
>

Accepting change(s) would be more useful as a merge though.

I see our use case to be like a version control system with 2 branches: the
active (stable) one and the pending (default) one.  The moderator picks what
they want from default to make a new active.  A 'merge' revision will be made
and the merged changes will be marked.  So it's very much like Wikipedia with
a moderator.
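
A toy sketch of that merge step, to show what I mean.  Everything here is
hypothetical (the Change class, the merge function, field-level changes, the
shape of the merge record), and it leaves open what happens to changes the
moderator looked at but did not pick; here they just stay pending.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Change:
    change_id: int
    field: str
    value: str
    state: str = "pending-approval"


def merge(
    active: Dict[str, str],
    pending: List[Change],
    picked: Iterable[int],
) -> Tuple[Dict[str, str], dict]:
    """Apply the changes the moderator picked on top of the active version.

    Returns the new active version plus a 'merge' revision record listing
    which changes went in.  The old active dict is not modified.
    """
    picked_ids = set(picked)
    new_active = dict(active)
    merged: List[int] = []
    # Apply in the order the changes were made.
    for change in sorted(pending, key=lambda c: c.change_id):
        if change.change_id in picked_ids and change.state == "pending-approval":
            new_active[change.field] = change.value
            change.state = "completed-approval"  # mark the merged change
            merged.append(change.change_id)
    merge_revision = {"type": "merge", "merged_changes": merged}
    return new_active, merge_revision
```

For example, with pending changes 1 (a new title) and 2 (a spam url), calling
merge(active, pending, picked=[1]) gives a new active version with only the
title changed, and change 2 stays in the queue.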


>
> You know "Revision" can have state right?
>

Yes but I do not see how that solves anything.


> Generally:


> SELECT FROM changeset
>  JOIN changeobject
>  ORDER BY ...
>
> Specific object (e.g. package with id x):
>
> SELECT FROM changeset
>  JOIN changeobject
>  WHERE changeobject.table = 'package' AND changeobject.id = package.id
>
> * All the unapproved edits (moderation queue).
>
> just add pending filter to above ...
>
> * A point in history/time or a particular revision.
>
> straightforward i think ...
>

You have not dealt with the many to many relationship at all!!!
Simplifying in this way does not help at all.

For the proposals put forward, most of the implementation details have been
worked out.
I need another solid proposal if I am to go about it differently.

David