[okfn-help] vdm: strange behavior

Martijn Faassen faassen at startifact.com
Fri Aug 20 13:58:01 BST 2010


Hi there,

I think I've found the cause for the misbehavior of get_as_of, which
sometimes returns
the wrong revision.

This is the meat of  get_as_of():

            revision_class = self.__revision_class__
            # TODO: when dealing with multi-col pks will need to update this
            # (or just use continuity)
            out = sess.query(revision_class).join('revision').\
                filter(
                    Revision.timestamp <= revision.timestamp
                ).\
                filter(
                    revision_class.id == self.id
                ).\
                order_by(
                    Revision.timestamp.desc()
                )
            return out.first()

revision_class is the class of the object representing the revision of
my class. Both my class and the revision class are ORM mapped. I'll
call that the 'revision' here, even though it's distinct from the
Revision that can be joined to it.

This line:

                     Revision.timestamp <= revision.timestamp

Makes sure we get only revisions that happened before (or at the same
time) as the revision we're checking.

This then sorts the revisions so that the most recent revision comes first.

                order_by(
                    Revision.timestamp.desc()
                )

and this then takes the most recent revision.

            return out.first()

This:

                filter(
                    revision_class.id == self.id
                ).\

makes sure we actually get the revision of the right object,
identifying it with the 'id' column. (which in fact makes vdm require
the use of a 'id' for the primary key if I'm not mistaken).

But in my tests, the timestamp of the revisions is always identical;
the test runs so fast multiple revisions get made in the same second,
and MySQL at least only stores timestamps with second granularity. As
a result, it can easily happen that instead of
the correct previous revision, we get the revision we just committed,
as they both happened in the same second.

This cannot be resolved by changing this line:

                Revision.timestamp <= revision.timestamp

to this:

                Revision.timestamp < revision.timestamp

as this would cause us to *miss* the correct revision.

Is instead this not possible? After all, we know the revision with that
particular timestamp must be in the database.

                Revision.timestamp == revision.timestamp

of course that wouldn't solve our problem, where there are two
revisions in the exact same timestamp.

Instead, I'm seriously confused why the code does all this timestamp
stuff and ordering anyway, while we have a perfectly good revision id.
When I add this extra filter:

   filter(
      revision_class.revision_id == revision.id)
   ).

everything seems to work reliably. But why isn't the query simply this?

revision_class = self.__revision_class__

return sess.query(revision_class).filter(revision_class.id ==
self.id).filter(revision_class.revision_id == revision_id).first()

this appears to me more reliable and more efficient to boot (assuming
indexes exist),
and a join isn't even necessary! And trying it out it appears to work.

There must be a reason, so please enlighten me.

Regards,

Martijn



More information about the okfn-help mailing list