[data-protocols] SLEEP / The Cut-Out

Mon Jan 28 17:10:43 GMT 2013

Hi all.  I came upon the SLEEP concept:
http://www.dataprotocols.org/en/latest/sleep.html

I have a project which is very similar and might be useful for progressing
the idea: http://thecutout.org/ and the protocol specifically:
http://thecutout.org/protocol.html

The protocol even looks strikingly similar, probably because it's a natural
way to do time-ordered updates.  The basic protocol with The Cut-Out is:

GET /bucket?since=INDEX&collection_id=ABCDEF

This returns a sequence of all updates since INDEX (a counter, not a
timestamp), and collection_id asserts the identity of the bucket (if for
any reason the bucket is overwritten, then INDEX will become meaningless
and the client must do a complete sync).
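
As a rough illustration, a client could build that request URL like this.
This is only a sketch: the function name and the example host are made up
here, and only the `since` and `collection_id` parameters come from the
protocol.

```python
from urllib.parse import urlencode


def updates_url(bucket_url, since, collection_id=None):
    """Build the GET URL asking for every update after `since`.

    `bucket_url` is a placeholder for wherever the bucket lives;
    `collection_id` is left out when the client has not seen one yet
    (an assumption, e.g. on a first sync).
    """
    params = {"since": since}
    if collection_id is not None:
        params["collection_id"] = collection_id
    return bucket_url + "?" + urlencode(params)


print(updates_url("https://example.org/bucket", 12, "ABCDEF"))
```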

The result is:

{objects: [
  [INDEX1,
   {type: "object_type",
    id: "unique id for objects of this type",
    data: {unstructured JSON data}
   }
  ], ...]}
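
A client might fold such a response into local state roughly as follows.
This is a sketch under a few assumptions: the wire format is strict JSON
(quoted keys), the local store is a plain dict keyed by (type, id), and the
highest INDEX seen is what the client sends as `since` next time.  The
function name is made up for the example.

```python
import json


def apply_updates(store, since, response_text):
    """Apply a GET response to `store` and return the new `since` value.

    `store` maps (type, id) -> data; this layout is an assumption of the
    sketch, not something the protocol prescribes.
    """
    body = json.loads(response_text)
    for index, obj in body.get("objects", []):
        # Later updates for the same (type, id) overwrite earlier ones.
        store[(obj["type"], obj["id"])] = obj["data"]
        since = max(since, index)
    return since
```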

And to save updates:

POST /bucket?since=INDEX&collection_id=ABCDEF
[{type: "object_type", id: "some id", data: {...}}, ...]
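
For illustration, the save request could be assembled with the Python
standard library like this.  The helper name, the example URL and the
Content-Type header are assumptions of this sketch; nothing is sent until
the request is passed to urllib.request.urlopen().

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_save_request(bucket_url, since, collection_id, objects):
    """Prepare (but do not send) a POST carrying new or changed objects.

    `objects` is a list of {"type": ..., "id": ..., "data": ...} dicts.
    """
    query = urlencode({"since": since, "collection_id": collection_id})
    body = json.dumps(objects).encode("utf-8")
    return Request(
        bucket_url + "?" + query,
        data=body,
        method="POST",
        # Assumed header; check the protocol document for what is expected.
        headers={"Content-Type": "application/json"},
    )
```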

It's not a peer-to-peer system; the server is considered the canonical
source of truth.  Clients must get all updates and resolve conflicts before
POSTing new items (this could cause some problems if there's a lot of
activity – but we are not assuming that objects are independent, and so
there might be a conflict even among different objects).  There are also
other details to the protocol for handling these conflicts.  I haven't
followed REST principles; instead I try to make every request advance the
sync process.  So for example if you POST updates and get a conflict, the
response includes the objects you haven't seen, because you'll need those
to complete a second POST.  I think there's a wide variety of use cases
where this kind of non-REST approach will be more efficient.  Individual
objects are also
not URL-addressable, though there is support for lazy fetching of large
objects: http://thecutout.org/protocol.html#blobs
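
The POST-then-retry flow described above could look something like the
sketch below.  The reply shape is a placeholder: here a conflict is assumed
to come back as {"objects": [[INDEX, object], ...]} and anything else is
treated as success, which may not match the real protocol.  `post` and
`resolve` are hypothetical callables supplied by the caller.

```python
def sync_push(post, since, pending, resolve, max_attempts=5):
    """POST `pending`, resolving conflicts and retrying as needed.

    post(since, pending) -> dict   sends the POST and returns the reply
    resolve(pending, unseen) -> list   merges unseen server objects into
                                       the objects we still want to save
    """
    for _ in range(max_attempts):
        reply = post(since, pending)
        unseen = reply.get("objects")
        if not unseen:
            # No unseen objects came back, so the save went through.
            return reply
        # Conflict: the reply already carries what we were missing, so
        # no separate GET is needed before trying again.
        pending = resolve(pending, [obj for _, obj in unseen])
        since = max(index for index, _ in unseen)
    raise RuntimeError("still conflicting after %d attempts" % max_attempts)
```

The point of the sketch is only that the conflict reply does double duty:
it rejects the write and delivers the missing updates in one round trip.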

Then I suppose I should note "why" I wrote The Cut-Out.  Because I could,
of course.  But more specifically I was working on a project that involved
syncing data across clients, and while the project got thrown away I had my
mind all up in the concept, and I started seeing more and more reasons for
time-ordered data.  Also I saw a specific use case of stand-alone HTML
applications with no server, which can be entirely functional but lack
backups and data synchronization across devices.  That's what The Cut-Out
actually provides – both a server and client that handle browser-based data
persistence, along with authentication using Persona (
https://developer.mozilla.org/en-US/docs/persona).  The server is also
written with this particular use case in mind, emphasizing low overhead for
individual buckets, and a write-heavy workload (there are many cases where
there will be writing and never reading).  The server is arguably
over-optimized ;)

There are some private parts of the server protocol that allow servers to
balance and move buckets around, similar I think to aspects of the CouchDB
protocol, but they are very specific to the implementation.  The general
concept is not peer-to-peer, meaning that a central canonical server is
essential to how the syncing works.