[okfn-discuss] We Need Distributed Revision/Version Control for Data

Benoit Boissinot bboissin at gmail.com
Mon Jul 12 20:19:07 UTC 2010


On Mon, Jul 12, 2010 at 9:50 PM, Patrick Anderson <agnucius at gmail.com> wrote:
> Peter Murray-Rust wrote:
>> I have used normal SCM to store some of my data.
>> My problem is that often updates take lots of time even if very little has changed.
>
> I wonder if http://git-SCM.com solves any of these performance issues.
>
> It is claimed to be very efficient - even with binary data.
>
From the post:
Our approach was based heavily on the mercurial/git conceptual model
and used as data structure the natural one implied by the domain model
(~ database rows but not quite) — in essence we dump to json for each
field and then do diffs on the json.
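
As a rough illustration of that per-field approach (this is not the code
from the post; the function name field_diffs and the sample rows are made
up here), a minimal Python sketch using only the standard library could
look like this:

```python
import difflib
import json


def dump_field(row, field):
    """Dump one field to pretty-printed JSON lines; a missing field gives []."""
    if field not in row:
        return []
    return json.dumps(row[field], indent=2, sort_keys=True).splitlines()


def field_diffs(old_row, new_row):
    """Return {field: unified diff lines} for every field whose JSON changed."""
    diffs = {}
    for field in sorted(set(old_row) | set(new_row)):
        old_lines = dump_field(old_row, field)
        new_lines = dump_field(new_row, field)
        if old_lines == new_lines:
            continue  # unchanged fields produce no diff at all
        diffs[field] = list(
            difflib.unified_diff(
                old_lines,
                new_lines,
                fromfile="old/" + field,
                tofile="new/" + field,
                lineterm="",
            )
        )
    return diffs


# Hypothetical rows standing in for two revisions of one domain object.
old = {"name": "Acme", "tags": ["a", "b"]}
new = {"name": "Acme", "tags": ["a", "c"], "url": "http://example.org"}

diffs = field_diffs(old, new)
for field, lines in diffs.items():
    print("\n".join(lines))
```

Only the fields that changed ("tags" and "url" here) get a diff, so a
revision's size tracks what was edited rather than the size of the whole
row. Note that this sketch still holds both revisions in memory.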

Git (and Hg) require the data to fit in memory (and usually need more
memory than that), which can be a problem with large datasets.

Benoit



