[wdmmg-discuss] Failed to port datastore to RDF, will go Mongo

Wed Nov 24 13:09:28 UTC 2010

* [2010-11-24 11:56:20 +0100] Friedrich Lindenberg <friedrich at pudo.org> wrote:

] [ On the plus side]...
]
] * Lots of coolness, sucking up to linked data people.

I don't see how these are particularly good things.

] * Further research regarding knowledge representation.

This can never hurt.

] [ On the minus side]...
] 
] * Unstable and outdated technological base. No triplestore I have
]   seen so far seemed on par with MySQL 4. 

It really is abysmal. Virtuoso seems to be the only exception to
this. 

] * No freedom wrt to schema, instead modelling overhead. Spent 30
]   minutes trying to find a predicate for "Euro".

One way or another you are not relieved of the need to model your
data. What you are really seeing here is that the vocabularies for
representing statistical or financial data are badly documented and
the best practices aren't clear.

] * Scares off developers. Invested 2 days researching this, which is
]   how long it took me to implement OHs backend the first time
]   around. Project would need to be sustained through linked data grad
]   students. 

This is the big problem. I wouldn't go so far as to say linked data
grad students, but it does take programmers with a background in
knowledge representation and/or logic programming and expert systems.
This is not an exceptionally common skillset.

] * Less flexibility wrt to analytics, querying and aggregation. 
]   SPARQL not so hot.

Stefan Urbanek was telling us at OGDC about some data that he's
working with that has the same shape (multidimensional statistics) and
he's finding it hard to deal with in SQL. For this type of dataset a
lot of it should just be pre-computed and put in the store - this is
true with relational databases, triplestores or any NoSQL thing.
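
To make the pre-computing point concrete, here is a rough sketch in
Python. It assumes rows with made-up dimensions (year, dept, region)
and an amount, and sums the amount for every combination of
dimensions up front, so the store only ever has to look totals up.
The names ROWS, DIMENSIONS and precompute_totals are hypothetical and
the figures are invented; none of this is taken from the wdmmg code.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical spending rows; dimension names and figures are made up.
ROWS = [
    {"year": 2009, "dept": "health", "region": "north", "amount": 120.0},
    {"year": 2009, "dept": "health", "region": "south", "amount": 80.0},
    {"year": 2009, "dept": "defence", "region": "north", "amount": 50.0},
    {"year": 2010, "dept": "health", "region": "north", "amount": 130.0},
]
DIMENSIONS = ("year", "dept", "region")


def precompute_totals(rows, dimensions):
    """Sum `amount` for every combination of dimensions (a full rollup).

    Keys are tuples of (dimension, value) pairs in the order given by
    `dimensions`; the empty tuple holds the grand total.
    """
    totals = defaultdict(float)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in rows:
                key = tuple((d, row[d]) for d in dims)
                totals[key] += row["amount"]
    return dict(totals)


# These totals are what would be written to the store, whatever it is.
TOTALS = precompute_totals(ROWS, DIMENSIONS)
```

A query such as "health spending in 2009" then becomes a lookup,
TOTALS[(("year", 2009), ("dept", "health"))], instead of an
aggregation at request time. The number of combinations grows quickly
with the number of dimensions, so in practice you would probably only
pre-compute the ones the UI actually asks for.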

There are plenty of vendor-specific (and again badly documented)
SPARQL extensions to address this.

] * Good chance of chewing up the UI, much harder to implement editing.

Personally I'm not very good at this no matter what the storage...

] I normally enjoy learning new stuff. This is just painful. Most 
] of the above points are probably based on my ignorance, but it
] really shouldn't take a PhD to process some gov spending tables. 
] 
] I'll now start a mongo effort because I really think this should
] go schema-free + I want to get stuff moving. If you can hold off
] loading Uganda and Israel for a week that would of course be very
] cool, we could then try to evaluate how far this went. Progress
] will be at: http://bitbucket.org/pudo/wdmmg-core 

Please just try to make sure that every data point, aggregation,
etc. has a URI so it can be referred to. This is the essential
forward-compatible hook.
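
As a rough illustration of what I mean, something like the following
would do, whatever the backend. The base address, the path layout and
the function names entry_uri and aggregate_uri are all assumptions
made up for this sketch, not an existing wdmmg convention.

```python
from urllib.parse import quote

# Hypothetical base address; not a real endpoint.
BASE = "http://example.org/wdmmg"


def entry_uri(dataset, entry_id):
    """Return a stable URI for a single data point in a dataset."""
    return "%s/%s/entry/%s" % (
        BASE,
        quote(dataset, safe=""),
        quote(str(entry_id), safe=""),
    )


def aggregate_uri(dataset, **dims):
    """Return a stable URI for an aggregation over the given dimensions.

    Dimensions are sorted by name so that the same aggregation always
    gets the same URI regardless of argument order.
    """
    parts = [
        "%s=%s" % (quote(name, safe=""), quote(str(value), safe=""))
        for name, value in sorted(dims.items())
    ]
    return "%s/%s/aggregate/%s" % (
        BASE,
        quote(dataset, safe=""),
        ";".join(parts),
    )
```

For example, aggregate_uri("cra", year=2009, dept="health") gives
http://example.org/wdmmg/cra/aggregate/dept=health;year=2009. The
exact scheme matters much less than the URIs being deterministic and
not changing once published.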

Cheers,
-w
-- 
William Waites
http://eris.okfn.org/ww/foaf#i
9C7E F636 52F6 1004 E40A  E565 98E3 BBF3 8320 7664