[openbiblio-dev] [bibserver] changing handling of collections from frontend to get them working properly again now that collections are stored as separate objects (937d3e2)

Sun Oct 9 10:31:45 UTC 2011

This is regarding this comment thread:

<https://github.com/okfn/bibserver/commit/937d3e2da406e69335dd6c61eaa2dccc8b8ca83a#commitcomment-634162>

On 6 October 2011 19:00, markmacgillivray
<reply+c-635196-fdd7f383f70a00f747d7f4636419210bc63876ad at reply.github.com>
wrote:
> The collection slug is something that can be known and searched for,
> if it is in the records. Whereas the ID is irrelevant to a user. Why
> should they have to look for a collection ID?

But surely there is a distinction between *presentation* and the *data
model*. It seems clear to me that when referring out to objects from
another object (e.g. referencing a collection from a record) we should
use the *id* of that remote object to make the reference. Of course,
when presenting things to the user in the WUI we may want to show the
collection slug or owner + collection slug. Thus I don't really
understand the reference to "why should they have to look for a
collection ID" :-) ...
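
To make the distinction concrete, here is a minimal sketch in Python. It is not bibserver's actual code: `COLLECTIONS`, `record`, `display_collection` and the sample values are all made up for illustration. The record stores only the collection id, and the owner + slug are looked up when we present it.

```python
# Hypothetical in-memory stand-in for the collection store
# (an assumption for illustration, not bibserver's real storage API).
COLLECTIONS = {
    "c0ffee01": {"id": "c0ffee01", "owner": "alice", "slug": "my-papers"},
}

# The record refers to its collection by id only.
record = {"title": "An example record", "collection": "c0ffee01"}


def display_collection(record, collections):
    """Resolve the stored collection id into an owner/slug string for the UI."""
    coll = collections[record["collection"]]
    return "%s/%s" % (coll["owner"], coll["slug"])


print(display_collection(record, COLLECTIONS))
```

If the slug is later renamed, the record itself does not need to change; only the lookup result does.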

> Sure, the delete does only do it by slug, but that is just because
> previously slugs were unique. I am not yet convinced that they should
> not still be unique. However it does not matter, because delete
> functionality will be changed to find a collection by ID and delete
> it, then delete any records that claim to be in that collection.

Great. That sounds sensible. Could you also point to the relevant
ticket / user story for these underlying "search for collection (by
some attribute)" and "delete a collection" activities? I think that
would clarify a lot of what is going on here :-)
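
In the meantime, here is roughly how I read the delete behaviour you describe, as a sketch only. The function name `delete_collection`, the plain dict/list storage and the sample data are assumptions for illustration, not the real bibserver code.

```python
def delete_collection(collection_id, collections, records):
    """Delete the collection with this id, then every record that claims it.

    Returns the number of records removed.
    """
    if collection_id not in collections:
        raise KeyError(collection_id)
    del collections[collection_id]
    kept = [r for r in records if r.get("collection") != collection_id]
    removed = len(records) - len(kept)
    records[:] = kept  # mutate in place so the caller sees the change
    return removed


# Two users may pick the same slug, so the slug alone is ambiguous.
collections = {
    "c1": {"id": "c1", "owner": "alice", "slug": "papers"},
    "c2": {"id": "c2", "owner": "bob", "slug": "papers"},
}
records = [
    {"title": "A", "collection": "c1"},
    {"title": "B", "collection": "c2"},
    {"title": "C", "collection": "c1"},
]

removed = delete_collection("c1", collections, records)
print(removed, [r["title"] for r in records])
```

Deleting by id leaves bob's collection and its record alone, even though both collections share the slug "papers".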

> The value of storing in an index is that the values have meaning and
> can be found. Why have an ID that is not useful in this case when we
> can just have a slug that is unique and that users can search for?

The general reason for having unique and slightly meaningless IDs:
they don't change (while slugs etc. may). Furthermore, why should we
force people to have unique slugs across the whole system? (It makes
sense per user but not across the whole system IMO.)

> So whilst we could use the id for foreign keys, why not make the slug
> the ID? Also, why is the collection name stored under the "label" key

This is a definite possibility (ie. making slugs into the ids).
However, I do have doubts about it. Reasons:

* slugs can change
* only slug + username is unique (bad for a primary key)
* slugs are long, not of a fixed length or standard generation and are
designed for url use and some human comprehension

For more on choosing primary keys see:

<https://github.com/okfn/openspending/issues/96>
<http://www.techrepublic.com/article/the-great-primary-key-debate/1045050>

> now? We did not previously have slugs and labels anyway until you
> added them, and this does now make it harder to operate on collections
> and associated records.

In most systems we have built over the last few years I have
consistently seen the pattern that objects have:

* id: unique, not particularly human-readable, primary key id
* name / slug: a shortish url-usable slug (may or may not be unique)
* label / title: a longer somewhat descriptive title/label which is
usable in normal text and display

I think collections are a standard example of a thing that will need
these attributes.
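
As a rough sketch of that pattern in modern Python (hypothetical names throughout; `Collection` and `CollectionStore` are illustrations, not bibserver classes): the id is generated and opaque, the slug only has to be unique per owner, and the label is free text.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Collection:
    owner: str   # username of the owning user
    slug: str    # shortish, url-usable; unique only per owner
    label: str   # longer descriptive title for display
    # Opaque primary key; never changes once assigned.
    id: str = field(default_factory=lambda: uuid.uuid4().hex)


class CollectionStore:
    """Toy in-memory store keyed by id, enforcing (owner, slug) uniqueness."""

    def __init__(self):
        self._by_id = {}

    def add(self, coll):
        for existing in self._by_id.values():
            if (existing.owner, existing.slug) == (coll.owner, coll.slug):
                raise ValueError("slug already used by this owner")
        self._by_id[coll.id] = coll
        return coll.id

    def get(self, coll_id):
        return self._by_id[coll_id]


store = CollectionStore()
alice_id = store.add(Collection(owner="alice", slug="papers", label="Alice's papers"))
bob_id = store.add(Collection(owner="bob", slug="papers", label="Bob's reading list"))
```

Both users can have a "papers" collection, while a second "papers" for alice would be rejected.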

> As records have a field called collection, so too can collections have
> a field called collection. And this could also be the ID. Then, we

Why? Why would collections belong to another collection? (I'm not
saying this is a bad idea, I'd just like to see the underlying user
story for this ...)

> could perform searches on collection=blah, as used to be possible. If
> you think there is value in also having a label, we can still support
> that.

Label can be renamed to title if you want (and if so we should adopt a
standard convention of using title rather than label throughout -- may
be a good idea ...)

Rufus

> On Thu, Oct 6, 2011 at 12:51 PM, Rufus Pollock
> <reply at reply.github.com>
> wrote:
>> Why have we changed this from storing the collection id to the collection slug? The collection slug is not unique across collections (only the tuple (userid, slug) is). I am concerned this may cause bugs e.g. the delete of records from an existing collection seems to *just* use the slug rather than slug plus userid. In addition good practice is to use the id for "foreign keys". Why violate this for collection references from records? If it is to do with display then IMO it would be better to get the collection out when we are doing display rather than change what we store ...
>>
>> Aside: would you mind dividing up commits a bit more? This commit not only changes this attribute on records but does stuff with graph.html (is it related?). In addition it also seems to have changes that should have been part of a merge (I deleted "pkg" stuff back in my commits that you in theory had merged :-) ). Smaller, more granular commits mean easier understanding :-)
>>
>> --
>> Reply to this email directly or view it on GitHub:
>> https://github.com/okfn/bibserver/commit/937d3e2da406e69335dd6c61eaa2dccc8b8ca83a#commitcomment-634162
>>
>
> --
> Reply to this email directly or view it on GitHub:
> https://github.com/okfn/bibserver/commit/937d3e2da406e69335dd6c61eaa2dccc8b8ca83a#commitcomment-635196
>

-- 
Co-Founder, Open Knowledge Foundation
Promoting Open Knowledge in a Digital Age
http://www.okfn.org/ - http://blog.okfn.org/