[ckan-dev] dataset urls

Rufus Pollock rufus.pollock at okfn.org
Wed Apr 25 00:24:09 UTC 2012


On 25 April 2012 01:07, David Raznick <kindly at gmail.com> wrote:
>
>
>> We are talking about comment links atm.
>>
>> Re name change you get broken
>> links on github if you rename your repo.
>>
>> We could also implement a
>> simple redirector by looking up in the dataset revision table for old
>> names :-)
>
>
> We considered this but it would not work unless we were sure that all names
> are unique forever.

But why does this need to be perfect? If someone renames a dataset and
some other dataset then takes the old name, fine - otherwise the
redirect would work :-)
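
A minimal sketch of the kind of redirector I mean, not real CKAN code. The
CURRENT and REVISIONS structures, the uuid-1/uuid-2 ids, the dataset names
and the resolve() helper are all hypothetical stand-ins for the dataset and
dataset revision tables:

```python
from typing import Optional, Tuple

# Hypothetical stand-ins for the real tables: current names keyed by
# dataset id, plus a history of (dataset_id, name, revision_number) rows.
CURRENT = {"uuid-1": "uk-spending", "uuid-2": "gold-prices"}
REVISIONS = [
    ("uuid-1", "spending", 1),
    ("uuid-1", "uk-spending", 2),
    ("uuid-2", "gold-prices", 1),
]


def resolve(name: str) -> Optional[Tuple[int, str]]:
    """Return (http_status, current_name) for a requested name, or None."""
    # A live dataset always wins, even if the name once belonged to
    # another dataset (the "some other dataset replaces it" case).
    if name in CURRENT.values():
        return (200, name)
    # Otherwise redirect to whichever dataset held the name most recently.
    matches = [(rev, ds) for ds, old, rev in REVISIONS if old == name]
    if not matches:
        return None
    _, dataset_id = max(matches)
    current = CURRENT.get(dataset_id)  # None if the dataset is gone
    return (301, current) if current else None
```

So an old link to "spending" would redirect to "uk-spending" until some
other dataset claims "spending", at which point the new owner is served.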

I'm not sure we are debating the same thing here. The dataset uuid can be
used for things that absolutely need to be permanent forever (e.g. RDF
URIs, permanent identifiers for syncing). But for other stuff it's not the
end of the world if something breaks (if that is rare and people are
warned of the risk).

>>
>> > changing *HARD* because the last thing we want to do is confront users
>> > with more choices than necessary.  They should not be forced to think
>>
>> Why shouldn't it be like github repos. You can change but you are
>> warned about problems. Pick a good name.
>
>
> If we cared that much about the name we would not slugify the title; we
> would force people to make good ones.  Github forces you to do this.

We used to do this. I have pushed several times for making dataset
name slugification better (removing articles, warning people if the name
is long ...). GitHub, btw, now has something similar to what we do.
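
Roughly what I have in mind, as a sketch only. The article list, the
40-character threshold and the slugify() helper are my assumptions here,
not what CKAN currently does:

```python
import re

ARTICLES = {"a", "an", "the"}  # assumed list, English only
MAX_LENGTH = 40  # assumed threshold for the "this is long" warning


def slugify(title: str):
    """Return (slug, warnings) for a dataset title."""
    # Keep only lowercase alphanumeric runs, then drop articles.
    words = re.findall(r"[a-z0-9]+", title.lower())
    words = [w for w in words if w not in ARTICLES]
    slug = "-".join(words)
    warnings = []
    if len(slug) > MAX_LENGTH:
        # Warn rather than truncate: the user still picks the name.
        warnings.append(
            f"name is {len(slug)} characters; consider a shorter one"
        )
    return slug, warnings
```

The point is that the user is nudged towards a good name up front rather
than forced to rename later.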

>> > through the consequences of their actions and read some blurb as to why
>> > it's bad if we can make it avoidable.
>
>> Understood. That said I frequently type in the names of familiar
>> datasets (but i may be unusual). That's never possible once we have
>> somewhat random id in there. But that's then a question of usage. I
>> think DataHub at least is more like GitHub (or Twitter) in that
>> regard: I care about this entity's name a lot (compared to say
>> StackOverflow where I always arrive via google or similar).
>>
> I think that the relevance of the name has much less consequence to us than
> github but more than stackoverflow.  I am happy to keep the ability to

OK, interesting. I don't see it that way so much.

> reference by name only in the url, but not give that out when systematically
> creating permanent links, like in this case.

To repeat: the disqus system will reference the permanent identifier
and the disqus_url is, IIRC, a convenience (used from recent comments
etc).

> This can include things like activity streams, social stuff, apis to update
> qa information, feeds, and anywhere we give out urls that we expect other
> services to use. We could use just uuids for these but some would also
> benefit from being less ugly.

There are two distinct discussions:

* Do we want uuids exposed for datasets (by default) in the UI. I'm
saying no, you're saying yes :-)

* Do we want uuids exposed for datasets elsewhere (e.g. in activity
streams, qa etc)? Probable agreement ...
  * I'm not quite sure what this means. Internally we ref the dataset
object. Hence we can always change the url link, at least in our system,
as this updates. For some things like RSS feeds given out to others
this is more problematic (and I'd be happy with uuid/{friendly-name}
there).
* All agree: If we have to use uuids we can make them less ugly (e.g.
by appending the title). I'm concerned about shortening the uuid, as that
risks collisions and you end up back where you started ...
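
To make the uuid/{friendly-name} idea concrete, a sketch under assumptions:
the /dataset/ prefix and both helper names are hypothetical. The name part
is decoration only, so a rename never breaks the link:

```python
import uuid


def dataset_url(dataset_id: str, name: str) -> str:
    """Build a link whose identity is the uuid; the name is cosmetic."""
    return f"/dataset/{dataset_id}/{name}"


def parse_dataset_url(path: str) -> str:
    """Return the uuid from a path, ignoring any trailing friendly name."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "dataset":
        raise ValueError(f"not a dataset path: {path}")
    # uuid.UUID raises ValueError if the segment is not a valid uuid.
    return str(uuid.UUID(parts[1]))
```

On shortening: keeping, say, only the first 8 hex characters leaves 32 bits,
so by the birthday bound a clash becomes more likely than not somewhere in
the tens of thousands of datasets.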

Rufus



