[openbiblio-dev] Best practice for name format in BibJSON

Mark MacGillivray mark at odaesa.com
Sat Feb 25 18:42:15 UTC 2012

On Sat, Feb 25, 2012 at 2:33 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:
> On 2/24/12 6:00 PM, Mark MacGillivray wrote:
>> Clarifying the "id" field in author object:
>> On Fri, Feb 24, 2012 at 6:05 PM, Jim Pitman <pitman at stat.berkeley.edu>
>> wrote:
>>>> Also, I notice that there is an "id:" element in the person name area. I
>>>> presume that this could be used for ORCID or researcherID, etc. Will the
>>>> type of ID be clear from the text? I don't know how people generally use
>>>> these, and if they are full URNs or URIs or not.
>> Karen - this is not quite right; the ID field in the author record
>> here is an ID for that record.
> This confuses me a bit. If this is an internal bibserver ID for the author
> record, how does the author record get that ID? If the ID is supplied by
> someone submitting data to bibjson, what guarantees that it is unique?

Internal IDs are separate from this ID. This ID - e.g. the key "id" in
an author object - would be some ID supplied by the person uploading
the data, if they desire. I think originally it was Jim that wanted to
do this. It should be a unique identifier for that author within the
collection, but this is not enforced when the data is uploaded.

>> It could be some other sort of ID in
>> another system, but this one is specifically for use in identifying
>> the record in question - so if you start to have multiple authors
>> appearing in different records but actually they are the same author,
>> you can assign them the same ID.
> Who is "you" in this case? And does this mean that you will have more than
> one author record with the same ID? If so, how do you know that they
> represent the same author rather than the same ID used more than once?

"you" in this case is whoever is uploading this dataset, and sticking
"id" keys in their author objects.

Having this "id" key in an author object provides an easy way
to upload, say, 10 records that represent articles, where each record has
an "author" key that points to the usual list of author objects; each
of those author objects could have a "name" and an "id". Then, along
with the 10 "type":"article" records, 5 "type":"author" records could
also be in there, and those 5 records could be matched on their "id"
keys against the "id" values in the author objects of the "article"
records.
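A rough sketch of that matching in Python, just to make the idea
concrete. The names, ids and titles are made up, and the lookup logic
is only my illustration of what an uploader or a client could do; it
is not a description of how bibserver handles this internally.

```python
# Hypothetical upload: two article records plus one standalone author record.
# All names, ids and titles here are invented for illustration.
records = [
    {"type": "article", "id": "a1", "title": "First paper",
     "author": [{"name": "J. Smith", "id": "smith-j"}]},
    {"type": "article", "id": "a2", "title": "Second paper",
     "author": [{"name": "Jane Smith", "id": "smith-j"},
                {"name": "A. Other"}]},  # "id" is optional on an author object
    {"type": "author", "id": "smith-j", "name": "Jane Smith"},
]

# Index the standalone author records by their user-supplied "id".
authors_by_id = {r["id"]: r for r in records if r.get("type") == "author"}

# For each known author id, collect the ids of the articles that mention it.
articles_by_author = {}
for rec in records:
    if rec.get("type") != "article":
        continue
    for author in rec.get("author", []):
        author_id = author.get("id")
        if author_id in authors_by_id:
            articles_by_author.setdefault(author_id, []).append(rec["id"])

print(articles_by_author)
```

Note that the two author objects spell the name differently; it is
the shared "id" that ties them to the same author record.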

So, that is the concept behind the "id" key in the author object;
records themselves also have "id" keys, which again are provided by
the end user and should be unique within the collection, but it is not
enforced. Actually, at the moment I think these are called "cid" which
was an evolution of "citekey", but current consensus is to move away
from "cid" and just have generic "id".
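Since uniqueness is not enforced on upload, anyone who cares about it
would have to check for themselves. A minimal sketch, assuming records
are plain dicts and the key is "id"; the function name duplicate_ids
and the sample ids are hypothetical.

```python
from collections import Counter

def duplicate_ids(records):
    """Return the user-supplied "id" values that occur on more than one record."""
    counts = Counter(r["id"] for r in records if "id" in r)
    return sorted(i for i, n in counts.items() if n > 1)

# Invented sample collection.
collection = [
    {"type": "article", "id": "smith2012"},
    {"type": "article", "id": "jones2011"},
    {"type": "article", "id": "smith2012"},   # clash, but nothing rejects it
    {"type": "article"},                      # no "id" supplied at all
]

print(duplicate_ids(collection))
```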

Internally, everything is assigned a UUID which, at the moment, is
stored in a record under the "id" key - but as above, if the
user-provided IDs are going to be under the "id" key, then the UUIDs
we attach will probably go under the "_id" key - where "_" prefixed
keys are being used to represent internal information.
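To illustrate the "id" versus "_id" split, here is a small sketch.
The function attach_internal_id is hypothetical, and using a random
(version 4) UUID is my assumption; nothing above fixes which kind of
UUID is used.

```python
import uuid

def attach_internal_id(record):
    """Return a copy of the record with an internal UUID under "_id"."""
    stored = dict(record)              # leave the uploaded record untouched
    stored["_id"] = uuid.uuid4().hex   # "_" prefix marks internal information
    return stored

# Invented sample record; "id" is whatever the uploader chose.
uploaded = {"type": "article", "id": "smith2012", "title": "An example"}
stored = attach_internal_id(uploaded)

print(stored["id"])    # the user-provided id is kept as-is
print(stored["_id"])   # a different value on every run
```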

The distinction between the "id" key and the "identifier" key is that
"id" is some ID for a record as provided by the end user, whereas the
"identifier" key points to the list of identifier objects that are
also relevant to whatever object they appear in - such as
[{"id":"...","type":"doi","url":"...","description":"this is a DOI. it is useful"},...]. Of course, one
of the "identifier" IDs may well be appropriated by the end user as
their ID for their objects, which is fine (but not enforced).
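Putting "id" and "identifier" side by side in one record may help.
This is only a sketch: the DOI, the URL, the ORCID-style value and the
"orcid" type string are all invented placeholders, and the helper
identifiers_of_type is hypothetical.

```python
# Invented record showing the user-chosen "id" next to the "identifier" list.
record = {
    "type": "article",
    "id": "smith2012",                      # chosen by the uploader
    "identifier": [
        {"id": "10.1234/example", "type": "doi",
         "url": "http://example.org/10.1234/example",
         "description": "this is a DOI. it is useful"},
    ],
    "author": [
        {"name": "Jane Smith", "id": "smith-j",
         # identifiers the author has on other systems
         "identifier": [{"id": "0000-0000-0000-0000", "type": "orcid"}]},
    ],
}

def identifiers_of_type(obj, id_type):
    """Return the "id" values of obj's identifier objects with the given type."""
    return [i["id"] for i in obj.get("identifier", []) if i.get("type") == id_type]

print(identifiers_of_type(record, "doi"))
print(identifiers_of_type(record["author"][0], "orcid"))
```

The same helper works on the record and on an author object, because
"identifier" has the same shape wherever it appears.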


> kc
>> Identifiers from external systems are
>> handled differently - described below.
>>> Brings us to the basic issue of regularizing ids for whatever entities:
>>> I think this needs a separate discussion and some collab doc space to
>>> develop
>>> best practices for ids in BibJSON. I'd be glad to start a Google Doc on
>>> this. Any volunteers to work on it?
>> The identifiers in bibjson records are already described - on
>> bibjson.org there is a bit about the "identifier" key. This key can be
>> used within the top level of a record - e.g. listing identifiers about
>> whatever the record is about - or it can be used within an author
>> record (or anywhere else) - so we can list in here the various
>> identifiers that an author may have on other systems. "identifier" is
>> a list of objects, and the fields so far described at bibjson.org
>> include "id", "type" and "url" - though of course, you can use others
>> that you find useful. But by providing the ID, the type (which is
>> obvious in some cases e.g. DOI and in others can be made up and
>> consensus converged upon), and a relevant URL if there is one, we can
>> make it clear what the identifier is for. Other info about it could
>> easily be added in a "description" key.
>> Mark
>>> --Jim
>>> ----------------------------------------------
>>> Jim Pitman
>>> Professor of Statistics and Mathematics
>>> University of California
>>> 367 Evans Hall # 3860
>>> Berkeley, CA 94720-3860
>>> ph: 510-642-9970  fax: 510-642-7892
>>> e-mail: pitman at stat.berkeley.edu
>>> URL: http://www.stat.berkeley.edu/users/pitman
>>> _______________________________________________
>>> openbiblio-dev mailing list
>>> openbiblio-dev at lists.okfn.org
>>> http://lists.okfn.org/mailman/listinfo/openbiblio-dev
> --
> Karen Coyle
> kcoyle at kcoyle.net http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
