[open-bibliography] Survey on name identifier systems
Thomas Krichel
krichel at openlib.org
Fri Sep 2 08:06:20 UTC 2011
I was given a survey on name identifier systems. Here are my
answers for AuthorClaim. I am trying to press the case for open
data, but I may not be putting it as effectively as it could
be put. Comments?
----------------------------------------------------------------
1. What was the motivation for developing the identifier system?
AuthorClaim is a part of building openly available bibliographic
systems. I have considerable experience with such systems because I
was the principal founder of RePEc. Within RePEc, the RePEc Author
System (RAS) plays an important role. It allows researchers to
claim their papers. I created RAS in 1999. It was the first author
claiming system.
2. Which organisation(s) is (are) responsible for the identifier
system?
Ultimately the Open Library Society is. I am building a small
group of individuals looking after the day-to-day management.
It's part of a broader effort for open bibliographic data.
3. What is the scope of your identifier system, in terms of the type of
people it covers? (For example, does it include: book authors, active
current researchers, formerly active researchers, doctoral students, masters
students etc.)
Anybody can come and register claims to the bibliographic data.
If they have no items to claim, they may find the registration
not very interesting. We aim to get as much bibliographic data as
possible, to appeal to a broad group of authors.
4. How is your system populated with data? (by researchers
themselves/their institutions/funding bodies)
This is an author claiming system. Authors make claims. But their
records are dwarfed by the amount of bibliographic records.
There are more than 35 million bibliographic records. Over 100
million authorships are up for claim.
5. Who is authorised to make changes to the information in the system?
Authors make changes to their records and bibliographic datasets
contribute bibliographic records.
6. How are identifiers assigned?
When a registrant registers, a new identifier is created.
7. What form does the identifier take?
There are two identifiers. The identifier of the person is only used
internally. It is a long identifier that combines a date in the
person's life with an ASCII-only name expression. Externally, we
issue a short identifier that starts with p, has two letters and
then a number that is incremented. This short identifier is not,
strictly speaking, an identifier for the person but the identifier
of the record that describes the person in the AuthorClaim system.
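As a rough illustration of the short identifier form described above,
here is a minimal Python sketch. It is not AuthorClaim's actual code.
It assumes lowercase letters, a number without leading zeros, and a
counter kept separately for each two-letter pair; none of these details
are stated in the answer. The helper names are hypothetical.

```python
import re

# Assumed shape of the short identifier: "p", two lowercase letters,
# then a positive integer without leading zeros (for example "pkr1").
SHORT_ID = re.compile(r"p([a-z]{2})([1-9][0-9]*)")


def is_short_id(text):
    """Return True if text has the assumed short identifier shape."""
    return SHORT_ID.fullmatch(text) is not None


def next_short_id(letters, existing):
    """Return the next free identifier for a two-letter pair.

    Hypothetical helper: assumes the number is incremented per letter
    pair, which the description above does not confirm.
    """
    used = [
        int(match.group(2))
        for match in map(SHORT_ID.fullmatch, existing)
        if match and match.group(1) == letters
    ]
    return "p{}{}".format(letters, max(used, default=0) + 1)
```

For instance, under these assumptions, a new registrant with the
letters "kr" would get the identifier one above the highest existing
"pkr" number.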
8. What information is maintained in the system? (e.g. names,
alternative forms of names, email addresses, dates of birth,
institutional affiliation(s), details of publications, details of
grants received/applied for) Are any standard metadata schemes
supported?
We make names, name variations, dates (but not of birth),
affiliations, publications accepted and refused publicly
available. Email addresses are made publicly available if the
registrant agrees explicitly. Passwords are not made available.
The bulk of the information in the system is bibliographic.
9. With which other systems (if any) does your identifier system
interact?
We work with bibliographic data providers. We are building a profiling
site, http://authorprofile.org (still in its infancy), and the
data is used in ARIW.org. In fact, there is a complicated
interaction between ARIW and AuthorClaim.
10. Is the information in the system made available to other services?
Yes. And there are some other services using it, but these are
close relatives of AuthorClaim. Reaching further is still a
challenge.
11. Is there a licence on the data? If so, what is the licence?
CC0. Note that the profile data wraps bibliographic data, but this
bibliographic data is reduced to elements commonly thought to
be in the public domain.
12. If yes, how is this achieved (what interfaces/protocols are used)
and is the system free to access?
There is an ftp site for the AuthorClaim data at ftp://authorclaim.org
13. How is the system funded?
The underlying software was developed with a grant from the Open
Society Institute.
There are no stable funding sources for open bibliographic data. The
system has been run by volunteers since 2007. We have a very good track
record of keeping the system running. RePEc has run like that since the
early 90s. You can't make this type of service dependent on external
funding. And I am sure that once AuthorClaim becomes more widely adopted
we will find volunteers to deal with some mundane tasks and funds to
bring in automation that will further reduce these tasks.
14. Is the system still under active development? If so, what are your
priorities for future enhancements?
Yes, there will be some further work on the system, to make it run
more smoothly, with more automation and less input from humans. Research
will be conducted into how to optimize the guesses made by computer
learning. There are funds for the development of ACIS
and they will be used on adding a host of smaller things, most
importantly for us, the integration with OpenID.
15. Do you have any plans for integrating your system with external
initiatives/services such as ORCID, ISNI, Mendeley, Zotero, Academia.edu?
and ResearchGate, and PeerEvaluation and ...
It's a crowded field. AuthorClaim is an initiative that is open.
Many of the commercial initiatives collect data but they don't handle
the data further. With ORCID, it is not clear to me where the system
will be heading, and I am speaking from experience because I am
on their technical committee. I see ISNI as a parallel thing to us.
I have not seen Mendeley and Academia.edu offering lists of papers
deposited, but I should write to Richard to ask. I talked to
the Mendeley CEO a while ago.
------------------------------------------------------------------
Cheers,
Thomas Krichel http://openlib.org/home/krichel
http://authorprofile.org/pkr1
skype: thomaskrichel