[Open-companies] OGDCamp Organisational Identifiers Workshop

Tim Davies tim at practicalparticipation.co.uk
Wed Oct 26 12:07:48 UTC 2011

Dear all,

Thank you for taking part in, or expressing an interest in, the
Organisational Identifiers Satellite workshop of Open Government Data Camp
in Warsaw on Sunday.

Summary notes from the session are now available at:
http://wiki.okfn.org/OGDCamp_2011_Organizational_Identifiers_Workshop and
are pasted below.

You will also find here links to the Etherpads where notes were taken during
the session.

Participants are welcome to amend the notes on the Wiki if they do not
accurately reflect discussions.

To take forward some of the next steps, it was proposed that in the short
term we might use the Open Companies mailing list at
open-companies at lists.okfn.org, although recognising that the
Organisational Identifiers discussion goes beyond simply identifying
companies.

All best wishes



Many projects have a need for re-usable organisational identifiers which can
be used to map together data about organisations from different sources, and
to consistently identify an organisation within a dataset. This workshop, a
satellite event of the 2011 Open Knowledge Foundation Open Government Data
Camp, explored different existing efforts in the organisational identifier
space, and identified a number of key principles and proposals for action to
develop shared approaches, standards and infrastructures for organisational
identifier schemes.

Participants: Tim Davies (Practical Participation / Aid Info, Facilitator),
Chris Taggart (Open Corporates), Ramine Tianati (Southampton University),
Rolf Kleef (Open for Change), Alvaro Graves (Tetherless World Constellation /
LOGD Project), John Wonderlich (Sunlight Foundation), Kaitlin Lee (Sunlight
Foundation), Rufus Pollock (Open Knowledge Foundation), Friedrich Lindenberg
(Open Knowledge Foundation / Open Spending), Ruth Del Campo (New York Law
School), elf Pavlic, Derota, and by Skype for the first session: Bill
Anderson (Development Initiatives / IATI), Dinesh Venkateswaran (Techsoup
Global), John Hecklinger (Global Giving), James Robertson (Alterseed).

Key principles

The workshop identified a number of key principles for developing shared
standards around organisational identifiers:

   - *Use existing identifiers whenever they are available*
   - *New identifiers should only be created as a last resort*. The
   standards should build on existing IDs issued to organisations. With the
   existing ID of an organisation it should be possible to work out its ID
   under the shared standard.
   - *Develop mapping and resolution services to connect IDs*, rather than
   proposing adoption of unique IDs. To address the challenge of two
   identifiers picking out the same organisation we will propose approaches to
   map the relationship of identifiers, and resolve one identifier to another.
   - *Focus on simple solutions*: Look for the minimal viable solution that
   will scale in future.
   - *Use distributed approaches wherever possible*: Avoiding the
   introduction of centralized identifiers.

What is needed from an organizational identifier: use cases

There are many different use-cases for organisational identifiers, with
overlapping but different sets of requirements. Use cases discussed in the
workshop included:

(1) *Definitively identifying legal entities*

Identifiers should relate directly to the instruments that bring an entity
into being: i.e. company registration numbers. It would be helpful to
record the relationships between entities, and capture details of their
change over time (e.g. ‘X is member of group Y’, ‘X is owned by Y’, ‘X
merged with Y’).

(2) *Identifying conceptual entities*

Although we might talk about ‘Microsoft’, there is no single legal entity
which is ‘Microsoft’. Finding ways to relate identifiers to commonplace
conceptual entities is useful in a number of cases. Answering the question
‘What is a company?’, or ‘What is a charity?’ turns out to be fairly complex
when you are working across borders.

(3) *Identifying national, international and supranational organisations*

Some schemes only need to identify organisations within a specific
jurisdiction, others need to identify organisations across borders, and even
to identify international institutions which have no direct country-level
registration or identifiers.

(4) *Identifying organisations of a particular status*

For example, a scheme may only need to cover charities. The nature of a
charity varies between jurisdictions. In some, an association or company may
exist as an informal or legal entity prior to registering as a charity (so
charity is a status of an existing organisation); in others, charity
registration may create a new organisation.

(5) *Using legacy identifiers*

A system may have an internal set of identifiers which are not mapped to
shared organisational identifiers, but which the existing system can already
expose. It would be useful to find ways to map these onto existing shared
identifiers.

(6) *Providing identifiers where none exist, or where the scope of existing
identifiers is inappropriate*

Some organisations which need to be identified do not have an existing
identifier. For example:

   - non-constituted associations in the UK
   - charities in some countries where no registration scheme is available

Sometimes the scope of existing organisational identifiers does not match
the scope required. For example, an organisation's identifier may not
clearly communicate the organisation's status (e.g. charity) because of
limitations in the registration systems in operation in that organisation's
country.

Existing Schemes

A number of existing proposals or schemes for organisational identifiers
were discussed.

IATI Organisation Standard

The draft Organisational Standard of the International Aid Transparency
Initiative is currently based on either using an organisational ID from the
OECD Development Assistance Committee Code List (which covers a number of
international organisations not otherwise registered, and a number of donor
government departments), or an identifier of the form:

IDENTIFICATION SCHEME is a code to identify the registration scheme in use,
agreed with the IATI Secretariat or Technical Advisory Group, and ID is the
identifier from that scheme. So, for example, the US-based William and Flora
Hewlett Foundation can be identified by:


Where EIN is a unique US identification scheme used by charities and
companies. The UK charity Development Initiatives can be identified by:


No mechanism is put forward in the draft IATI Organisation Identifier
Standard for resolving identifiers to information about the given
organisation, or for deciding which identifier to prefer when an
organisation has more than one.

The IATI Organisation Identifier is also used as the basis for IATI Activity
Identifiers, which take the form ORGANISATION ID-ACTIVITY ID. For example,
an activity of Development Initiatives could have the ID:

Open Corporates URIs

Open Corporates is compiling a database of corporate legal entities across
jurisdictions, drawing on publicly available company data, either as open
data, or scraped from web sites.

Open Corporates exposes data under http://opencorporates.com/companies/,
using URLs in which the first path segment picks out a country, or a
state/district if company registration in a particular country is handled at
the sub-national level, and the second is the company's registration ID. For
example, a US company registered in the District of Columbia with the
registration ID L10053 will have the URL:

http://opencorporates.com/companies/us_dc/L10053

This URL returns human readable data about the company. Appending .xml,
.json or .rdf to this URL will return machine readable data. Open Corporates
is specifically concerned with providing identifiers for companies.
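
As a minimal sketch of the URL pattern described above, the following Python
builds the human readable and machine readable URLs for a company. The
function and constant names are illustrative assumptions, not part of any
Open Corporates client library; only the us_dc/L10053 example and the three
suffixes come from the notes.

```python
from typing import Optional

# Base path taken from the notes; everything else here is illustrative.
BASE = "http://opencorporates.com/companies"
MACHINE_FORMATS = {"xml", "json", "rdf"}


def company_url(jurisdiction: str, registration_id: str,
                fmt: Optional[str] = None) -> str:
    """Build a company URL from a jurisdiction code (e.g. 'us_dc') and the
    registration ID issued in that jurisdiction."""
    url = f"{BASE}/{jurisdiction}/{registration_id}"
    if fmt is None:
        return url  # human readable page
    if fmt not in MACHINE_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    return f"{url}.{fmt}"  # machine readable representation


print(company_url("us_dc", "L10053"))
print(company_url("us_dc", "L10053", "json"))
```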

Identity Hub

The Linking Open Government Data (LOGD) project at the Tetherless World
Constellation has put forward a series of design principles for URIs for US
Linked Government Data based on the URI template:

 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN )

Allowing, for example, URIs of the form:

BASE can be replaced with any service which can resolve the required URI and
provide data about it.
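
The template can be read as a simple path-building rule. The sketch below
assumes the trailing TOKEN group may occur zero or more times; the notes do
not say how many times it may repeat. The base, organisation and category
values in the usage line are hypothetical placeholders, not real LOGD URIs.

```python
from typing import Sequence


def logd_uri(base: str, org: str, category: str,
             tokens: Sequence[str] = ()) -> str:
    """Fill in 'http://' BASE '/' 'id' '/' ORG '/' CATEGORY ( '/' TOKEN ).

    Assumption: zero or more TOKEN segments are allowed.
    """
    segments = [base, "id", org, category, *tokens]
    return "http://" + "/".join(segments)


# Hypothetical values, for illustration only.
print(logd_uri("resolver.example", "some-agency", "dataset", ["42"]))
```

Because BASE is just the first segment, swapping in a different resolving
service changes only that argument.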

Global Giving Collaboration

Global Giving and other partners are working on the development of an
identifier scheme particularly for use case 6: providing identifiers where
none exist, or the scope of existing identifiers is inappropriate.

This is likely to involve facilitating registration of new identifiers for
some organisations.

Wider Initiatives

There are a number of other actors, initiatives and other ongoing projects
in the organisational identifier space. The workshop identified the
following:

   - ORGPedia is focussed on identifying US companies and relating their
   different identifiers.
   - The European Union has announced plans to work on open and
   interoperable data from company registers.
   - Dun & Bradstreet provide the DUNS number to organisations that
   register. DUNS number data is proprietary. DUNS numbers can refer to a legal
   entity, divisions of that entity and individual branches.
   - Bloomberg Number created by Bloomberg.


Working Groups

The afternoon of the workshop involved focussed work on three topics:

   - *The architecture of an organisation standard* - Identifying key
   components of an identifier standard, and mechanisms for resolving
   identifiers to information and data on the entity identified.
   - *Common terms and descriptions of organisation relationships*
   - *Identifying public bodies* - Public bodies tend not to be registered
   like companies or charities are. We need a scheme to identify public bodies.


The architecture working group identified a possible architecture, building
upon the IATI Identifier model but suggesting:

Allowing for multiple namespaces

Such that an identifier could be of the form:


Where US is the country (top-level namespace), NY is a second-level
namespace identifying the state, DMV is a third-level namespace identifying
the registration or identity scheme in use, AA identifies a set of categories
within this identification scheme, and 12345 is the relevant identifier.

(Note: following reflection, it might be more appropriate to reverse the
namespace so that identification scheme type is the top-level category, e.g.
COH-GB-12345 for a UK company, as this makes it easier to declare and use
resolution services.)
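
A minimal sketch of how such hierarchical identifiers might be built and
taken apart, assuming "-" as the separator and assuming the final component
is always the ID, with everything before it a namespace level. The class and
function names are illustrative, and the US/NY/DMV/AA/12345 components are
the ones named in the description above.

```python
from typing import NamedTuple, Sequence, Tuple


class OrgIdentifier(NamedTuple):
    namespaces: Tuple[str, ...]  # e.g. ("GB", "COH"), outermost first
    local_id: str                # the ID issued within the innermost namespace


def build_identifier(namespaces: Sequence[str], local_id: str,
                     sep: str = "-") -> str:
    """Join namespace levels and the local ID into a single identifier."""
    return sep.join([*namespaces, local_id])


def parse_identifier(identifier: str, sep: str = "-") -> OrgIdentifier:
    """Split an identifier, treating the last component as the local ID.

    Assumption: no component contains the separator itself.
    """
    parts = identifier.split(sep)
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"not a namespaced identifier: {identifier!r}")
    return OrgIdentifier(tuple(parts[:-1]), parts[-1])


print(build_identifier(["US", "NY", "DMV", "AA"], "12345"))
print(parse_identifier("GB-COH-06368740"))
```

The reversed ordering suggested in the note (scheme type first) would only
change the order in which namespace levels are passed in.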

Providing a light-weight ‘authority list’ of namespaces

Namespaces would generally be hierarchical under countries, but a number of
top-level namespaces would be provided, including OECD-DAC- and other
relevant general identification schemes. A central point is needed to
provide an authoritative list of namespaces.

In the medium term the authority list will need some governance structure,
with a process for agreeing which namespaces are added, and registering
resolution services against namespaces. This might consist of a small
virtual committee of interested parties, working via a consensus-based
e-mail list to respond to proposals for new namespaces.

Whilst the authority list could follow a DNS model and delegate control over
a set of top-level namespaces to sub-authorities, this was deemed too
complicated for initial implementation.

In the short term, a simple file will suffice, listing:

   - Namespace (e.g. GB, or GB-COH)
   - Identifier Type - are identifiers in this namespace ‘registration IDs’
   (e.g. company numbers, and as such authoritative identifiers of legal
   entities); or are these identifiers of another type (a minimal list of types
   would need to be identified)
   - Resolution services - a list of URI bases to which the ID portion of
   the identifier could be appended to fetch data about this organisation.
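
As a rough illustration, such a file could be a three-column CSV. The
column names, the "|" delimiter between multiple resolution services, the
type labels and the resolver URLs below are all assumptions made for this
sketch; the workshop did not fix a file layout.

```python
import csv
import io
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class NamespaceEntry:
    namespace: str                  # e.g. "GB" or "GB-COH"
    identifier_type: str            # e.g. "registration_id" (assumed label)
    resolution_services: List[str]  # URI bases, possibly empty


# Hypothetical authority list; the resolver URLs are placeholders.
SAMPLE_CSV = """namespace,identifier_type,resolution_services
GB-COH,registration_id,http://resolver-a.example/gb-coh/|http://resolver-b.example/gb/
GB,other,
"""


def load_authority_list(text: str) -> Dict[str, NamespaceEntry]:
    """Parse the CSV text into a dict keyed by namespace."""
    entries: Dict[str, NamespaceEntry] = {}
    for row in csv.DictReader(io.StringIO(text)):
        services = [s for s in (row["resolution_services"] or "").split("|") if s]
        entries[row["namespace"]] = NamespaceEntry(
            row["namespace"], row["identifier_type"], services)
    return entries


authority_list = load_authority_list(SAMPLE_CSV)
print(authority_list["GB-COH"].resolution_services)
```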


Providing a resolution service standard

Anyone should be able to declare a service to resolve identifiers in a
particular namespace.

For example, Open Corporates may declare that it will resolve any
identifiers in company namespaces, and provide a base URI of

An application that has the ID GB-COH-06368740 could then look up in the
authority list that Open Corporates provides a resolver, and could append
the ID to the opencorporates.com base URI to fetch back data on the
organisation in question.
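
The lookup step might be sketched as follows. The authority list is
represented here as a plain dict from namespace to URI bases, and the
resolver base URI is a hypothetical placeholder rather than a real Open
Corporates endpoint. The sketch picks the longest registered namespace that
prefixes the identifier, which is one possible matching rule, not one the
workshop agreed.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical authority list: namespace -> resolver URI bases.
AUTHORITY_LIST: Dict[str, List[str]] = {
    "GB-COH": ["http://resolver.example/gb-coh/"],
}


def split_namespace(identifier: str,
                    authority_list: Dict[str, List[str]]
                    ) -> Optional[Tuple[str, str]]:
    """Return (namespace, ID portion) using the longest registered
    namespace that prefixes the identifier, or None if none matches."""
    best: Optional[str] = None
    for namespace in authority_list:
        if identifier.startswith(namespace + "-"):
            if best is None or len(namespace) > len(best):
                best = namespace
    if best is None:
        return None
    return best, identifier[len(best) + 1:]


def resolution_urls(identifier: str,
                    authority_list: Dict[str, List[str]] = AUTHORITY_LIST
                    ) -> List[str]:
    """Append the ID portion to every resolver base for its namespace."""
    match = split_namespace(identifier, authority_list)
    if match is None:
        return []
    namespace, local_id = match
    return [base + local_id for base in authority_list[namespace]]


print(resolution_urls("GB-COH-06368740"))
```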

Resolution services should return a standard set of data, including,
wherever possible, details of related organisations.

The resolution standard should include provision of an ‘at_time’ parameter,
so that if a resolution service is able to provide data for past periods
this can be requested. For example, a consuming application may have data
recording a transaction with a company in 2005. If it requests data on that
company from a resolution service with (for example) an ?at_time=2005-01-01
parameter, then the service should return the details of the company as of
that time (if known).

With access to the authority list, and an existing organisational identifier
from one of the namespaces registered in the authority list, anyone should
be able to construct a standardised organisational identifier. If a
resolution service is available for the namespace in question it should be
possible to look up details on that organisation, which will hopefully
include relationships of this identifier to other relevant identifiers.
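
A small sketch of how a consuming application might add the ‘at_time’
parameter to a resolution request. The resolver base URI is a hypothetical
placeholder, and using an ISO date as the value is an assumption based on
the 2005-01-01 example above.

```python
from datetime import date
from typing import Optional
from urllib.parse import urlencode


def resolution_request_url(base_uri: str, local_id: str,
                           at_time: Optional[date] = None) -> str:
    """Build a resolver URL, optionally asking for data as of a past date."""
    url = base_uri + local_id
    if at_time is not None:
        # ISO 8601 date, matching the ?at_time=2005-01-01 example.
        url += "?" + urlencode({"at_time": at_time.isoformat()})
    return url


# Hypothetical resolver base URI, for illustration only.
print(resolution_request_url("http://resolver.example/gb-coh/", "06368740",
                             date(2005, 1, 1)))
```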

Other points from the group included

   - The need for a governance structure for the authority list as an
   ongoing role, and a one-off requirement for work to agree the standards for
   interchange of data.
   - Considering / rather than - as the separator, and providing a standard
   way to escape the separator. There was no conclusive view on this. Whichever
   is adopted, some method for escaping - or / in an identifier is required
   (e.g. // = /).
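
One way the doubling rule (// = /) could work is sketched below, for either
separator. This is an assumption about how the escape might be specified,
not an agreed design. Doubling alone is ambiguous when a component is empty
or begins with the separator, so this sketch rejects such components when
joining.

```python
from typing import List


def join_identifier(parts: List[str], sep: str = "-") -> str:
    """Join components, doubling any separator found inside a component."""
    for part in parts:
        if not part or part.startswith(sep):
            raise ValueError(
                "components must be non-empty and must not start with the separator")
    return sep.join(part.replace(sep, sep * 2) for part in parts)


def split_identifier(identifier: str, sep: str = "-") -> List[str]:
    """Split on single separators; a doubled separator is a literal one."""
    parts: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(identifier):
        ch = identifier[i]
        if ch == sep:
            if identifier[i + 1:i + 2] == sep:
                current.append(sep)  # escaped separator
                i += 2
                continue
            parts.append("".join(current))  # component boundary
            current = []
        else:
            current.append(ch)
        i += 1
    parts.append("".join(current))
    return parts


encoded = join_identifier(["GB", "COH", "12-34"])
print(encoded)
print(split_identifier(encoded))
```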

Common terms

The common terms group worked on a preliminary typology of relationships
between organisations that could provide the basis for some standard sets of
information that resolution services should attempt to provide when they
can.

Preliminary Typology of Relations

   - "Persistent relations"
      - "Organisational"
         - is member of (association/group/cabal)
            - is affiliated to
         - is organisational unit (department, etc) of
         - is shareholder of
            - is owner of (special case of above? wholly owner of)
      - "Contractual"
         - has contract with
            - owes money to (long-term debt)
            - is supplier to
            - licenses to
         - takes legal action against
         - donates to

Note: "supplier to" and "donates to" indicate that there are multiple
transactions, whether or not these are available as separate facts.

   - "Temporal relations"
      - Split into
      - Spin-off
      - Merger
      - Acquisition

Temporal relationships also need to be captured:

SPLIT INTO: A split into B, C, ... (A ceases to exist, and B, C, ... start to
exist)

SPIN-OFF: A created spin-off B (A continues to exist, B starts to exist)

MERGER: A, B, ... merged into C (A, B, ... cease to exist, C starts to exist)

ACQUISITION: A acquires B (and moves B's assets into A; B ceases to exist)
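
The four temporal relations differ only in which organisations cease to
exist and which start to exist, which the following sketch makes explicit.
The class and function names are illustrative assumptions, not an agreed
vocabulary.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Change:
    ceases: FrozenSet[str]  # organisations that stop existing
    starts: FrozenSet[str]  # organisations that come into being


def split_into(a: str, *parts: str) -> Change:
    return Change(frozenset({a}), frozenset(parts))      # A ceases; B, C, ... start


def spin_off(a: str, b: str) -> Change:
    return Change(frozenset(), frozenset({b}))           # A continues; B starts


def merger(result: str, *sources: str) -> Change:
    return Change(frozenset(sources), frozenset({result}))  # A, B, ... cease; C starts


def acquisition(a: str, b: str) -> Change:
    return Change(frozenset({b}), frozenset())           # A continues; B ceases


def apply_change(existing: FrozenSet[str], change: Change) -> FrozenSet[str]:
    """Return the set of organisations that exist after the change."""
    return (existing - change.ceases) | change.starts


orgs = frozenset({"A", "B"})
print(sorted(apply_change(orgs, merger("C", "A", "B"))))
```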

Identifying Public Bodies

Schemes like the OECD DAC Code List only include a small selection of public
bodies, and tend to only include public bodies involved in aid donation.
There are few definitive national lists of public bodies. Finding a clear
definition of what constitutes a public body is also complicated. A public
body could be defined as:

   - *A body that is defined as a public body by law* - although the legal
   definition of many public bodies is scattered across legal instruments and
   no clear lists exist in most jurisdictions. It then becomes important to
   look for other sources of relevant lists.
   - *An institution in receipt of public budgets* - but this may include
   private-public partnerships etc.
   - *An institution subject to Freedom of Information laws* - this allows a
   public bodies list to draw on work done by Freedom of Information portals
   such as WhatDoTheyKnow.com, but (a) is limited to countries with active FOI
   laws and campaigns who have compiled relevant lists; and (b) may not cover
   all relevant public bodies depending on the scope of particular FOI laws.
   - *Has a government website*
   - *The subject of a COFOG classification* - COFOG is the UN standard
   Classification of the Functions of Government.

Public bodies are also liable to change over time: as departments are
merged, renamed and restructured, and administrative boundaries reshaped.
Identifying when public bodies should get new IDs, or when their old IDs
should be retained, requires careful attention.

Two possible proposals have been put forward for developing identifier sets
for public bodies:

   - *Using COFOG to pick out functions of government at particular levels
   of administrative geography*. For example, the code GB-COFOG-NN could be
   used to pick out the development function of the national UK government
   (COFOG code NN), which could be resolved to a particular departmental
   identifier if one were available. GB-OXF-COFOG-NN could be used to pick out
   the education department of Oxford County Council (where OXF is a code for
   Oxford County).
   - *Providing identifiers at publicbodies.org* - Building a list of
   public bodies on a country-by-country basis, drawing on the best available
   lists in any country.
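
The COFOG-based proposal amounts to a simple composition rule, sketched
below. NN is the notes' own placeholder for a COFOG code, and the function
name and optional area argument are illustrative assumptions.

```python
from typing import Optional


def cofog_identifier(country: str, cofog_code: str,
                     area: Optional[str] = None) -> str:
    """Compose COUNTRY[-AREA]-COFOG-CODE for a function of government.

    `area` is an optional sub-national code such as OXF.
    """
    parts = [country]
    if area:
        parts.append(area)
    parts += ["COFOG", cofog_code]
    return "-".join(parts)


print(cofog_identifier("GB", "NN"))          # national level
print(cofog_identifier("GB", "NN", "OXF"))   # sub-national level
```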

Neither of these proposals fully resolves the problem of identifying public
bodies and further work may be needed on this.

Next Steps

There are a number of next steps from the workshop:

   - *Continue collaboration and dialogue to create a draft Organisational
   ID standard and key terms for data exchange* - This could take place
   jointly between the OKFN ‘Open Companies’ and ‘Open Development’ working
   groups.
      - *Consultation with the IATI Secretariat and Technical Advisory Group
      on the feasibility of adjustments to the IATI standard is required*.
      - A timetable for a draft should be set.
   - *Creating working demonstrations of an authority list and resolution
   service*
      - Using minimal technologies such as Google Documents to create an
      initial authority list which could be consumed as CSV or XML.
      - Develop a demonstration of resolving organisational IDs via a
      resolution service. Resolution services could/should build on the Google
      Refine API. This task needs to be adopted by someone.
   - *Draft Proposal and Terms of Reference for an Organisational
   Identifiers Governance Group*
      - Create and circulate a proposal for a governance group to oversee the
      standard and maintain the authority list.
      - Invite key partners to participate in establishing the group.