[openbiblio-dev] [pd-discuss] Bibliographic Metadata Guide

Karen Coyle kcoyle at kcoyle.net
Mon Oct 10 18:03:49 UTC 2011


Attached is a fairly short text file with a start at a set of goals  
and data elements, plus a number of questions (there are always  
questions :-)). This is not supposed to be complete -- it's a starting  
point for discussion. Note that we haven't mentioned any particular  
metadata schema or serialization -- yet. I'm thinking that we should  
consider the needs first, and then look for possible solutions.

Primavera, the main question for you is: does this capture the general  
metadata goals for your project? Is this close to what you were  
thinking?

I'm also hoping that we can tackle this incrementally, perhaps dealing  
with the primary data elements first, then adding enhancements as we  
find folks who have that data to provide.

kc

Quoting Primavera De Filippi <primavera.defilippi at okfn.org>:

> Hi all,
> just a quick reminder that we'll be having a Bibliographic Metadata
> Guide skype call today at 16:00 GMT + 1
>
> If you are interested and available for the call, please put your
> skype id on the following pad: http://openbiblio.okfnpad.org/bibguide
>
> Thanks !
>
>
>
> On Mon, Oct 3, 2011 at 5:33 PM, Karen Coyle <kcoyle at kcoyle.net> wrote:
>> For those of us on the West Coast, that is 7 a.m. I'm used to that, but
>> can't speak for others.
>>
>> kc
>>
>> Quoting Primavera De Filippi <primavera.defilippi at okfn.org>:
>>
>>> ok, what about 15:00 GMT then?
>>> I hope this suits the Americans too!
>>>
>>>
>>>
>>> On Mon, Oct 3, 2011 at 2:49 PM, William Waites <ww at styx.org> wrote:
>>>>
>>>> Oh GMT. This time of year we are GMT +1, I believe. Which means that's too
>>>> late... I can do afternoon ... But not after "standard" business hours...
>>>>
>>>> Mark MacGillivray <mark at odaesa.com> wrote:
>>>>
>>>>> Hi Primavera,
>>>>>
>>>>> I can make that time.
>>>>>
>>>>> On Mon, Oct 3, 2011 at 1:05 PM, Primavera De Filippi
>>>>> <primavera.defilippi at okfn.org> wrote:
>>>>>>
>>>>>> Hi William,
>>>>>> we would actually love you to attend, so maybe let's try and do it on
>>>>>> Friday 7th October, 16:00 GMT ?
>>>>>> Everyone, please confirm your availability, thanks !
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 28, 2011 at 8:11 PM, William Waites <ww at styx.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> "pdp" == Primavera De Filippi <primavera.defilippi at okfn.org>
>>>>>>>>>>>> writes:
>>>>>>>
>>>>>>>    pdp> Ok, given that many people cannot make it tomorrow, what about
>>>>>>>    pdp> next week, Thursday 6th October, 16:00 GMT ?
>>>>>>>
>>>>>>> Primavera, I can almost never make Thursdays.  If you would like me
>>>>>>> to attend (and I would like to), MWF work for me, but Wednesday only
>>>>>>> earlier than 16h00, as that clashes with another working group.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> -w
>>>>>>> --
>>>>>>> William Waites                <mailto:ww at styx.org>
>>>>>>> http://river.styx.org/ww/        <sip:ww at styx.org>
>>>>>>> F4B3 39BF E775 CF42 0BAB  3DF0 BE40 A6DF B06F FD45
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> openbiblio-dev mailing list
>>>>>> openbiblio-dev at lists.okfn.org
>>>>>> http://lists.okfn.org/mailman/listinfo/openbiblio-dev
>>>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> Karen Coyle
>> kcoyle at kcoyle.net http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
>>
>>
>



-- 
Karen Coyle
kcoyle at kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet
-------------- next part --------------
Metadata for Primavera's project
I. Goals
1. Make it easier to harvest and process bibliographic metadata from a variety of sources.
2. Define a small set of metadata options (data elements and serializations) that can be used/adopted by data providers.
3. Allow for differences in granularity of the data, but provide best practices that many data providers should be able to achieve.
4. Identify key elements needed for discovery, identification, location, and deduplication.


II. Scope
Initial scope is textual documents:
1. Books and their parts
2. Journal articles
3. Conference proceedings
4. Online texts (e.g. Wikipedia articles, arXiv eprints, technical reports, working papers)


III. Minimum data elements
a. Books
1. creator(s) (one minimum) 
2. title of the book
3. date
4. ISBN 
5. URL if online
b. Book chapters
1. creator(s) (one minimum)
2. title of the chapter
3. start page or start/end pages of the chapter
4. title of the book
5. date of the book
6. ISBN
7. URL if online
c. Journal articles
1. creator(s) (one minimum)
2. title
3. ISSN or full journal name
4. year
5. enumeration: volume, number, start page (as appropriate); substitute the date if no other issue enumeration is available. (Should the minimum require these to be parsed?)
6. URL if online
7. DOI if available (should we ask for any other identifiers?)


d. Online texts
1. creator(s) (one minimum)
2. title 
3. URL
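
As a rough illustration of how a harvester might check a record against the minimum lists above, here is a minimal Python sketch. The field names (creators, chapter_title, journal_name, enumeration, and so on) and the function missing_elements are hypothetical, since no schema or serialization has been chosen yet. Conditional elements ("URL if online", "DOI if available") are left out of the check.

```python
# Hypothetical field names: the guide has not chosen a schema or serialization.
# Each tuple lists alternatives; any one of them satisfies the requirement.
MINIMUM_ELEMENTS = {
    "book": [("creators",), ("title",), ("date",), ("isbn",)],
    "chapter": [
        ("creators",), ("chapter_title",), ("start_page",),
        ("book_title",), ("book_date",), ("isbn",),
    ],
    "article": [
        ("creators",), ("title",), ("issn", "journal_name"),
        ("year",), ("enumeration",),
    ],
    "online_text": [("creators",), ("title",), ("url",)],
}


def missing_elements(kind, record):
    """Return the minimum elements absent from record, e.g. 'issn or journal_name'."""
    missing = []
    for alternatives in MINIMUM_ELEMENTS[kind]:
        # An empty string or empty list counts as absent.
        if not any(record.get(name) for name in alternatives):
            missing.append(" or ".join(alternatives))
    return missing
```

For example, an article record with creators, title, journal_name, and year but no volume, number, or page information would be reported as missing only "enumeration".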


IV. Full data elements (if available; everything beyond minimum is optional)
a. Books
1. creator(s) 
2. title
3. date
4. publisher
5. pagination or number of pages
6. place of publication
7. URL if online
8. ISBN
9. identifier(s)
10. type [We'll need a list, e.g. bibliography, encyclopedia, ...]
11. links
b. Book chapters
1. creator(s) (one minimum)
2. title of the chapter
3. start/end pages of the chapter
4. title of the book
5. date of the book
6. publisher
7. place of publication
8. ISBN
9. identifier(s)
10. URL if online
11. type [ needed? very difficult to provide a list]
12. links
c. Journal articles
1. creator(s)
2. title
3. ISSN
4. eISSN
5. full journal name
6. enumeration: volume, number, start page (as appropriate)/end page, date of publication
7. URL if online
8. type [May need a list, e.g. research, expository, survey, review, abstract, ... Such classification is sometimes provided by publishers.]
9. identifiers
10. links
d. Online texts
1. creator(s)
2. title 
3. URL
4. date accessed
5. date created
6. format: html/pdf/etc. [We'll need a short list to choose from.]
7. type [We'll need a list, e.g. eprint, techreport, encyclopedia_entry, obituary, news_article, review, abstract, ...]
8. links


V. Enhanced access
(this will be things like keywords, abstracts, tables of contents, links to related resources, etc.)
To be decided


Things to Discuss
1. Enumeration: will we need separate elements for volume, issue, etc.? What is the use case for these data elements? (Note, any articles or journals older than about 10 years will not have DOIs or SICIs.)
2. Resource identifiers: Some identifiers, like DOI, are self-contained (e.g. in URI format). Many are not. We probably do not want to have dozens of identifier fields, so we need a format for the data that can go into a single identifier field, e.g. PMID for PubMed items: PMID:12345. Do we need a list of recommended identifiers?
3. Entity identifiers: do we want to accommodate identifiers for persons, places, and other entities? 
4. Creator types: again, a short list (author, editor, reviewer, ...). If unknown, the default can be "creator".
5. Update of bibliographic data: we need some idea of how updates will happen before we can discuss identification and versioning of the metadata itself.
6. Links: for each link, we need at a minimum the URL, and preferably the link text and an indication of the nature of the relation of the link. This comes close to linked data (LD).
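
To make items 2 and 6 a little more concrete, here is a small Python sketch of one possible reading. It assumes a "SCHEME:value" convention for the single identifier field (following the PMID:12345 example) and a link with a required URL plus optional text and relation. The names parse_identifier and Link, and the "uri" marker, are illustrative only and not a proposal for the final format.

```python
from dataclasses import dataclass
from typing import Optional


def parse_identifier(value):
    """Split 'SCHEME:value' into (scheme, value); URI-style identifiers stay whole."""
    value = value.strip()
    # Self-contained identifiers (e.g. a DOI given as a URI) are not split.
    if value.lower().startswith(("http://", "https://", "urn:", "info:")):
        return ("uri", value)
    scheme, sep, rest = value.partition(":")
    if sep and scheme.isalnum() and rest:
        return (scheme.upper(), rest.strip())
    # No recognizable prefix: the scheme is unknown.
    return (None, value)


@dataclass
class Link:
    """A link needs at least a URL; text and relation are preferred but optional."""
    url: str
    text: Optional[str] = None
    relation: Optional[str] = None
```

Under these assumptions, parse_identifier("PMID:12345") gives ("PMID", "12345"), while an identifier that is already a URI is passed through unchanged with the marker "uri".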









