[pd-discuss] Activities for Public Domain Day 2013

Wed Jan 2 22:22:14 UTC 2013

On Wed, Jan 2, 2013 at 7:59 AM, Tom Morris <tfmorris at gmail.com> wrote:

> I have a list of almost 800 people reconciled with Freebase which
> represents a combination of the original Freebase lists that I did for
> Adrian plus all the entries for the authorandbookinfo.com that I was
> able to reconcile with Freebase.  There are another 400+ people from
> authorandbookinfo plus crowdsourced contributions which don't
> currently have Freebase or English Wikipedia entries.
>

Nice.

Is there a canonical place to work on shared queries and data?

On Wed, Jan 2, 2013 at 12:00 AM, Samuel Klein <meta.sj at gmail.com> wrote:
> > +1, and happy new year to all.  we could use a query that excludes works
> > that were already PD.   I asked some friends who work a lot with freebase
> > if they had ideas for helping further weed out translations.
>
> I've worked with Freebase for 4 years and provided the query Adrian
> used.  The Freebase schema has explicit support for translations, but
> because most source data (ie MARC records) doesn't make it easy to
> identify translations, the schema isn't well populated.
>

Awesome.  So we should fix this for the authors on the list, as part of the
PD celebration.
What's the simplest way to update the Freebase schema as I work?

> I would just make a blanket statement saying "doesn't include
> translations," but if you wanted to make an attempt to identify them
> explicitly, my suggestions would be:
>

I was imagining including general guidance, noting copyright pitfalls and
potential sources of error.
(An author or work could be misidentified, a translation missed, &c.)

> Personally, I wouldn't even try to be authoritative and would instead
> put the onus on the reader to make sure they are in compliance with
> the laws of their jurisdiction.
>

At the end of the day, the extent to which we've contributed to the public
domain is directly tied to how authoritative and comprehensive the
information we provide is.

Tweeting "Works by some authors who died in 1942 are now PD: read these
laws and logs."  is a bit useful to some people.

Publishing a full dataset of {work, authors, country, date entering PD},
for every combination of {work, country} with a date sometime in 2013,
is more useful to more people.

Publishing such a dataset in a repository that allows public annotation and
updates would be more useful still, and would set a fine standard (as well
as a template for processing such metadata for works before they time out).
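
To make the {work, authors, country, date entering PD} idea concrete, here is
a rough sketch of how such rows might be generated.  Everything in it is an
assumption for illustration: the names (PDEntry, TERM_YEARS, entries_for_year),
the placeholder country codes "XX" and "YY", and above all the simplification
that each country applies a flat life-plus-N-years term.  Real rules have
exceptions, so this is a starting template rather than anything authoritative.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical, simplified terms: years of protection after the author's death.
# "XX" and "YY" are placeholder country codes, not real jurisdictions.
TERM_YEARS = {"XX": 70, "YY": 50}


@dataclass(frozen=True)
class PDEntry:
    work: str
    authors: tuple
    country: str
    pd_date: date


def pd_entry_date(death_year: int, term_years: int) -> date:
    """Assume the term runs to the end of the calendar year,
    so the work enters the public domain on the following January 1."""
    return date(death_year + term_years + 1, 1, 1)


def entries_for_year(works, year):
    """Build one row per {work, country} whose PD date falls in `year`.

    `works` is an iterable of (title, authors, death_year_of_last_author).
    """
    rows = []
    for title, authors, death_year in works:
        for country, term in TERM_YEARS.items():
            entered = pd_entry_date(death_year, term)
            if entered.year == year:
                rows.append(PDEntry(title, tuple(authors), country, entered))
    return rows


if __name__ == "__main__":
    sample = [("Example Work", ["Example Author"], 1942)]
    for row in entries_for_year(sample, 2013):
        print(row)
```

Under these assumptions an author who died in 1942 yields a row only for the
life+70 placeholder country, dated January 1, 2013.  Translations, joint
authorship, and per-country exceptions would each need their own handling.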

> If you did want to attempt to individually identify copyright clear
> volumes, another source of information is the Hathi data since they
> individually clear each volume with human review.
>

Great point.  Do you know how to query that data for the above info?  Do
they offer jurisdiction breakdowns on copyright status, or a window into
their own deliberations?  Something like
"suspected to be PD: yes.  confirmed: no."
"date suspected of becoming PD: March 14, 2013 (EU), Jan 1, 2019 (US), ..."

SJ