[okfn-labs] Calendar fields (date/time)

Alioune Dia dia.aliounes at gmail.com
Thu Feb 20 12:19:57 UTC 2014


Hi

I think we can validate something like '2000' as a date, and keep it both
machine- and human-readable.

>>> import datetime
>>> datetime.datetime.strptime("8900", "%y")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python26\lib\_strptime.py", line 328, in _strptime
    data_string[found.end():])
ValueError: unconverted data remains: 00


Python raises an error because the format specified is '%y' (two-digit
year), not '%Y' (four-digit year).

With '%Y', the same value passes:

>>> datetime.datetime.strptime("8900", "%Y")
datetime.datetime(8900, 1, 1, 0, 0)
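
To make the idea concrete, here is a minimal sketch of a year validator built
on the same strptime call. The helper name is_valid_year is my own invention
for illustration, not something defined by datapackages, and I am assuming
that a 'yyyy'-style format corresponds to Python's '%Y'.

```python
import datetime


def is_valid_year(value, fmt="%Y"):
    """Return True if `value` parses completely under `fmt`.

    Hypothetical helper: the default "%Y" is an assumed mapping for a
    'yyyy'-style format, not something the datapackage spec defines.
    """
    try:
        # strptime raises ValueError on a mismatch or on leftover characters.
        datetime.datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True
```

With this, is_valid_year("2000") is accepted, while is_valid_year("8900", "%y")
is rejected for the same reason as in the traceback above.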


So my question is: what does 'format: YYYY' mean in datapackages? Is it a
format that will be used for conversion/validation in a programming
language?



2014-02-20 10:26 GMT+01:00 Michael Bauer <michael.bauer at okfn.org>:

> Stefan,
>
> On Wed, Feb 19, 2014 at 11:58:03PM +0100, Stefan Urbanek wrote:
> > PROBLEM
> >
> > In the bond-yields-uk-10y dataset[1] there is:
> >
> >           {
> >             "id": "Year",
> >             "type": "date",
> >             "format": "yyyy"
> >           },
> >
> > What is the problem? The field in the data file contains an integer. The
> > metadata says it is "date".
>
> I've stumbled across this as well.
>
> > DESCRIPTION
> >
> > From parsing and validating point of view, the field metadata as
> > specified above are correct. We (humans) know that it is date (in fact,
> > part of a date). We also might have a validator (of one type) for the
> > format.
> >
> > From data processing it is not easy to handle this case:
> >
> > 1. there is no standard way of specifying not-specific date (at higher
> > granularity than a day). There are many ways of doing it, but that's
> > another topic.
> >
> > 2. the metadata says that it is a date, but it can not be converted to a
> > valid system date datatype - neither in Python nor in SQL and I doubt about
> > other environments as well. Reason is the point number 1.
> >
> > 3. what format is the format? ... but that's another issue, I'm not
> > going into more depth here.
> >
> > From machine process-ability it is just an integer or a string. As
> > '2012-04' is just a string, not april. `2012` is not `January 1st 2012`,
> > `2012-04` is not `April 1st 2012`. There are datasets with mixed date
> > granularity: month level and daily levels for example - all in the same
> > dataset, with same fields. If we just converted every date-like string to a
> > full date, we would not be able to distinguish "whole year 2012" from
> > "April 2012" from "April 1st 2012".
>
> Agree! I've ran into this a couple of times. In that case I'd probably not
> even use a date type - but a simple integer for the year. and month and
> have different columns for them.
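
One way to keep the granularity instead of padding everything to a full date
could look like the sketch below. It is only an illustration: the function
name parse_partial_date and the three accepted formats are assumptions of
mine, not anything specified for datapackages.

```python
import datetime

# Assumed formats, tried from coarsest to finest granularity.
FORMATS = [("year", "%Y"), ("month", "%Y-%m"), ("day", "%Y-%m-%d")]


def parse_partial_date(text):
    """Return (granularity, year, month, day); unknown parts are None."""
    for granularity, fmt in FORMATS:
        try:
            parsed = datetime.datetime.strptime(text, fmt)
        except ValueError:
            continue  # this format does not fit, try the next one
        # Discard the defaults strptime fills in for parts not in the input.
        month = parsed.month if granularity != "year" else None
        day = parsed.day if granularity == "day" else None
        return (granularity, parsed.year, month, day)
    raise ValueError("not a recognised partial date: %r" % text)
```

Here '2012', '2012-04' and '2012-04-01' stay distinguishable, and the year
and month come back as separate values that could go into separate columns.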
>
> > Example practice from data warehouses: the quite common way of modelling
> > date dimension is to use integer to specify a date: `YYYYMMDD`. If the
> > `MMDD` part is 0000 as in "20140000" then the value corresponds either to
> > whole year or unspecified part of the year (we can't tell without more
> > information). Similar for `DD` == `00`. The values for "unspecified" might
> > also be `9999` instead of `0000` for sorting purposes, as in "20149999".
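
The integer key described above can be unpacked with plain arithmetic. This
is a rough sketch only: decode_date_key is a hypothetical name, and treating
both 00 and 99 as "unspecified" is an assumption based on the two conventions
mentioned. It does not check that the month or day is in a valid range.

```python
def decode_date_key(key, unspecified=(0, 99)):
    """Split a YYYYMMDD integer into (year, month, day).

    Month or day values listed in `unspecified` (assumed: 00 and 99)
    are returned as None.
    """
    year, rest = divmod(key, 10000)   # 20140401 -> 2014, 401
    month, day = divmod(rest, 100)    # 401 -> 4, 1
    month = None if month in unspecified else month
    day = None if day in unspecified else day
    return (year, month, day)
```

For example, both 20140000 and 20149999 decode to a year with no month or
day, while 20140400 keeps the month and drops only the day.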
>
> Michael
>
> --
> Data Diva | skype: mihi_tr | @mihi_tr
> The Open Knowledge Foundation | School of Data
> http://okfn.org | http://schoolofdata.org
> GPG/PGP key: http://tentacleriot.eu/mihi.asc