[okfn-labs] New library: Tabular Validator
paulywalsh at gmail.com
Thu Feb 19 14:53:53 UTC 2015
I want to announce a new library I’ve been working on for Open Knowledge.
Tabular Validator (https://github.com/okfn/tabular-validator) is a Python package for validating tabular data through a processing pipeline. It is alpha software.
It is built by Open Knowledge, with funding from the Open Data User Group (https://www.gov.uk/government/groups/open-data-user-group).
Applications range from simple validation checks on CSV files to integration with a larger ETL pipeline.
The codebase currently ships with two validators that can be used in a pipeline:
• The StructureValidator checks for common structural errors.
• The SchemaValidator checks for conformance to a JSON Table Schema.
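To give a rough feel for the two kinds of check, here is a minimal standalone sketch. It does not use Tabular Validator itself: the function names (check_structure, check_schema, run_pipeline) and the sample data are hypothetical, and the schema handling is a much simplified take on JSON Table Schema (only the "fields" list with "name" and "type"). See the documentation for the real API.

```python
import csv
import io

# Hypothetical helpers for illustration only; this is NOT the tabular-validator API.


def check_structure(rows):
    """Report empty rows and rows whose cell count differs from the header."""
    errors = []
    if not rows:
        return errors
    header = rows[0]
    for index, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            errors.append((index, "empty row"))
        elif len(row) != len(header):
            errors.append(
                (index, "expected %d cells, got %d" % (len(header), len(row)))
            )
    return errors


def check_schema(rows, schema):
    """Check that each cell can be cast to the type its field declares."""
    casters = {"integer": int, "number": float, "string": str}
    errors = []
    if not rows:
        return errors
    header = rows[0]
    for index, row in enumerate(rows[1:], start=2):
        for field in schema["fields"]:
            if field["name"] not in header:
                continue
            position = header.index(field["name"])
            if position >= len(row):
                continue  # a missing cell is a structural problem, not a schema one
            try:
                casters[field["type"]](row[position])
            except ValueError:
                errors.append(
                    (index, "%s is not a valid %s" % (field["name"], field["type"]))
                )
    return errors


def run_pipeline(text, schema):
    """Run both checks in order and return one combined report."""
    rows = list(csv.reader(io.StringIO(text)))
    return check_structure(rows) + check_schema(rows, schema)


SAMPLE = "id,name\n1,Ada\nx,Grace\n3\n"
SCHEMA = {
    "fields": [
        {"name": "id", "type": "integer"},
        {"name": "name", "type": "string"},
    ]
}

report = run_pipeline(SAMPLE, SCHEMA)
for line_number, message in report:
    print("line %d: %s" % (line_number, message))
```

Line numbers in the report count the header as line 1, so the short last row is reported as line 4 and the non-integer id as line 3.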
There is a hook to add custom validators, and there are plans to include more validators in the core library.
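As an illustration of the idea only, a custom validator in a pipeline might look something like the sketch below. The Pipeline class, its register method, and the NoDuplicateRows validator are all hypothetical names made up for this example; the actual hook in Tabular Validator may look quite different, so please check the documentation and code.

```python
class Pipeline:
    """Hypothetical pipeline that runs validators in order (not the library's class)."""

    def __init__(self, validators=None):
        self.validators = list(validators or [])

    def register(self, validator):
        # Assumed hook: any object with a run(rows) method can be added.
        self.validators.append(validator)

    def run(self, rows):
        report = []
        for validator in self.validators:
            report.extend(validator.run(rows))
        return report


class NoDuplicateRows:
    """Hypothetical custom validator: flag rows that repeat an earlier row."""

    name = "no-duplicate-rows"

    def run(self, rows):
        seen = {}
        errors = []
        for index, row in enumerate(rows, start=1):
            key = tuple(row)
            if key in seen:
                errors.append(
                    {
                        "validator": self.name,
                        "row": index,
                        "message": "duplicate of row %d" % seen[key],
                    }
                )
            else:
                seen[key] = index
        return errors


pipeline = Pipeline()
pipeline.register(NoDuplicateRows())
result = pipeline.run([["id", "name"], ["1", "Ada"], ["1", "Ada"]])
print(result)
```

Each validator returns a list of error entries, and the pipeline simply concatenates them, so adding a new check does not require touching the existing ones.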
There is some documentation (http://tabular-validator.readthedocs.org/en/latest/), but it is lacking in some areas. You are welcome to check out the code, run the tests (or check them on Travis), open an issue, or make a pull request to help us iterate to a version one release (here is the backlog).
We’ve also released some packages that are used in Tabular Validator: TVWeb (https://github.com/okfn/tabular-validator-web), JTSKit (https://github.com/okfn/jtskit-py), and TellMe (https://github.com/okfn/tellme). You can read more about each of these by following the links. A more complete blog post on the Labs blog will follow shortly.