[ckan-dev] Decision on form framework

Rufus Pollock rufus.pollock at okfn.org
Thu Feb 24 10:14:27 UTC 2011


Thanks to Seb's excellent research and evaluation work we now have a
decision on data serialization and form stuff:

<http://trac.ckan.org/ticket/926>

Summary is below -- this is probably worthy of a blog post but that can wait :)

Rufus

### Goals

We want the interface for updating an object to be loosely coupled to
the method for updating it.

We might update a Package from:

- HTML forms
- a REST API (using JSON)
- a CLI (potentially using command line arguments, YAML, XML or ini
files)

Right now, data is validated using a form framework, even if we're not
using forms.  Data is written to the object as part of the forms
framework (using the "sync()" method), making the process hard to
customise and hard to discover.

Instead, there should be a standard chain for:

- deserialising untyped data (such as that received from an HTTP POST
 or parsed from a YAML file) into valid data
- returning structured errors suitable for displaying to the user
- saving the validated, deserialised data

Ideally, it would look something like:

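    schema = MySchemaDefinition()
    raw_data = open("raw.csv", "r").read()
    # deserialise the untyped input into a python structure
    structured_data = to_python(raw_data, schema)
    try:
        validated = validate(structured_data)
        # only validated data ever reaches the object
        myobject.update_from_dict(validated)
        return "Updated OK"
    except ValidationError as e:
        # structured errors, suitable for showing to the user
        return "Error: %s" % e.to_dict()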
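
A minimal, self-contained sketch of that chain, using only the standard
library.  The names `to_python`, `validate` and `ValidationError` mirror
the pseudocode above but are hypothetical here, not taken from any of the
libraries evaluated.  The schema format (a dict of field name to
converter), the extra `schema` argument to `validate` and the one-record
CSV input are assumptions made purely for illustration.

```python
import csv
import io


class ValidationError(Exception):
    """Carries a mapping of field name -> error message."""

    def __init__(self, errors):
        Exception.__init__(self, "validation failed")
        self.errors = errors

    def to_dict(self):
        return dict(self.errors)


# Hypothetical schema format: field name -> callable converting a string.
PACKAGE_SCHEMA = {"name": str, "version": int}


def to_python(raw_csv, schema):
    """Parse a one-record CSV (header row + data row) into a dict of strings."""
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    return {key: rows[0].get(key, "") for key in schema}


def validate(data, schema):
    """Convert every field, collecting all failures instead of stopping early."""
    clean, errors = {}, {}
    for key, convert in schema.items():
        try:
            clean[key] = convert(data[key])
        except (TypeError, ValueError) as exc:
            errors[key] = str(exc)
    if errors:
        raise ValidationError(errors)
    return clean


def update(obj, raw_csv, schema=PACKAGE_SCHEMA):
    """Run the whole chain; `obj` is a plain dict standing in for a Package."""
    structured = to_python(raw_csv, schema)
    try:
        obj.update(validate(structured, schema))
        return "Updated OK"
    except ValidationError as e:
        return "Error: %s" % e.to_dict()
```

Nothing in `update` knows where the raw data came from, so the same
function could sit behind an HTML form, a REST call or a CLI.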

The inverse would be something like:

    structured_data = myobject.render_to_dict()
    raw_data.write(to_csv(structured_data, schema))
    print("Wrote CSV %s" % to_logformat(structured_data, schema))

The question of how to generate and display forms should be completely
decoupled from this.  It should be easy to write forms by hand, which
means it should be simple to flatten the serialized data to key, value
pairs, and match up any validation errors to each key.
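
As a rough illustration of that flattening step, here is a stdlib-only
sketch.  The dotted-path key convention and the helper names `flatten`
and `form_rows` are assumptions for the example, not something the
ticket prescribes; it also assumes errors come back as a nested dict
shaped like the data.

```python
def flatten(data, prefix=""):
    """Flatten nested dicts/lists into {"a.0.b": value} pairs."""
    if isinstance(data, dict):
        items = data.items()
    elif isinstance(data, list):
        items = enumerate(data)
    else:
        return {prefix: data}
    flat = {}
    for key, value in items:
        path = "%s.%s" % (prefix, key) if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


def form_rows(data, errors):
    """Pair each flattened key with its value and any error for that key."""
    flat_errors = flatten(errors)
    return [(key, value, flat_errors.get(key))
            for key, value in sorted(flatten(data).items())]
```

A hand-written template can then loop over `form_rows(...)` and render
one input per key, with the error message (if any) next to it.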

Optionally, a form widget generation framework is a nice-to-have, but
not essential, as it is expected that, given enough time, the majority
of forms will require manual coding to accommodate edge conditions.

A form widget generation framework should be reasonably complete if
it's worth trying at all, which means it should support things like:

- nested fields (at least repeating, multi-value fieldsets)
- widgets for dates and file uploads
- internationalisation

... but note I'd settle for *no* widget generation

### Components of a serialisation / validation framework

- a simple, obvious way to define a schema
- a lightweight validation implementation
 - simple interface for validators
 - easy to match validation errors to data structure items
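
To make those last two points concrete, a sketch of what a "simple
interface for validators" might look like: each validator is just a
callable that raises `ValueError`, and errors come back in a dict that
mirrors the data structure.  Everything here (`not_empty`, `max_length`,
`collect_errors`, the schema layout) is hypothetical and is not the API
of any library mentioned in this mail.

```python
def not_empty(value):
    if value in (None, ""):
        raise ValueError("Missing value")


def max_length(limit):
    def check(value):
        if len(value) > limit:
            raise ValueError("Longer than %d characters" % limit)
    return check


# Hypothetical schema: nested dicts whose leaves are lists of validators.
PACKAGE_SCHEMA = {
    "name": [not_empty, max_length(10)],
    "author": {"email": [not_empty]},
}


def collect_errors(data, schema):
    """Return a dict of errors shaped like `schema`; empty if data is valid."""
    errors = {}
    for key, rule in schema.items():
        value = data.get(key)
        if isinstance(rule, dict):
            # Recurse into nested mappings, keeping the same shape.
            nested = collect_errors(value or {}, rule)
            if nested:
                errors[key] = nested
            continue
        for validator in rule:
            try:
                validator(value)
            except ValueError as exc:
                # Report the first failure per field and move on.
                errors[key] = str(exc)
                break
    return errors
```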

Overall, I'd like to see:

- loose coupling, no framework dependencies
- maximal test coverage
- extensive documentation with readily available examples

### Findings

I looked at flatland, formencode, FormAlchemy, formish, WTForms, Django,
web2py, deform/colander, formconvert and web.py.

- **web2py** just helps build HTML from python, so isn't what I'm after
at all
- **web.py** has rudimentary validation which is only aimed at HTML forms
and is hence tightly coupled with them.
- **Django**'s forms are again tightly coupled to HTML forms (and their
generation)
- **FormAlchemy** similarly couples validation to forms, and is focussed
on inferring a schema from a SQLAlchemy data model.
- **WTForms** again focuses on form generation and doesn't make it easy to
deserialise arbitrary data

This leaves us with Flatland, Formencode, Formish,
Colander/Peppercorn/Deform, and FormConvert.

Having reviewed all of these, I rejected Formencode on the basis of its
patchy documentation and relatively low unit test coverage.  I also found
it mixed concerns a bit much for my taste.

Formish felt similarly sparsely documented.

Of the remainder, I'd be happy using any of them, but opted for Colander
in the end as it has the most exhaustive documentation and unit tests and
has been used in production for a long time.  FormConvert has a nice
design but is a bit of a moving target at the moment -- worth revisiting
in the future.



