[wdmmg-dev] Offener Haushalt data audit example and mongoaudit tool
    Stefan Urbanek 
    stefan at knowerce.com
       
    Wed Dec 15 18:10:38 UTC 2010
    
    
  
Hi,
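To better understand the wdmmg data, I've created a MongoDB auditing tool. For each field in a collection it reports the storage type, how many records contain the field, how many values are null or empty strings, and (for fields with few distinct values) the distinct values themselves. The tool can produce output like this: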
pdf_link:
	storage type: unicode
	present values: 22 (95.65%)
	null: 0 (0.00% of records, 0.00% of values)
	empty strings: 0
...
flow:
	storage type: unicode
	present values: 19248 (100.00%)
	null: 0 (0.00% of records, 0.00% of values)
	empty strings: 0
	distinct values:
		'spending'
		'income'
More examples:
	http://democracyfarm.org/f/wdmmg/mongoaudit/
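To make the numbers above easier to interpret, here is a minimal sketch of how such per-field statistics could be computed. It is not the actual mongoaudit code from brewery: the function name audit_field, the returned keys, and the exact definitions (a value is "present" when the key exists in the record, distinct values are listed only when their count does not exceed the threshold) are assumptions made for illustration. It works on a plain list of dicts, so no MongoDB connection is needed, and it assumes field values are hashable scalars.

```python
def audit_field(records, field, threshold=10):
    """Summarise one field across a list of dict records.

    Hypothetical helper, not part of brewery or mongoaudit.
    """
    total = len(records)
    # A value counts as "present" when the key exists, even if it is None.
    values = [r[field] for r in records if field in r]
    present = len(values)
    nulls = sum(1 for v in values if v is None)
    empty = sum(1 for v in values if v == "")
    # Assumption: nulls are not counted as a distinct value.
    distinct = {v for v in values if v is not None}

    return {
        "storage_types": sorted({type(v).__name__ for v in values}),
        "present": present,
        "present_pct": 100.0 * present / total if total else 0.0,
        "null": nulls,
        "empty_strings": empty,
        # List distinct values only for low-cardinality fields.
        "distinct": sorted(distinct, key=repr)
        if len(distinct) <= threshold
        else None,
    }


if __name__ == "__main__":
    sample = [
        {"flow": "spending", "pdf_link": "a.pdf"},
        {"flow": "income"},
    ]
    for name in ("flow", "pdf_link"):
        print(name, audit_field(sample, name))
```

Under these assumptions, a field like flow with only two distinct values gets them listed, while a high-cardinality field would report None for its distinct values.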
If anyone would like to try it on other datasets (such as the wdmmg UK dataset), here is the source:
	https://github.com/Stiivi/brewery-py
Install it with "python setup.py install" and you will get the tool:
usage: mongoaudit [-h] [-H HOST] [-p PORT] [-t THRESHOLD] [-f {text,json}]
                  database collection
Audit a MongoDB collection
positional arguments:
  database
  collection
optional arguments:
  -h, --help            show this help message and exit
  -H HOST, --host HOST  host
  -p PORT, --port PORT  port
  -t THRESHOLD, --threshold THRESHOLD
                        threshold for number of distinct values (default is
                        10)
  -f {text,json}, --format {text,json}
                        output format (default is text)
Examples:
	mongoaudit wdmmg entities
	mongoaudit --format json wdmmg entries
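If you want to call the tool from a script, the command line can be assembled from the options listed in the usage text above. The helper below is only a sketch: the function name mongoaudit_command and its keyword arguments are made up for illustration, and only the flags shown in the help output are used. It builds the argument list and does not run anything itself.

```python
def mongoaudit_command(database, collection, host=None, port=None,
                       threshold=None, output_format=None):
    """Build an argument list for the mongoaudit command-line tool.

    Hypothetical helper; it only mirrors the flags from the usage text.
    """
    cmd = ["mongoaudit"]
    if host is not None:
        cmd += ["--host", str(host)]
    if port is not None:
        cmd += ["--port", str(port)]
    if threshold is not None:
        cmd += ["--threshold", str(threshold)]
    if output_format is not None:
        # The usage text lists exactly these two choices.
        if output_format not in ("text", "json"):
            raise ValueError("output_format must be 'text' or 'json'")
        cmd += ["--format", output_format]
    # Positional arguments come last: database, then collection.
    cmd += [database, collection]
    return cmd


if __name__ == "__main__":
    print(" ".join(mongoaudit_command("wdmmg", "entries",
                                      output_format="json")))
```

The resulting list can be passed to subprocess.run(cmd, capture_output=True, text=True), provided mongoaudit is installed and a MongoDB server is reachable. The structure of the JSON output is not described here, so the sketch deliberately stops short of parsing it.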
If you need any help, or if you have any suggestions or questions, let me know.
Stefan
freelance consultant, analyst
knowerce
http://www.knowerce.sk