[datacatalogs] Proposal: migrating datacatalogs.org to a new simpler setup

Andrew Ferlitsch aferlitsch at gmail.com
Tue Jul 22 17:38:22 UTC 2014


I thought I'd mention that I made an update to my data catalog (
http://www.opengeocode.org/opendata/). I refined my classification of
portal types to distinguish 'general' open data portals from
transparency portals (government spending), census and other statistics
portals, and GIS and gazetteer portals. These seem to be the big four types.

I've added some radio buttons to view the catalog by portal type. My CSV
download of the list still downloads the whole catalog. I plan to update it
to download just the catalog for the selected portal type.
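
A minimal sketch of what such a per-type download could look like, assuming
the catalog CSV has a column naming each entry's portal type. The column
name `portal_type`, the function name, and the sample rows are hypothetical
and are not taken from the actual catalog file:

```python
import csv
import io


def filter_catalog(csv_text, portal_type, column="portal_type"):
    """Return CSV text containing only the rows of the selected portal type.

    `column` is an assumed header name; adjust it to the real catalog file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise ValueError("CSV has no column named %r" % column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    wanted = portal_type.strip().lower()
    for row in reader:
        # Compare case-insensitively so 'GIS' and 'gis' select the same rows.
        if (row[column] or "").strip().lower() == wanted:
            writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    sample = (
        "name,url,portal_type\n"
        "Example City Data,http://data.example.org,general\n"
        "Example Spending,http://spending.example.org,transparency\n"
    )
    print(filter_catalog(sample, "transparency"), end="")
```

The header row is always written, so a download for a type with no entries
is still a valid (empty) CSV rather than a blank file.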

Andrew Ferlitsch
Co-Founder, OpenGeoCode.Org


On Tue, Jul 22, 2014 at 4:32 AM, Thomas Levine <_ at thomaslevine.com> wrote:

> The only thing I'd like with datacatalogs.org
> is for the data catalogs to be correct. But I
> don't care that much.
>
> I could set OpenPrism and pluplusch
> http://openprism.thomaslevine.com
> https://pypi.python.org/pypi/pluplusch
>
> to load their catalogs lists from some
> magic CSV file or whatever. Would this
> increase anyone's motivation to keep the
> information up to date?
>
> On 22 Jul 01:10, Philip Ashlock wrote:
> > So it seems that we (myself included) probably don't have enough time or
> > motivation for building or maintaining something with extra attention to
> > usability, and maybe that's not actually a huge problem; we'll see. Where
> > are we in terms of just using git or Google Docs? Looks like the CSV in
> > GitHub hasn't been updated for about a year. As for Google Docs, has
> > anyone tried any of the more complex collaborative arrangements with
> > Google Fusion Tables? Seems like one option could be segmenting editing
> > rights per region or something (e.g. Canada), if that adds any value -
> > https://support.google.com/fusiontables/answer/2584135?hl=en
> >
> >
> > On Sun, Jun 22, 2014 at 9:20 PM, James McKinney <james at opennorth.ca> wrote:
> >
> > > Yeah, it sounds like a fair bit of work, especially for anyone
> > > unfamiliar with the current deployment.
> > >
> > > For my own selfish needs, I just need to easily edit and export the
> > > Canadian catalogs, which the new solution promises to do.
> > >
> > > James
> > >
> > > On Jun 9, 2014, at 7:30 AM, Rufus Pollock <rufus.pollock at okfn.org> wrote:
> > >
> > > On 6 June 2014 21:05, James McKinney <james at opennorth.ca> wrote:
> > >
> > >> Can we make a list of what’s wrong with the current datacatalogs.org,
> > >> and what would need to change for it to be satisfactory? I don’t think
> > >> we’ve yet described explicitly what’s bad about the current website.
> > >>
> > >
> > > Here are a couple of examples:
> > >
> > > A. Easily adding new fields to catalog entries e.g.
> > >
> > > - Lon/Lat
> > > - Size
> > > - Start Date
> > > - Official status
> > >
> > > Then using these in faceting.
> > >
> > > B. Modifying the theme or specific pages (e.g. adding a map on the
> > > front page)
> > >
> > > C. Moving to free or close to free hosting (the new app, as a very
> > > lightweight Node.js app, can run on the Heroku free tier).
> > >
> > > To give a bit of context: the current site was built on CKAN with a
> > > custom extension / theme in 2011 and has seen marginal updates. CKAN
> > > is awesome but is much more sophisticated than we probably need here.
> > > Moreover, because the CKAN extension is so old, quite a bit of work
> > > would be needed to get it upgraded (ready for doing further
> > > modifications).
> > >
> > > Rufus
> > >
> > > Once we have a list, it will be easier to commit to saying “I will help
> > >> close issue X”.
> > >>
> > >> James
> > >>
> > >> On Jun 6, 2014, at 3:56 PM, Andrew Ferlitsch <aferlitsch at gmail.com>
> > >> wrote:
> > >>
> > >> This is my first time responding to a thread on this mailing list.
> > >> First, I'd like to say thanks for the plug on the pretty simple user
> > >> interface I have on my catalog of open data portals. I upgraded it a
> > >> little to make it easier to view (more tabular) and sortable (radio
> > >> selection boxes). One problem I do have with the no-login requirement
> > >> for submission is that I get at least one spam submission a day
> > >> (argh). The user (or bot) is using obfuscated URLs, so I can't detect
> > >> them automatically by keywords. I will need to start tracking IP
> > >> addresses and put in place protection from MySQL injection.
> > >>
> > >> For this purpose, a simple CSV file for both submission collection and
> > >> for the catalog works well for me. My whole user interface is
> > >> auto-constructed from a PHP script that reads the CSV file.
> > >>
> > >> I can understand your concerns about editing. I took a similar tack
> > >> here. I put together a generic editing form, which is then populated
> > >> by a PHP script from the same CSV file. The text and dropdown boxes
> > >> allow me to modify values from the form and then resave back to the
> > >> CSV file. I press the MAKE INDEX button and the catalog is fully
> > >> reconstructed. Below is a screenshot of the editing form.
> > >>
> > >> Andrew Ferlitsch
> > >> Co-Founder, opengeocode.org
> > >>
> > >>
> > >> On Fri, Jun 6, 2014 at 2:50 AM, Rufus Pollock <rufus.pollock at okfn.org>
> > >> wrote:
> > >>
> > >>> On 2 June 2014 21:52, James McKinney <james at opennorth.ca> wrote:
> > >>>
> > >>>> The features you describe are more-or-less all on the current
> > >>>> datacatalogs.org. It’s just that datacatalogs.org has accumulated
> > >>>> technical debt.
> > >>>>
> > >>>> It seems that Rufus wants to solve the technical debt by rewriting
> > >>>> it as a thin git-based system that throws away all the features you
> > >>>> mention.
> > >>>>
> > >>>
> > >>> I feel we would get a lot of mileage with the Google Spreadsheets
> > >>> option and deliver most of the other features.
> > >>>
> > >>>
> > >>>> I would agree that it’s better to keep the features, and to just pay
> > >>>> the price of the debt…
> > >>>>
> > >>>
> > >>> Are the folks out there willing to help manage that debt (along with
> > >>> me)?
> > >>>
> > >>> Rufus
> > >>>
> > >>>
> > >>>>
> > >>>> James
> > >>>>
> > >>>>
> > >>>> On Jun 2, 2014, at 4:43 PM, Philip Ashlock <phil at civicagency.org>
> > >>>> wrote:
> > >>>>
> > >>>> For most users I don't think git or Google Spreadsheets would be
> > >>>> simpler or very useful, but maybe the "simpler" was only referring
> > >>>> to maintaining the site. I agree with James' criteria, although I
> > >>>> might rank good search/API interfaces higher than versioning. GitHub
> > >>>> certainly doesn't provide interfaces for adding or editing that are
> > >>>> very user-friendly for managing CSV data, and Google Spreadsheets
> > >>>> (or any spreadsheet interface) isn't very usable either. I guess a
> > >>>> Google spreadsheet form would provide a minimal level of usability,
> > >>>> but that would only work for submissions, not edits.
> > >>>>
> > >>>> I think we'd be better off with a traditional CRUD app with a well
> > >>>> designed UI for submissions and edits than either of those options,
> > >>>> but if you wanted git functionality you could provide bi-directional
> > >>>> sync to GitHub and treat the GitHub copy as canonical. I'd still
> > >>>> want a basic API though.
> > >>>>
> > >>>> For some recent precedents for doing bidirectional GitHub sync with
> > >>>> a CMS see:
> > >>>> https://konklone.com/post/writing-in-public-syncing-with-github
> > >>>> https://github.com/benbalter/wordpress-github-sync
> > >>>>
> > >>>> For me the ideal would be:
> > >>>>
> > >>>>
> > >>>>    - Submissions could be made without a user account but they get
> > >>>>    moderated. First via Akismet for spam filtering and then by human
> > >>>>    verification. Unmoderated submissions could still be public but
> > >>>>    with mechanisms to reduce abuse (e.g. on a separate URL blocked
> > >>>>    from search indexes with robots.txt and without any URLs being
> > >>>>    linked)
> > >>>>    - Edits could be made through a similar process or directly with
> > >>>>    approved user accounts
> > >>>>    - Everything would be accessible via full text search as well as
> > >>>>    an API with basic filtering options
> > >>>>    - GitHub syncing could be an optional alternative way to make
> > >>>>    submissions/edits
> > >>>>
> > >>>> For what it's worth, it looks like
> > >>>> http://www.opengeocode.org/opendata/ provides a pretty simple
> > >>>> interface and currently appears more comprehensive than
> > >>>> datacatalogs.org. There's also a list of other precedents at
> > >>>> http://wiki.civiccommons.org/Initiatives#Comprehensive_Lists_of_Open_Government_Data_Catalogs
> > >>>> though many have been abandoned
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Fri, May 9, 2014 at 1:19 PM, Rufus Pollock <rufus.pollock at okfn.org>
> > >>>> wrote:
> > >>>>
> > >>>>> On 9 May 2014 18:02, Ross Jones <ross at servercode.co.uk> wrote:
> > >>>>>
> > >>>>>>
> > >>>>>> On 9 May 2014, at 15:05, Rufus Pollock <rufus.pollock at okfn.org>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>> *Running Code*
> > >>>>>>
> > >>>>>> I'm able to put my money where my mouth is here :-) I have a
> > >>>>>> running demo:
> > >>>>>>
> > >>>>>> http://new.datacatalogs.org/
> > >>>>>>
> > >>>>>>
> > >>>>>> http://new.datacatalogs.org/catalog/caib_es has an error that
> > >>>>>> https://github.com/okfn/datacatalogs.org/pull/20 fixes. Needs
> > >>>>>> more metadata on the detail page.
> > >>>>>>
> > >>>>>
> > >>>>> Thanks for the fix, now deployed.
> > >>>>>
> > >>>>>> Should also root out dead portals, there are one or two, should be
> > >>>>>> marked as dead rather than removed I guess.
> > >>>>>>
> > >>>>>
> > >>>>> Nice to get version control in before we do that properly ... (plus
> > >>>>> I need to pull the latest set from live datacatalogs.org)
> > >>>>>
> > >>>>> Rufus
> > >>>>>
> > >>>>>>
> > >>>>>> Ross
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Rufus Pollock, Founder and President | skype: rufuspollock |
> > >>>>> @rufuspollock <https://twitter.com/rufuspollock>
> > >>>>> Open Knowledge <http://okfn.org/> - see how data can change the world
> > >>>>> http://okfn.org/ | @okfn <http://twitter.com/OKFN> |
> > >>>>> Open Knowledge on Facebook <https://www.facebook.com/OKFNetwork> |
> > >>>>> Blog <http://blog.okfn.org/>
> > >>>>>
> > >>>>> The Open Knowledge Foundation is a not-for-profit organisation. It is
> > >>>>> incorporated in England & Wales as a company limited by guarantee,
> > >>>>> with company number 05133759. VAT Registration № GB 984404989.
> > >>>>> Registered office address: Open Knowledge Foundation, St John’s
> > >>>>> Innovation Centre, Cowley Road, Cambridge, CB4 0WS, UK.
> > >>>>>
> > >>>>> _______________________________________________
> > >>>>> data-catalogs mailing list
> > >>>>> data-catalogs at lists.okfn.org
> > >>>>> https://lists.okfn.org/mailman/listinfo/data-catalogs
> > >>>>> Unsubscribe: https://lists.okfn.org/mailman/options/data-catalogs
> > >>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>
> > >>>
> > >> <edit.jpg>
> > >>
> > >>
> > >>
> > >
> > >
>
>