[ckan-dev] API Performance

Ross Thompson ross.thompson.ca at gmail.com
Tue Jun 25 13:23:03 UTC 2013


While there have certainly been performance improvements, we are still
finding the API somewhat slow for editing and creating datasets, and it
seems to be getting slower as the number of revisions grows.

While we used some of the techniques mentioned in the wiki, the biggest
help was Ian W's dataload utility, which let us take advantage of our
multi-core servers. But doing bulk updates, dropping indexes, etc. is
problematic, or at least awkward, on a live site, so what would really help
is better performance in the API itself.

Thanks,


On 25 June 2013 06:57, Adrià Mercader <adria.mercader at okfn.org> wrote:

> (sorry sent the email before finishing)
>
> The discussion I was mentioning:
>
> https://github.com/okfn/ckan/issues/681
>
> Adrià
>
>
>
> On 25 June 2013 11:55, Adrià Mercader <adria.mercader at okfn.org> wrote:
> > Hi,
> >
> > Just tangentially related, I booted up a wiki page with performance
> > tips for large imports:
> >
> > https://github.com/okfn/ckan/wiki/Performance-tips-for-large-imports
> >
> > This is probably not relevant to the case that you mention, but the
> > points on the wiki were raised during a very interesting discussion
> > started by Ian Ward that can shed some light on possible bottlenecks:
> >
> >
> >
> >
> > On 25 June 2013 11:15, Toby Dacre <toby.okfn at gmail.com> wrote:
> >> On 25 June 2013 10:50, Ross Jones <ross at servercode.co.uk> wrote:
> >>> Hi Toby,
> >>>
> >>> As Ross (T) raised API performance in another thread, I thought I'd
> >>> ask - is anything happening about API performance generally? Read as
> >>> well as write.
> >>>
> >>> The APIController can be quite memory hungry when returning large
> >>> results, and I've had some small success in using a generator instead
> >>> of passing around large strings. I imagine that there might be an
> >>> efficient way of using a generator to return each item of the results
> >>> (as it is encoded) rather than encoding the whole blob as JSON in
> >>> memory.
> >>
> >> This does sound like something we should be looking at.  Maybe it is a
> >> paging issue, as we shouldn't be returning so much data that holding it
> >> in memory becomes a problem.  Are there particular API calls that are
> >> really bad?
> >>
> >> Much of the performance problem as I see it is that we often get data
> >> and then throw it away, e.g. getting a list of items will fetch the whole
> >> items and then do `return [x['id'] for x in items]`
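
To illustrate the "get data and then throw it away" pattern, here is a
small self-contained sketch using an in-memory SQLite database. The
`dataset` table and both function names are hypothetical and only stand in
for whatever the real queries look like; the point is the difference
between loading whole rows and asking the database for the ids alone.

```python
import sqlite3


def make_demo_db():
    """Create a throwaway in-memory table with a few wide rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dataset (id TEXT PRIMARY KEY, title TEXT, notes TEXT)"
    )
    rows = [(str(i), "Dataset %d" % i, "x" * 1000) for i in range(5)]
    conn.executemany("INSERT INTO dataset VALUES (?, ?, ?)", rows)
    return conn


def list_ids_wasteful(conn):
    """Load every column of every row, then keep only the id."""
    cursor = conn.execute("SELECT id, title, notes FROM dataset ORDER BY id")
    items = [dict(zip(("id", "title", "notes"), row)) for row in cursor]
    return [x["id"] for x in items]


def list_ids_lean(conn):
    """Ask the database for the id column only."""
    cursor = conn.execute("SELECT id FROM dataset ORDER BY id")
    return [row[0] for row in cursor]
```

Both functions return the same list; the second simply never transfers or
builds the data that would be discarded anyway.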
> >>
> >>>
> >>> Ross (J).
> >>>
> >>>
> >>> On 24 June 2013 16:58, Ross Thompson <ross.thompson.ca at gmail.com> wrote:
> >>>> 2. Performance issues when dealing with 180000+ datasets. Loading and
> >>>> updating large numbers of datasets through the API can take a
> >>>> weekend.
> >>>
> >>>
> >>> _______________________________________________
> >>> ckan-dev mailing list
> >>> ckan-dev at lists.okfn.org
> >>> http://lists.okfn.org/mailman/listinfo/ckan-dev
> >>> Unsubscribe: http://lists.okfn.org/mailman/options/ckan-dev
> >>
> >>
> >>
> >> --
> >> Toby Dacre
> >>
> >> The Open Knowledge Foundation
> >>
> >> Empowering through Open Knowledge
> >> http://okfn.org/  |  @okfn
> >>
>