[ckan-dev] Dealing with large datasets plus slow loading resource pages

Nigel Babu nigel.babu at okfn.org
Thu May 29 08:17:55 UTC 2014


Hi Phil,

I've filed an issue in the ideas repository so that this can be discussed for
the future roadmap, and I've linked to this email from it. Please feel free to
add more thoughts there.

https://github.com/ckan/ideas-and-roadmap/issues/61

On Fri, May 23, 2014 at 10:19:48AM +0100, Philip Cross wrote:
> As a matter of interest, we have implemented a suggestion made on the
> list for dealing with datasets that contain large numbers of resources,
> where the package pages were loading very slowly. We have created
> multiple packages per dataset: each package represents a folder in the
> separate datastore we are pointing at, and the packages are linked via
> child-parent relationships.
>
> The main issue we faced with this was the confusing number of results
> returned by the /datasets/ search, so we had to introduce a 'top
> level' metadata element on which searches are filtered by default. We
> also had to introduce a caching mechanism to store the generated tree
> structures you can see in the top level packages.
>
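For anyone following along, the tree caching described above might look
roughly like the sketch below. This is not the Bristol code, which I have not
seen; `build_tree`, `TreeCache` and the `load_relationships` callback are
hypothetical names, and the sketch assumes the child-parent links are available
as (child_id, parent_id) pairs with no cycles.

```python
import time


def build_tree(root_id, relationships):
    """Build a nested dict from (child_id, parent_id) pairs.

    Assumes the relationships are acyclic; a cycle would recurse forever.
    """
    children = {}
    for child, parent in relationships:
        children.setdefault(parent, []).append(child)

    def node(pkg_id):
        kids = sorted(children.get(pkg_id, []))
        return {"id": pkg_id, "children": [node(k) for k in kids]}

    return node(root_id)


class TreeCache:
    """Keep generated trees in memory for a fixed time-to-live."""

    def __init__(self, ttl_seconds=3600, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # root_id -> (expires_at, tree)

    def get(self, root_id, load_relationships):
        """Return the cached tree, rebuilding it if missing or expired."""
        now = self._clock()
        entry = self._entries.get(root_id)
        if entry is not None and entry[0] > now:
            return entry[1]
        tree = build_tree(root_id, load_relationships(root_id))
        self._entries[root_id] = (now + self._ttl, tree)
        return tree

    def invalidate(self, root_id):
        """Drop a cached tree, e.g. after a child package is edited."""
        self._entries.pop(root_id, None)
```

The loader is only called on a miss, so the expensive walk over child packages
happens at most once per TTL for each top level package.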
> The repository is still not public but can be seen at:
>
> http://databris-ui.ilrt.bris.ac.uk/
>
> A prime example of a large dataset is:
> http://databris-ui.ilrt.bris.ac.uk/dataset/13kidnrls4jnl1m806eyfd8h6z
>
> We still have issues with folders that contain too many resources such as:
> http://databris-ui.ilrt.bris.ac.uk/dataset/3dddbb60a6e97ec97b67af15e0ab36d9
>
> where the page is taking about 50 sec to load.
>
> There is an interesting further problem where the resource pages for
> these large packages are also loading very slowly, e.g.
>
> http://databris-ui.ilrt.bris.ac.uk/dataset/3dddbb60a6e97ec97b67af15e0ab36d9/resource/7d60d92c-e684-4619-87e7-16744772ea2a
>
> Is this because the package is being loaded first in the background,
> together with the metadata for all its other resources?
>
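I haven't profiled this, so the following is only a way to check the
hypothesis rather than an answer. The sketch times the `package_show` and
`resource_show` calls of the CKAN action API against each other; if
`package_show` for the parent package accounts for most of the delay, that
would point to the package load. The helper names `action_url` and `time_call`
are made up for illustration, and the site URL and IDs in the usage comment are
placeholders.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen


def action_url(site_url, action, **params):
    """Build a CKAN action API URL such as .../api/3/action/package_show?id=..."""
    base = site_url.rstrip("/")
    return "{}/api/3/action/{}?{}".format(base, action, urlencode(params))


def time_call(url, fetch=None):
    """Return (elapsed_seconds, response_size_in_bytes) for one request.

    `fetch` can be swapped out for testing; by default the URL is fetched
    over HTTP with urllib.
    """
    if fetch is None:
        fetch = lambda u: urlopen(u).read()
    start = time.perf_counter()
    body = fetch(url)
    return time.perf_counter() - start, len(body)


# Example usage (placeholders, not run here):
#   site = "http://ckan.example.org"
#   print(time_call(action_url(site, "package_show", id="<package-id>")))
#   print(time_call(action_url(site, "resource_show", id="<resource-id>")))
```

This measures only the API layer, so template rendering time on the HTML page
would still need to be looked at separately.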
> We are still using version 2.0 but I'm assuming there wouldn't be a
> speed improvement with 2.2.
>
> Our solution helps but is not ideal and I think the issue of large
> numbers of resources does still need addressing.
>
> Cheers,
> Phil
>
>
> ---------------------------------
> Phil Cross
> Senior Technical Researcher
> IT Services R&D/ILRT
> University of Bristol
> 8 - 10 Berkeley Square
> Bristol, BS8 1HH
> Tel: +44 (0)117 331 4391
> Fax: +44 (0)117 331 4396
> E-mail: phil.cross at bristol.ac.uk
> URL: http://www.bris.ac.uk/ilrt/people/person/philip-a-cross
> Skype: philip_cross
>
> Please note I work for Bristol University on Tuesdays, Thursdays and Fridays
> and I may not be able to respond to emails received on other days.
> -----------------------------------------------

--
Nigel Babu
Developer, Open Knowledge


