[open-economics] Help needed with visualization

Thu Jan 26 22:41:42 UTC 2012

On 24 January 2012 16:25, Guo Xu <digitalepourpre at gmail.com> wrote:
> Hi folks,
>
> I have been working on visualizing the networks of academic publishing
> in economics. Here's an example for the Quarterly Journal of
> Economics:
>
> http://www.guoxu.org/econmap/map.html

Very cool.

> A link indicates that two economists have published together in the
> QJE. The strength of a link is defined by how many times they have
> published together.
>
> The size of the node indicates how many times an author has published
> in the QJE. Bigger nodes have published more often.
>
> Finally, the color indicates the ranking of the economist's alma
> mater. Blue indicates that the author obtained his/her PhD from a top
> 10 university (according to
> http://www.topuniversities.com/university-rankings/world-university-rankings/2011/subject-rankings/social-sciences/economics);
> orange indicates a top 11-20 university; green is for top 21-30 and
> red is for all universities beyond top 30.

Could you post the underlying dataset (or at least info about it) on
the DataHub?

> Couple of interesting points:
>
> - It seems that the core (those at the centre) is almost entirely made
> up of top 10 authors. They tend to be well-connected.
>
> - The hubs are: Philippe Aghion, Daron Acemoglu, Marianne Bertrand
>
> - Authors from beyond the top 30 rarely get published in the QJE.
>
> The visualization is done with D3. But it is very slow on older
> computers. Does anyone have ideas for optimizing this?

There isn't much you can optimize, because the graph layout is done
by D3 itself. The question is:

Do you need a dynamic, force-directed JS visualization? If not, you
can move to a static one (it will be faster to render and you can do
more complex stuff). I'd suggest using NetworkX (it's what I used for
visualizing patent and paper networks in the past). Alternatively,
Gephi is worth a look.
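
To make the static route concrete, here is a rough, dependency-free
sketch of the idea: compute the force-directed layout once, offline,
then write out fixed positions as a plain SVG so the browser only has
to draw. In practice NetworkX or Gephi would do the layout for you;
the helper names (force_layout, to_svg), the toy authors A-D and all
the numbers below are made up for illustration.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, seed=0):
    """Tiny Fruchterman-Reingold-style layout in the unit square.

    Hypothetical helper for illustration; O(n^2) per iteration.
    """
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = math.sqrt(1.0 / len(nodes))  # ideal distance between nodes
    temp = 0.1                       # maximum step size, cooled each round
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                f = k * k / dist
                disp[a][0] += dx / dist * f
                disp[a][1] += dy / dist * f
                disp[b][0] -= dx / dist * f
                disp[b][1] -= dy / dist * f
        # Attraction along edges, scaled by the co-authorship weight.
        for a, b, w in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            f = w * dist * dist / k
            disp[a][0] -= dx / dist * f
            disp[a][1] -= dy / dist * f
            disp[b][0] += dx / dist * f
            disp[b][1] += dy / dist * f
        # Move each node, capped by the temperature, and stay in [0, 1].
        for n in nodes:
            dx, dy = disp[n]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temp)
            pos[n][0] = min(1.0, max(0.0, pos[n][0] + dx / d * step))
            pos[n][1] = min(1.0, max(0.0, pos[n][1] + dy / d * step))
        temp *= 0.98
    return pos


def to_svg(pos, edges, sizes, canvas=600):
    """Render fixed positions as a static SVG string (no JavaScript)."""
    parts = [f'<svg xmlns="http://www.w3.org/2000/svg" '
             f'width="{canvas}" height="{canvas}">']
    for a, b, w in edges:
        (x1, y1), (x2, y2) = pos[a], pos[b]
        parts.append(f'<line x1="{x1 * canvas:.1f}" y1="{y1 * canvas:.1f}" '
                     f'x2="{x2 * canvas:.1f}" y2="{y2 * canvas:.1f}" '
                     f'stroke="grey" stroke-width="{w}"/>')
    for n, (x, y) in pos.items():
        # Node radius grows with the author's publication count.
        parts.append(f'<circle cx="{x * canvas:.1f}" cy="{y * canvas:.1f}" '
                     f'r="{3 + 2 * sizes[n]}"/>')
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up toy data: (author, author, number of joint papers).
nodes = ["A", "B", "C", "D"]
edges = [("A", "B", 2), ("B", "C", 1), ("C", "D", 1)]
sizes = {"A": 2, "B": 3, "C": 2, "D": 1}  # papers per author

pos = force_layout(nodes, edges)
svg = to_svg(pos, edges, sizes)
```

Writing svg to a file gives an image that renders instantly however old
the machine is, since all the layout work happened beforehand.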

> Also, I have a lot more characteristics lying around that can be
> displayed (e.g. gender - btw only 10% of the authors are female), but
> I do not really know how to do it dynamically.
>
> Finally, I would ideally like to do the same visualization for the
> *entire* network of economists. I have a 300 MB dataset scraped from
> RePEc that gives me information on co-authoring for virtually all
> economics journals and working paper series. But obviously this will
> be too slow to visualize, so it would be great if someone had
> experience in working with such big datasets (the whole dataset has
> ~30,000 economists, which results in a 30,000 x 30,000 data matrix!!)

I actually have fairly extensive experience with this (though it is
now two years out of date):

<http://rufuspollock.org/2009/10/15/exploring-patterns-of-knowledge-production-2/>
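
On the size problem: a full author-by-author matrix for ~30,000
economists has 900 million cells (roughly 7 GB if each cell took 8
bytes, which is an assumption), but almost all of them will be zero
because most pairs have never written together. Storing only the pairs
that actually occur avoids that. A minimal sketch, assuming the scraped
data can be turned into one list of author IDs per paper (the IDs and
the coauthor_edges name below are hypothetical):

```python
from collections import Counter
from itertools import combinations


def coauthor_edges(papers):
    """Count how often each pair of authors appears on the same paper."""
    weights = Counter()
    for authors in papers:
        # sorted(set(...)) gives each pair one canonical (a, b) key.
        for a, b in combinations(sorted(set(authors)), 2):
            weights[(a, b)] += 1
    return weights


# Made-up toy input: one list of author IDs per paper.
papers = [["a1", "a2"], ["a1", "a2", "a3"], ["a4"]]

edges = coauthor_edges(papers)                      # link strength
pubs = Counter(a for p in papers for a in set(p))   # node size

n_authors = len(pubs)
dense_cells = n_authors * n_authors   # size of the full matrix
stored_pairs = len(edges)             # what the sparse form keeps
```

The edge counts give the link strength and pubs gives the node size, so
the same structure can feed whichever layout tool you end up using.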

> Anyway, let me know what you think and looking forward to suggestions!

Perhaps we can catch up at the Economics Hackday on Saturday [1].

Rufus

[1]: http://blog.okfn.org/2012/01/18/open-economics-hack-day-saturday-january-28th-2012/