[okfn-discuss] Help needed with visualization

Guo Xu digitalepourpre at gmail.com
Tue Jan 24 16:25:47 UTC 2012

Hi folks,

I have been working on visualizing the networks of academic publishing
in economics. Here's an example for the Quarterly Journal of Economics (QJE).

A link indicates that two economists have published together in the
QJE. The strength of a link is defined by how many times they have
published together.

The size of the node indicates how many times an author has published
in the QJE. Bigger nodes have published more often.
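
To make the encoding concrete, here is a minimal sketch of how node sizes and link strengths could be derived from a list of papers. The `papers` array, the `buildGraph` name and the placeholder author names are made up for illustration; this is not the actual data or code behind the visualization.

```javascript
// Hypothetical input: one entry per paper, each with its list of authors.
const papers = [
  { authors: ["A", "B"] },
  { authors: ["A", "B", "C"] },
  { authors: ["C"] },
];

// Build nodes (papers per author) and links (joint papers per author pair).
function buildGraph(papers) {
  const pubCount = new Map();  // author -> number of papers
  const pairCount = new Map(); // JSON-encoded [a, b] -> number of joint papers

  for (const { authors } of papers) {
    // De-duplicate and sort so each pair always gets the same key.
    const unique = [...new Set(authors)].sort();
    for (const a of unique) {
      pubCount.set(a, (pubCount.get(a) || 0) + 1);
    }
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const key = JSON.stringify([unique[i], unique[j]]);
        pairCount.set(key, (pairCount.get(key) || 0) + 1);
      }
    }
  }

  const nodes = [...pubCount].map(([name, count]) => ({ name, count }));
  const links = [...pairCount].map(([key, weight]) => {
    const [source, target] = JSON.parse(key);
    return { source, target, weight };
  });
  return { nodes, links };
}

const graph = buildGraph(papers);
console.log(graph.nodes, graph.links);
```

In a drawing step, `count` could then drive the node radius and `weight` the link thickness.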

Finally, the color indicates the ranking of the economist's alma
mater. Blue indicates that the author obtained his/her PhD from a top
10 university (according to
orange indicates a top 11-20 university; green is for top 21-30 and
red is for all universities beyond top 30.
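
The color rule can be written down as a small sketch. The function name `colorForRank` and the numeric `rank` argument are assumptions for illustration; the ranking itself is taken as given.

```javascript
// Hypothetical helper: map the rank of an author's PhD university to a color name.
// `rank` is assumed to be a positive integer, with 1 being the top university.
function colorForRank(rank) {
  if (rank <= 10) return "blue";   // top 10
  if (rank <= 20) return "orange"; // top 11-20
  if (rank <= 30) return "green";  // top 21-30
  return "red";                    // everything beyond the top 30
}

console.log(colorForRank(7), colorForRank(25), colorForRank(120));
```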

Couple of interesting points:

- It seems that the core (those at the centre) is made up almost entirely
of authors from top 10 universities. They tend to be well-connected.

- The hubs are: Philippe Aghion, Daron Acemoglu, Marianne Bertrand

- Authors from universities beyond the top 30 rarely get published in the QJE.

The visualization is done with D3, but it is very slow on older
computers. Does anyone have ideas for optimizing this?

Also, I have a lot more characteristics lying around that can be
displayed (e.g. gender - btw only 10% of the authors are female), but
I do not really know how to do it dynamically.

Finally, I would ideally like to do the same visualization for the
*entire* network of economists. I have a 300 MB dataset scraped from
RePEc that gives me information on co-authoring for virtually all
economics journals and working paper series. But obviously this will
be too slow to visualize, so it would be great if someone had
experience in working with such big datasets (the whole dataset has
~30,000 economists, which results in a 30,000 x 30,000 data matrix!!)
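
To put rough numbers on that, here is a back-of-the-envelope sketch. The 8 bytes per cell is an assumption (one double-precision number per entry), and the sample edge list and the `toAdjacency` name are made up. Storing only the pairs that actually co-authored is one common way around a dense matrix, though I have not tried it on the full dataset.

```javascript
// Size of a dense co-authorship matrix for n economists.
const n = 30000;
const cells = n * n;                    // 900,000,000 entries
const denseGiB = (cells * 8) / 2 ** 30; // roughly 6.7 GiB at 8 bytes per entry

// Sparse alternative: keep only the pairs that actually co-authored.
// Hypothetical edge list: [authorA, authorB, jointPapers].
const edges = [
  ["A", "B", 3],
  ["A", "C", 1],
];

// Adjacency map: author -> Map(coauthor -> number of joint papers).
function toAdjacency(edges) {
  const adj = new Map();
  const add = (a, b, w) => {
    if (!adj.has(a)) adj.set(a, new Map());
    adj.get(a).set(b, w);
  };
  for (const [a, b, w] of edges) {
    add(a, b, w);
    add(b, a, w); // co-authorship is symmetric
  }
  return adj;
}

const adjacency = toAdjacency(edges);
console.log(cells, denseGiB.toFixed(1), adjacency.get("A"));
```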

Anyway, let me know what you think and looking forward to suggestions!

