The Map of CPAN is really just meant to be a bit of fun. Search sites like search.cpan.org and the shiny new metacpan.org exist to help you find what you’re looking for. The Map of CPAN allows you to find things you weren’t looking for. Zoom in on things that catch your eye. Explore patterns. Follow links and dependency chains. Discover stuff. Surprise yourself. It’s about serendipity!
The Map of CPAN was inspired by xkcd (like about 50,000 other things). Randall Munroe’s 2006 Map of the Internet may look laughably dated now – it still shows unallocated IPv4 address space! But it was a fun concept and it planted the seed of the idea of creating maps out of virtually nothing.
Short story: Nothing.
Longer story: The map was created by taking the names of all the distributions of Perl modules on CPAN and arranging them in alphabetical order. This has the effect of grouping together distributions that share the same top-level namespace. The long list of names (a single dimension) was then mapped onto a two-dimensional plane using the Hilbert curve algorithm. The important property of the algorithm is that a sequence of consecutive items from the list will end up physically close to each other on the plane. Areas on the map are then assigned colours, similar to the way colours are used on political maps to distinguish the borders of adjacent countries.
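To make the layout step concrete, here is a minimal sketch in Python of placing a sorted list of names along a Hilbert curve. It is an illustration only: the real map used Perl and Math::PlanePath, and the function names `hilbert_d2xy` and `layout`, the plain `sorted()` ordering and the choice of grid size are all assumptions made for this example.

```python
def hilbert_d2xy(order, d):
    """Convert position d along a Hilbert curve into (x, y) on a
    grid with 2**order cells per side (standard iterative method)."""
    x = y = 0
    t = d
    s = 1
    side = 1 << order
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate/flip the quadrant so the sub-curves join up.
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def layout(dist_names):
    """Sort names alphabetically and give each one a grid cell.
    Hypothetical helper: returns {name: (x, y)}."""
    names = sorted(dist_names)
    # Assumption: use the smallest grid that can hold every name.
    order = 0
    while 4 ** order < len(names):
        order += 1
    return {name: hilbert_d2xy(order, i) for i, name in enumerate(names)}
```

Because consecutive positions on the curve always land in adjacent cells, names that sort next to each other (and so usually share a namespace) end up as neighbours on the plane.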
To reduce the visual clutter, only the larger namespaces are coloured. Namespaces containing 30 or more distributions are said to have reached critical mass and are assigned a colour which differs from the colours of all the immediate neighbours. Namespaces with fewer distributions are submerged below the light blue primordial soup. Over time as new modules are uploaded, new areas will rise; continental drift will move and reshape the existing areas; and existing areas will change colour.
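The colouring rule can be sketched in the same spirit. The threshold of 30 comes from the description above; everything else here is an assumption for illustration: the text does not say how the real map picks colours, so this example swaps in a simple greedy scheme, and the names `top_level`, `big_namespaces` and `assign_colours` are hypothetical.

```python
from collections import Counter

CRITICAL_MASS = 30  # threshold stated in the description above


def top_level(name):
    """Top-level namespace of a colon-separated name."""
    return name.split("::")[0]


def big_namespaces(dist_names, threshold=CRITICAL_MASS):
    """Namespaces with at least `threshold` distributions."""
    counts = Counter(top_level(n) for n in dist_names)
    return {ns for ns, c in counts.items() if c >= threshold}


def assign_colours(cells, coloured):
    """Greedy colouring (an assumed technique, not the map's own).
    cells: {(x, y): namespace}; coloured: namespaces to colour.
    Returns {namespace: colour index}, where namespaces sharing
    an edge never get the same index."""
    neighbours = {ns: set() for ns in coloured}
    for (x, y), ns in cells.items():
        if ns not in coloured:
            continue
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            other = cells.get((x + dx, y + dy))
            if other in coloured and other != ns:
                neighbours[ns].add(other)
    colours = {}
    for ns in sorted(coloured):
        used = {colours[o] for o in neighbours[ns] if o in colours}
        c = 0
        while c in used:
            c += 1  # first index not taken by a neighbour
        colours[ns] = c
    return colours
```

Any namespace left out of `coloured` would simply be drawn in the light blue background colour.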
Each small square on the map represents one distribution that is available for download from CPAN. A distribution is a TAR or ZIP file containing one or more Perl modules and associated metadata. Each distribution was uploaded by a maintainer – that might be the original author of the modules or it might be someone else who has taken over the job of maintenance.
When you move your mouse over the map, the name of the distribution* and its maintainer will be displayed at the top of the page. You can ...
* Perl modules use double-colons in their names like ‘Algorithm::Knapsack’, whereas distributions usually use hyphens like ‘Algorithm-Knapsack’. The map application uses colons everywhere even though it’s mapping distributions. You probably wouldn’t find that confusing at all if it wasn’t for this note.
The source code for the Map of CPAN is available on GitHub, so you can run it locally, add cool features and contribute back to the project.
Thanks to Victor Wang and Jiaheng Wang – students at the Catalyst 2012 Open Source Academy.
Mapping to the Hilbert curve was done using Kevin Ryde’s amazingly comprehensive Math::PlanePath – thanks Kevin!
The map was rendered as a PNG using Lincoln Stein’s GD module – thanks Lincoln!
All the cool data look-ups are made possible by the awesome metacpan.org site, which is a fairly young site but has delivered an amazing service in a very short time. Thanks to Olaf Alders and the other contributors, many of whom hang out on #metacpan on irc.perl.org and have helped me with API queries.
For an alternative visualisation of CPAN and its inter-dependencies, check out the CPAN Explorer.