Tuesday, June 9, 2009

Automagically Generating a Guide

fREW suggests some clever automatic ways to generate a guide to CPAN. Certainly if it can be done well, that is preferable to the inevitable bickering that would result from a non-objective guide!

I did have one additional notion that might help, though I can't decide if it is crazy or not. What about some sort of opt-in anonymous module usage monitoring, à la Last.fm's scrobbling? It could help figure out which modules are in common use -- imagine doing a CPAN search sorted by usage. It could be another handy stat feeding into an automatically generated guide. It would also allow easy searches for orphaned modules that are in heavy use.

Obviously there are privacy objections, potential speed issues, worries about gaming the system, etc.


  1. Simply aggregate the Apache logs from the various CPAN mirrors and evaluate them. You'll get a nice picture of the usage of all modules.

  2. I don't think that would be nearly as effective -- it doesn't provide any distinction between the modules you download once to try and then reject, and those that are an essential part of your daily toolkit.

    I mean, I must have downloaded at least half of the MP3 tag tools on CPAN at one time or another, but all but one of them were used at most once, and that one I use very infrequently. A count of how many weeks I run code using each module would be much more meaningful than how many times I downloaded them.

  3. Doing something along the lines of popularity-contest from Debian might give us the best results, as that kind of code measures usage based on how often you actually use the module, not just on whether you have it installed.
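
To make the "weeks in use" idea from the comments concrete, here is a minimal sketch in Python. It assumes some opt-in reporter has already produced a list of (module name, date) pairs, one per run that loaded the module. That event format, the function names, and the sample data are all hypothetical; no such reporting service exists as far as this post is concerned.

```python
from collections import defaultdict
from datetime import date


def weeks_in_use(events):
    """Count the distinct ISO weeks in which each module was used.

    `events` is an iterable of (module_name, date) pairs. This input
    format is an assumption made for illustration only.
    """
    weeks = defaultdict(set)
    for module, day in events:
        iso = day.isocalendar()
        weeks[module].add((iso[0], iso[1]))  # (ISO year, ISO week number)
    return {module: len(seen) for module, seen in weeks.items()}


def rank_by_usage(events):
    """Return (module, weeks) pairs, most-used first, ties broken by name."""
    counts = weeks_in_use(events)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data: one module tried twice on a single day,
# another used across two different weeks.
sample_events = [
    ("MP3::Info", date(2009, 6, 3)),
    ("MP3::Info", date(2009, 6, 3)),
    ("MP3::Tag", date(2009, 6, 1)),
    ("MP3::Tag", date(2009, 6, 9)),
    ("MP3::Tag", date(2009, 6, 10)),
]

if __name__ == "__main__":
    for module, n_weeks in rank_by_usage(sample_events):
        print(f"{module}: used in {n_weeks} week(s)")
```

Counting distinct weeks rather than raw events means a module that is run a hundred times in one afternoon and then abandoned scores no higher than one tried once, which is the distinction the download-log approach cannot make.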