Saturday, January 19, 2013

CloudPAN

It is funny that my returning post would concern clouds, as I am actually writing this while above real clouds, over nine kilometers up. While attending the Orlando Perl Workshop (a.k.a. Perl Oasis), I was a selected speaker for a short talk about one of my modules: CloudPAN. I figured I would do a little bit of a write-up about CloudPAN, why I think it is useful, and how it could be even more useful.

But before getting to the point of talking about CloudPAN we need to take a slight detour and talk about the awesomeness that is MetaCPAN. If you've been living under a rock for the past couple of years, you might not know that MetaCPAN is quite possibly one of the best things to evolve out of the Perl world recently. MetaCPAN is important to the Perl community for a number of reasons, including: it provides a really simple API for interrogating the CPAN indices, and it provides an open source web platform for displaying CPAN information.

The project was born out of frustration at search.cpan.org. I don't have all of the history available to me at the moment, but it is sufficient to say that not everyone was happy with how search.cpan.org worked. Not only that, people couldn't provide patches or even fork it if they wanted to.

So how does MetaCPAN provide the magic that it does? It actually mostly relies upon ElasticSearch. ElasticSearch is a fantastic tool built on top of Lucene that provides all sorts of awesome scalability via clustering, multiple indexes, and basically takes Lucene and applies it in a tremendous way. In fact, the API that MetaCPAN provides is really just a thin wrapper around what ElasticSearch expects with a few bells and whistles that let us do things that enable modules like CloudPAN.

CloudPAN was really just a silly idea that came about during the QA Hackathon 2012 in Paris. David Golden wanted to enable the CPAN.pm client to talk to MetaCPAN in order to reduce its footprint. If you can directly query some external service about available distributions and modules, there is no need to download large gzipped files and build out local caches. But! There was a snag. The MetaCPAN::API module actually had a bunch of deps that weren't exactly core. So part of my time was spent implementing MetaCPAN::API::Tiny, which relied only upon HTTP::Tiny to talk to MetaCPAN. With HTTP::Tiny getting into core, that meant the CPAN.pm client could grow up without external deps.

And it was during that work that I noticed a rather curious method in the MetaCPAN API: file. This is how the source display works on the site, basically. The file API opens up the distribution, finds the file, and slurps it in for you. The gears started turning at that point. Could I really write a dumb INC hook that made a call to MetaCPAN for the source of modules that weren't installed locally? Thus, CloudPAN was born. I aptly demonstrated pulling in Moo, building classes, using those classes, etc., all without ever actually installing Moo.
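To make the INC hook idea concrete, here is a minimal sketch of the mechanism, not CloudPAN's actual code. A code reference in @INC receives the path that require is looking for and may hand back a filehandle to read the source from. To keep it runnable offline, the remote is faked: %fake_remote, fetch_source, and Acme::Greeter are all hypothetical names standing in for a real call to MetaCPAN.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the remote service. A real hook would ask
# MetaCPAN for the file; here a hash plays that role so the sketch runs offline.
my %fake_remote = (
    'Acme/Greeter.pm' => <<'SRC',
package Acme::Greeter;
use strict;
use warnings;
sub greet { return "hello, $_[1]" }
1;
SRC
);

# Takes a path in the form require uses ("Acme/Greeter.pm") and returns
# the module source as a string, or undef if the "remote" has no such file.
sub fetch_source {
    my ($path) = @_;
    return $fake_remote{$path};
}

# The hook goes at the end of @INC, so it is only consulted after every
# normal library directory has failed to provide the module.
push @INC, sub {
    my ($self, $path) = @_;
    my $source = fetch_source($path);
    return unless defined $source;    # decline; require moves on (and fails)

    # Hand perl an in-memory filehandle to compile the source from.
    open my $fh, '<', \$source or die "cannot open in-memory handle: $!";
    return $fh;
};

# require runs at run time, after the hook has been pushed onto @INC.
require Acme::Greeter;
print Acme::Greeter->greet('world'), "\n";
```

Note that a plain `use Acme::Greeter;` in the same file would not work here, because use happens at compile time, before the push onto @INC has run. Installing the hook from a module's import method (or a BEGIN block) gets around that.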

It took me some time to do a little cleanup and throw together some docs. After all, it was just a useless hack (the most useless at the QA Hackathon). But then I found myself using it on a regular basis when trying out modules to get a feel for how they worked and whether they provided the right solution to the problem I had. I figured other people would also like the ability to try out modules. So it ended up on CPAN.

The current version even has the option to persist what you've downloaded previously. When you use it again, it will load from that location. No need to download the entire dependency graph again for a particular module just because you accidentally CTRL-D'd your re.pl shell.
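The persistence idea can be sketched the same way. This is an illustration under assumptions rather than how CloudPAN stores things: the cache layout, load_with_cache, fetch_source, and Acme::Cached are all made-up names, and a temporary directory stands in for whatever stable location a real tool would use.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Assumption: a throwaway directory for the demo. A real tool would use a
# stable location so the cache survives between sessions.
my $cache_dir   = tempdir(CLEANUP => 1);
my $fetch_count = 0;

# Hypothetical remote lookup, faked so the sketch runs offline.
sub fetch_source {
    my ($path) = @_;
    $fetch_count++;
    return unless $path eq 'Acme/Cached.pm';
    return "package Acme::Cached;\nsub answer { 42 }\n1;\n";
}

# Returns a filehandle on the cached copy, fetching and saving it first
# if it is not on disk yet. Returns nothing if the module is unknown.
sub load_with_cache {
    my ($path) = @_;
    my $cached = File::Spec->catfile($cache_dir, split m{/}, $path);

    unless (-e $cached) {
        my $source = fetch_source($path);
        return unless defined $source;
        make_path(dirname($cached));
        open my $out, '>', $cached or die "cannot write $cached: $!";
        print {$out} $source;
        close $out or die "cannot close $cached: $!";
    }

    open my $fh, '<', $cached or die "cannot read $cached: $!";
    return $fh;
}

push @INC, sub {
    my ($self, $path) = @_;
    return load_with_cache($path);
};

require Acme::Cached;                            # fetched, then written to disk
my $again = load_with_cache('Acme/Cached.pm');   # served from the cache file
print "remote fetches: $fetch_count\n";
```

The second lookup never touches fetch_source, which is the whole point: once a module and its dependencies are on disk, restarting your shell costs nothing.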

I'd like to grow it up further and perhaps teach it to do other things, such as fetching from your own local MetaCPAN installation. And maybe even have it do authentication and verification of the remote. This could eventually end up as a way to distribute pure Perl deps for scripts on the first run.

But I'll save that for another day. Give CloudPAN a try if you find yourself wanting to evaluate modules, but don't want to clutter your local perl installation with modules you'll never use.
