Last of the Careless Men: August 2009

Sunday, August 30, 2009

Num and +: Roles not always the best approach?

An interesting subject came up on #perl6 this week: .Num and prefix + do not do the same thing. The difference is that .Num always returns a Num or an error. + is much more fuzzily defined (so far as I know): + returns something that you can do arithmetical operations on.

This may sound like splitting hairs, but when you realize that Num is supposed to represent real numbers, you can see it captures an important distinction. For instance, if $z is a Complex, then $z.Num is an error (not even implemented!). On the other hand +$z is a perfectly good Complex number.

This ties into something I mentioned back in the middle of the Vector posts. As an old school C++ programmer, I'm used to thinking of class inheritance as the natural way to write routines which can work on data types you know nothing about, and templates as a tricky alternative which, if you can make it work, significantly reduces the couplings between your classes and your users'. As I understand it, in Perl 6 code the former would be commonly implemented as a role. But crazily enough, the equivalent of the template option is what you get when you just program normally in Perl 6!

Consider a generalized dot product function which takes two arrays of equal length, pairwise multiplies their contents and sums the result. I've been programming C++ for twenty years now, yet the thought of trying to write a properly general version of that in C++ template form stumps me. How the heck can you easily specify that the return value is the type you get when you do the sum? I'm sure there's a way of working it out, but it's decidedly non-trivial. Whereas in Perl 6 it really is trivial:
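A minimal sketch of such a function, written in present-day Raku (the current name of Perl 6) rather than 2009-era Rakudo. The sub name `dot-product` and the explicit length check are illustrative assumptions, not taken from the Vector code:

```raku
# Hypothetical helper: sum of the pairwise products of two equal-length lists.
sub dot-product(@a, @b) {
    die "Arrays must have the same length" unless @a.elems == @b.elems;
    [+] (@a Z* @b);    # Z* multiplies pairwise, [+] sums the products
}

say dot-product([1, 2, 3], [4, 5, 6]);    # 32
say dot-product([1.5, 2+3i], [2, 4]);     # mixed Rat, Complex and Int inputs
```

No parameter or return types are declared, so the result type is simply whatever the multiplications and additions produce.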

In fact, in Perl 6 you have to do more to restrict what types the function works on! By default it will work on any combination of things that can be multiplied and added. (In fact, it doesn't have to be consistent types: if one array is mixed complex numbers and reals, and the other ints, rationals, and number strings, Perl 6 will automatically sort it all out and do the right thing.) (At least in theory -- your mileage may vary in current builds of Rakudo.)

It seems to me that as Perl 6 programmers, our default position should be to be as general as reasonably possible. Certainly if you are writing math functions, unless you know a variable has to be a real for the math to work, you should not declare it to be a Num, nor should you use the .Num method to make sure it is something you can do math on. For the former, Any or no declared type is better; for the latter, prefer prefix +. Specifying parameter types is mostly useful for controlling overloading operators and functions.

By the same token, those of us implementing numeric classes need to provide as standardized an interface as possible. This includes not only the obvious operators, but prefix + as well -- overloading it is basically announcing to the world "I can do math!" Likewise, you should probably overload the Num method if your type can sensibly be converted to a real. (Mind you, I have no idea how to do that at the moment -- Num is definitely under-spec'ed.)

Now that I've written this, I guess I should go look at implementing these for Vector, eh?

UPDATE: Apparently prefix + cannot be overloaded at the moment, and just calls Num internally. Drat.

WARNING: As far as I know, Num and + are not spec'ed anywhere yet. So this is mostly my interpretation of what I have seen in practice and heard discussed on #perl6. It is both subject to me being completely wrong and being changed in the spec.

Friday, August 28, 2009

Vector: Well, I thought I was done....

Yesterday, pmichaud++ fixed the operator overloading bug. I can confirm that using this morning's Rakudo build, V+, V-, V*, and V/ can be changed to the more expected +, -, *, and /. I haven't pushed the change to github yet -- I'm not sure what the right policy is in dealing with forcing any potential Vector users to be on the bleeding edge.

Though as nearly as I can tell, today's Rakudo works perfectly on my MacBook -- actually passes all the spectests, unlike the last official release. The only downside is one of the other Vector tests now fails. It's not exactly a crucial test, but it is a puzzling failure to me. I'm trying now to find out whether the test is wrong or something has broken in Rakudo.

The other thing I learned about was class XXXX is also, which allows you to add additional methods to a class. I've used this to resolve Vector's minor DRY issue: now the definition of the dot product operators comes before the definition of Vector.Length, allowing Vector.Length to use dot product internally.

Wednesday, August 26, 2009

Vector: Testing

Our testing logic comes via the Perl 6 Test.pm module. This comes distributed with Rakudo, but initially I cloned it to Vector just to simplify setting up the tests. The magic of Configure.pm handles the path to Rakudo's Test.pm properly, so I've deleted the Vector's local version.

The top of the file sets everything up:

etc. I'm sure there's a better way to define is_approx_vector, but this version works okay. (Actually, if - worked properly for Vector, and we defined abs to be a synonym for Length, I believe Test.pm's is_approx would compare two Vectors quite nicely.) We use plan * to declare we have no plan -- that means instead of specifying the number of tests to run, we have to use done_testing at the end of the tests. Then it defines a bunch of Vectors to use in the tests, and we are off and running.

I only use a tiny subset of the functions available in Test.pm. isa_ok lets me test if the type of the supposed Vector objects is actually Vector. is checks its first two parameters for string equality. Absolute equality is a bad idea with floating point numbers, so is_approx checks that the absolute difference between two numbers is less than a small epsilon. My is_approx_vector does the same sort of thing for two Vectors.

dies_ok takes a closure and tests that it dies as expected. For instance, I use this to test that taking the dot product of Vectors of different dimensions fails instead of generating a meaningless result. lives_ok does the opposite, testing that code doesn't die without specifying what it actually does.
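A rough sketch of these test functions in use, assuming the Test module that ships with present-day Raku, where the names are hyphenated (is-approx, dies-ok and so on) rather than underscored as in 2009. The values being tested are made up for illustration and are not from the Vector test suite:

```raku
use Test;

# No plan is declared up front, so done-testing is called at the end.
isa-ok [1, 2, 3], Array, "an array literal is an Array";
is "abc".uc, "ABC", "is compares its first two arguments as strings";
is-approx sqrt(2) ** 2, 2, "floating-point result is close to 2";
dies-ok  { die "mismatched dimensions" }, "the closure dies";
lives-ok { my $x = 1 + 1 }, "the closure runs without dying";

done-testing;
```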

Put them together and it's possible to do a decent set of tests. In some cases (like 7D cross product) I had no idea what the correct answer of the operation is (other than just duplicating the formula for it again, but that seems pretty useless), but I did test that the results have the expected properties. Overall at the moment my test suite runs 256 tests, which is a lovely number IMO.

And that's pretty much it for Vector for now. I'm going to tackle doing a few simple projects using it and see what I learn from them. And I promise to revisit Vector when changes to Rakudo make it work better.

Tuesday, August 25, 2009

Blushing

I see they're chatting about this blog on #perl6 at the moment. I am in fact a "#perl6 inhabitant" these days, but because this blog is supposed to be anonymous and I haven't figured out how to do anonymous IRC, I'm using a different name there. I'd rather not say more, but definitely the #perl6 gang has been a huge help figuring out how to do this stuff.

Monday, August 24, 2009

Vector: Proper file layout and makefile

In my last post I was worried about how to set up a Configure.pl for Vector. (See the comments of that post.) However, I have been able to piece together how to do it by raiding masak's grampa repo for the appropriate files.

Here's what I did:

1) Moved Vector.pm to a new lib/ directory.
2) Added grampa/lib/Configure.pm to lib/.
3) Added grampa/Configure and grampa/Makefile.in to Vector's root directory.
4) Edited Makefile.in to look for Vector.pm as the source rather than grampa.pm.

Then it's just a matter of the usual:
~/tools/rakudo/perl6 Configure
make
make test

And everything works!

Configure is smart enough to use perl6 to invoke Rakudo if your Perl 6 is set up that way. Mine's not, thus explicitly invoking it in the above. You may have to set the executable bit as well; I'm not sure how to do that in git.

A big hearty round of applause for masak++ and mberends++!

Update: I just realized I got the first step wrong. ./Configure won't do it for me (even now with the executable bit set in git); it needs to be

PERL6LIB=$PERL6LIB:lib ./Configure

so that the Configure.pm library gets picked up properly. Sorry for any difficulties this may have caused people.

Sunday, August 23, 2009

Vector: "Joining the Perl 6 Ecosystem"

After my comment yesterday about setting up a webpage pointing to the active projects coded in Perl 6 out there, the #perl6 channel reminded me about masak's proto. It is an attempt to create a simple CPAN-like tool for Perl 6. I had read about it months ago, run into some sort of difficulty using it, and just forgotten about it. But work has progressed, and masak and a few others are actively working on making it more powerful.

I've decided to try to get Vector on there, and document what I'm doing as I do it. Not that I think Vector is particularly brilliant, or that it should go in the Perl 6 CPAN when there is such a thing. But I hope it is at least both a pretty good example of how to do this sort of thing in Perl 6, and potentially a useful tool someone else could build on.

So I'm looking in the proto PIONEER file. It lists four conventions that need to be followed, the first of which is creating a deps.proto file. Vector doesn't have any dependencies; I'm not sure if having an empty file or no file is a better way of indicating that. I'm guessing an empty file, as that suggests that I have at least considered it. (Or better yet, with a comment indicating there are no dependencies?)

Next is building. Vector doesn't need a build stage, and PIONEER indicates that if there is no Makefile.PL or Configure.pl file, it just assumes the build worked, which sounds perfect for my purposes.

Step three is running tests. If there is a makefile, proto will make test. If there isn't, it will try to run prove recursively on the t/ directory. Assuming it's smart enough to run prove with the system's working Rakudo, this should work just fine with what we already have. Errr, assuming the LIB paths are set up properly.

Which is the last issue, I guess. I've just been running with Vector.pm and Test.pm in the top-level Vector directory, no need for a PERL6LIB environment variable. Will prove test figure out the paths automatically? And if I switch over to that system, should it be lib/Vector.pm or lib/Math/Vector.pm or something like that?

I think my next step is to check in what I've got now, make this post, e-mail a link to it to masak to get his comments, and head off to the pub. I will report on what happens later.

Saturday, August 22, 2009

Random Perl 6 Thoughts

A number of interesting thoughts flitted through my head when I was out walking this morning. (At least, I hope people find them interesting.) I jotted one line down for each when I got home, and now I'm trying to reconstruct them in full.

There are two major components I'd still like to get into Vector. Yesterday I started trying to figure out how to use postcircumfix to allow users to treat a Vector as if it is an array of coordinates. That is to say, instead of $vector.coordinates[1], you'd say $vector[1]. This is definitely supposed to be possible in Perl 6, but I was utterly unable to get it to work in Rakudo. Of course, I don't know that I was using the proper syntax. I definitely know I didn't understand the examples in the spec. (In my defense, there are no examples of how you'd use this feature in practice there.)

Second, I'm still frustrated by the Rakudobug that prevents you from overloading an operator using code that depends on another version of that operator. My solution of just choosing some Unicode character that gave the impression of the standard ASCII character I really wanted to use was enough to get me going, but it offends my sense of elegance. I've decided to prefix the proper ASCII operator with "V" (for Vector) instead for now, which strikes me as slightly better, but still falling short of proper Perl 6. (If only I had another day each week so I could spend some time hacking Rakudo itself, going after this bug would be my first priority...)

Moving on to the bigger picture, I got to wondering if it might be worth having a web page that points to the github repos of the various Perl 6 coding projects going on. As much as anything this was prompted by seeing Moritz's cool post on visualizing match objects and not knowing where I could grab his SVG::MatchDumper code. I know it's probably too soon to have a Perl 6 CPAN, but it's never too soon to make it easy for people to find cool Perl 6 code.

That got me thinking about Masak's Druid project, and how cool it would be to write AI players for that. I've always wanted to try my hand at that sort of thing, but except for some brief and fun attempts in college I've never actually gotten around to it. Of course, not having any clue of workable Druid strategies would make this trickier.

Which made me think how fun it would be to have a Risk game in Perl 6, with hot-pluggable AI players so people could pit their code against others' code in battle. I'm presuming there's some way to eval a class from a string, passing the result into a harness which controls that class's interaction with the main game structure....

Thursday, August 20, 2009

Vector: Philosophy

I thought I'd go over some of my design decisions for this project. The first, and most obvious, is that I decided to forgo efficiency in favor of scope. I'm not one of those people who say efficiency is not important; in my experience, when doing serious geometric work, code can never be too fast. But by the same token, it will be years before it might even make sense to think about using Perl 6 for serious numerical work. So it makes good sense here to shoot for versatility rather than speed. If for some reason you think it makes sense to have a 13-dimensional vector of complex numbers, Vector can do it.

I've also chosen to let Perl 6 generate errors like mismatched dimensions rather than catching them directly in code. For instance, I've only defined the cross product for 3D and 7D Vectors. If you try to call cross product on vectors of other dimensions, you will get a "No applicable candidates found to dispatch to" error message. Now that I think about this, I'm inclined to think this is the wrong answer. I could provide more awesome error messages without a whole lot more work. I should probably figure out the right way to do that.

There are two closely related classes that frequently go with a Vector class: UnitVector and Point. In Perl 6, UnitVector is a simple matter of saying subset UnitVector of Vector where { (1 - 1e-10) < $^v.Length < (1 + 1e-10) };. (I discussed the implications of doing it this way in my previous post.)
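As a rough sketch of how that subset behaves, here is a stripped-down stand-in for the Vector class in present-day Raku syntax. The class body is an assumption for illustration; only the subset line follows the text above:

```raku
# Simplified stand-in for the real Vector class (an assumption for this sketch).
class Vector {
    has @.coordinates;
    # Euclidean length: square root of the sum of the squared coordinates.
    method Length { sqrt([+] (@!coordinates »*« @!coordinates)) }
}

# A UnitVector is any Vector whose length is 1 to within a small tolerance.
subset UnitVector of Vector
    where { (1 - 1e-10) < $^v.Length < (1 + 1e-10) };

say Vector.new(coordinates => [0, 1, 0]) ~~ UnitVector;    # True
say Vector.new(coordinates => [0, 2, 0]) ~~ UnitVector;    # False
```

The subset adds no new storage or methods; it is only a constraint checked against ordinary Vector objects.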

Point is a little more tricky. An N-dimensional point looks pretty much exactly like an N-dimensional vector. The only practical difference I'm aware of is how they handle being transformed. With a point, you factor in the change to the origin of the change of basis matrix. With a vector, you ignore the change to the origin. (There are other type distinctions, like a point subtracted from another point should be a vector, and it doesn't make sense to add two points. But in practice, these distinctions cause problems without any corresponding pluses that I can see.)

Most of the geometry libraries I've worked with ignore this distinction in the type system entirely, just making points and vectors two different names for the same class. My current inclination is to ignore points altogether, making everything a Vector. I think I will have to try implementing some actual code using Vector to see how this works out.

Another issue I'm pondering is Length and LengthSquared. I actually have multiple issues with these functions as implemented, and I haven't worked out what the proper solution is. As currently structured, they are in the wrong spot in the file. Both are most cleanly implemented in terms of dot product. But they are defined before dot product, because they are Vector class members while dot product is an external operator that works on the Vector public interface.

The obvious answer to this is to make Length and LengthSquared regular subs that take Vectors. But that brings on really weird conundrums of object-orientation. The only reason to have a LengthSquared function is that for positive x & y, x < y iff x*x < y*y, so you can save two square root calculations if all you're interested in is the relative lengths of two vectors. That might make sense if you've got an efficient low-level C++ implementation of a 2D vector. It's kind of silly if you're thinking of a more high-level implementation of, say, a 9D vector.

But worse, it actually amounts to exposing the underlying representation of the Vector! It is "an optimization" because calculating the length involves a square root call. But many sensible alternate representations of Vectors (like polar coordinates) will actually store the length directly -- in which case LengthSquared requires more calculations than Length, rather than fewer!

So, those arguments have convinced me to do away with the LengthSquared function altogether. (Not yet reflected in code as I type this.) On the other hand, they suggest to me that Length really should be a Vector member function, because while for this implementation of Vector it can efficiently be a non-member function, for many other implementations that would represent a serious pessimization. (I was going to say that on the other hand Unitize should be external, but there are cases when that would be a pessimization as well. Hmmm.)

As you've probably guessed by now, writing about these things forces me to think about them more deeply!

General: I've updated to the PDX release of Rakudo this morning, and Vector still passes all 204 tests! Though that number is likely to go down when I delete LengthSquared, because I'll need to remove some tests in the process.

Tuesday, August 18, 2009

Vector: Perl 6 is full of awesome

I've made a bunch of small changes since my last post, all of them delightful. First up, I finally switched has $.coordinates to has @.coordinates. This required switching all the usages of coordinates in the class definition. But much to my surprise, it did not require changing any of the usages outside the definition.

That's because the typical usage was something like $a.coordinates. What I realized this morning was that this is not a reference to the $.coordinates class member. Rather, it is a call to an automatically generated .coordinates method, which returns the $.coordinates member. That means switching to @.coordinates just means that .coordinates returns an array rather than a scalar reference to an array. No change at all is needed in how the method is used.
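A small sketch of that accessor behavior in present-day Raku, using a cut-down Vector class that is an assumption for illustration:

```raku
class Vector {
    has @.coordinates;    # the dot also generates a .coordinates method
}

my $a = Vector.new(coordinates => [1, 2, 3]);

# These are method calls, not direct access to the attribute.
say $a.coordinates[1];       # 2
say $a.coordinates.elems;    # 3
```

Callers only ever see the method, which is why changing the sigil of the attribute inside the class does not disturb them.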

I realize that all the experienced Perl 6 hands just assumed I knew that when I wrote $a.coordinates. But I didn't, and it is a pleasant surprise.

My next discovery plays off of something I worked out days ago, but hadn't realized all the implications of until this morning. If you define operator +, Perl 6 automatically generates operator += for you. Now, presumably it just internally translates $a += $b to $a = $a + $b. This means the behavior of += is very different in C++ and Perl 6.

In C++, if you say a += b, a is still the same object before and after the operation; it just has a different value afterward. In Perl 6, $a actually is a different object after $a += $b. It doesn't even have to have the same type as it did before. Consider these two tests I added this morning:

In the first, $a starts out a Vector and ends up a Num. That's why the second code dies -- $a is declared a Vector.

That, in turn, means that you can declare the internals of a Vector "ro" and still use += on Vector variables. Each Vector is immutable, but a new Vector is created and assigned to the left-hand side.
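The behavior can be sketched as follows in present-day Raku. This is not the pair of tests mentioned above; the cut-down Vector class and the Int example are assumptions chosen to keep the sketch self-contained:

```raku
class Vector {
    has @.coordinates;
}

# Only + is defined; += comes for free as "$a = $a + $b".
multi sub infix:<+>(Vector $a, Vector $b) {
    Vector.new(coordinates => ($a.coordinates »+« $b.coordinates));
}

my $v   = Vector.new(coordinates => [1, 2, 3]);
my $old = $v;                 # keep a handle on the original object
$v += Vector.new(coordinates => [10, 20, 30]);

say $v.coordinates;           # [11 22 33]
say $old.coordinates;         # [1 2 3]   (the original object is untouched)
say $v === $old;              # False     ($v now holds a different object)

# A typed variable refuses a result of the wrong type.
my Int $n = 3;
try { $n += 0.5 };            # 3.5 is a Rat, not an Int, so the assignment dies
say $n;                       # 3
```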

This is great for obeying the LSP. For instance, if I define a UnitVector to be a Vector where { $^v.Length == 1 }, then I can pass a UnitVector to a function written only knowing about the Vector class and it will automatically do the right thing. This is very different from C++, where passing a UnitVector to an algorithm expecting Vector can muddle up the type system.

The += operator itself provides a prime example of this. The += operator I've already (implicitly!) defined for Vector will work perfectly if passed a UnitVector; it will simply convert the first argument from UnitVector to Vector and get on with life. (I suppose if you have explicitly declared the first argument to be of type UnitVector it will signal an error.) In C++, if you didn't overload += especially for UnitVectors it would break the type invariant for the UnitVector.

One final bit of Perl 6 operator magic:

This defines a new parenthetical operator to calculate the length of a Vector, mimicking the standard mathematical notation. Not much more to say about that than "Awesome!" (Well, I haven't figured out a Texas equivalent yet -- but I've not worried because there is also the Length method.)
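A hedged sketch of such a circumfix operator in present-day Raku. The bracket characters and the cut-down Vector class are assumptions for illustration:

```raku
class Vector {
    has @.coordinates;
    method Length { sqrt([+] (@!coordinates »*« @!coordinates)) }
}

# The bracket pair is an assumption; the operator just forwards to Length.
multi sub circumfix:<⎡ ⎤>(Vector $v) { $v.Length }

my $v = Vector.new(coordinates => [3, 4]);
say ⎡$v⎤;    # 5
```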

Also, I've figured out how to easily handle the Unicode dot and cross operators in TextMate. I programmed tab triggers for each (in the Perl environment) so that if you type, say, "dot" and hit tab, it substitutes the Unicode symbol for the word. It is the best of both worlds: easy to type and still pretty.

Monday, August 17, 2009

Vector: Str and perl

Str and perl are two methods that took some poking around before I figured out that I needed them and how to define them, but they are perfectly simple once defined. I started by simply looking for how to overload say for Vector. I'd poked around for this a bit when I realized it was a perfectly stupid idea. I mean, that might be great for saying a single Vector, but it would be terrible for something like say "$c is the cross product of $a and $b".

Clearly, what I really wanted to do was overload ~ for Vector. I don't recall now whether I found what I was looking for in the specs or the Setting. Either way, here is the simple way to do this in the Vector class definition.

I don't quite understand why our is needed here. I'm guessing it has something to do with explicitly declaring the return type (Str). Past that, the code is perfectly straightforward and elegant.

I spent some time thinking about making a new method that took a string representation of a Vector and parsed it, sort of a reverse of the Str method. But it was clear it could get very tricky with more complicated vectors -- for instance, a Vector of Vectors.

Then I realized that Perl 6 had a mechanism for outputting objects in a fashion they could be eval'ed in again: perl. Unfortunately, the default perl function just returns "Vector.new()" when called on a Vector object. According to #perl6, this is intended to automatically do the right thing for cases like this sometime in the future. In the meantime, it is easily overloaded to something that works using the above code.
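A minimal sketch of the two methods in present-day Raku, where the our declaration and explicit return type are not needed. The output formats and the cut-down class are assumptions for illustration, and newer releases prefer the name raku for the perl method:

```raku
class Vector {
    has @.coordinates;

    # Human-readable form, used whenever a Vector is interpolated or stringified.
    method Str  { "(" ~ @!coordinates.join(", ") ~ ")" }

    # Source-code form: a string that builds an equivalent Vector when evaluated.
    method perl { "Vector.new(coordinates => " ~ @!coordinates.perl ~ ")" }
}

my $v = Vector.new(coordinates => [1, 2, 3]);
say "The vector is $v";    # The vector is (1, 2, 3)
say $v.perl;
```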

Saturday, August 15, 2009

Vector: Cross Product

Keeping with my theme of naming operators using Unicode math symbols not in ASCII, my next attempt was the cross product. Of all the operators, this is the one that is not trivial in Perl 6, through no fault of Perl's -- it's a twisty operation. Initially I only had an implementation for 3D vectors, too, with no idea how to do the equivalent in other dimensions. A Google search turned up the fact that cross product is only defined for 3D and 7D, with a huge formula for the latter. Luckily the Wikipedia page source for the latter had the formula in TeX. It was a small matter of Perl 6 to convert this formula to code I could use.

I ended up using where to dispatch to the correct version of cross product (or none at all for most dimensions).

Having the 7D cross product is a bit silly -- but it feels like a very nice use of where. I think I will soon go back through most of the other operators and add where clauses to the second parameter, to make sure it has the same dimension as the first.

Of course, I don't have any idea what the 7D solutions should look like. Luckily, I now have a test script, so I can write tests to make sure that the results of the 7D cross product obey the proper identities:
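The same idea can be sketched for the simpler 3D case in present-day Raku. This works on plain arrays rather than Vector objects, and the sub names cross and dot are assumptions for illustration. It shows where clauses selecting the candidate, and a check of one identity (the result is perpendicular to both inputs):

```raku
# 3D cross product; the where clauses restrict dispatch to 3-element arguments.
multi sub cross(@a where *.elems == 3, @b where *.elems == 3) {
    ( @a[1] * @b[2] - @a[2] * @b[1],
      @a[2] * @b[0] - @a[0] * @b[2],
      @a[0] * @b[1] - @a[1] * @b[0] )
}

sub dot(@a, @b) { [+] (@a Z* @b) }

my @u = 1, 2, 3;
my @v = 4, 5, 6;
my @c = cross(@u, @v);

say @c;             # [-3 6 -3]
say dot(@c, @u);    # 0  (perpendicular to @u)
say dot(@c, @v);    # 0  (perpendicular to @v)

# No candidate accepts 2D arguments, so the call fails to dispatch.
say (try cross([1, 2], [3, 4])).defined;    # False
```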

General: I forgot to mention last time that pmichaud is looking at fixing the bug that stops you from building a more complex version of an operator out of simpler versions of the same operator. When that gets fixed, the vector addition and subtraction and vector/scalar multiplication and division will look a lot nicer.

Friday, August 14, 2009

Vector: New

This should be the post in this series which most reveals how much I don't know. (At least, I hope there aren't any worse than this!) Because the truth is, I don't really understand new in Perl 6 yet, and I completely botched the one member variable declaration. And yet the code all works, as far as I can tell.

So, coordinates is supposed to be an array of something number-like. Why the heck is it declared as a scalar? Damned if I know. In my head it certainly was. I didn't realize that it wasn't until I noticed when I was working on the last blog post. Everything works just fine, presumably because it is getting treated as a scalar containing an Array object, or something like that. Perl 6 just automatically did what I wanted, even though I'd failed to properly communicate my wishes. I suspect it might have some impact if/when I try to make things ro, but for now I think I will leave it, at least until I have a real set of tests up and running.

As for the name, I've been pondering shortening it to coords. I think that would be just as clear, and make the code easier to understand just by getting rid of a lot of excess verbiage. This is particularly true for the cross product definitions. I love that the member name has to be tagged with a dot -- it seems a straightforward way of identifying the name as a member variable without adding some sort of tag on the name itself. (That is to say, it's not p_coordinates or m_coordinates or something like that which is a local naming idiom; it's a proper part of the language.)

Moving on, what the new code does is pretty opaque to me. I understand the function signature and how I am establishing a value for $.coordinates. But the $self and bless stuff is just boilerplate copied from the spec. I'm sure there is a rhyme and a reason to it, but it just feels like weird things left over from the Perl 5 way of doing classes.

My new functions here have evolved along the following lines. At first, I didn't have one at all, just using the default generated by Perl 6. I quickly decided that using that was ugly and awkward, so I defined a new which took three Nums. That was fine for implementing dot product and cross product and some simple tests, but when it came time to implement the + operator, I realized everything would be easier if there was a new which took an entire array of coordinates, and so added one.

Yesterday I realized that every operation was defined for any dimension except for cross product, which was only defined for 3D vectors. I did a Google search and turned up a definition for 7D cross product. (More on that in a future post.) In the process of working on that, I figured I should define a new which took any number of scalar parameters and made a vector of that dimension. Luckily, Perl 6 defines slurpy array parameters, which take a variable number of scalars and give them to you as an array. Perfect for what I needed. This took the place of the 3D constructor, and I got rid of the Num limitation at the same time.
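A slurpy constructor along these lines can be sketched in present-day Raku as follows. The cut-down class is an assumption, and the 2009 boilerplate around $self and bless looked somewhat different:

```raku
class Vector {
    has @.coordinates;

    # Slurpy constructor: any number of positional values become the coordinates.
    method new(*@coordinates) {
        self.bless(:@coordinates);    # bless builds the object from named arguments
    }
}

say Vector.new(1, 2, 3).coordinates;                      # [1 2 3]
say Vector.new(1, 2, 3, 4, 5, 6, 7).coordinates.elems;    # 7
```

Because the slurpy parameter is untyped, the coordinates can be any mix of values, not just Nums.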

I've been toying with the idea of a constructor that takes a single Str and tries to parse it, with the idea that would be handy for unserialization. But on the other hand, it seems like the easiest way to serialize Perl 6 objects is to use the perl method. I haven't sorted this out yet in my head, so I haven't done anything with it.

General: I keep updating the Vector repo as I go. It's actually a few days ahead of where I've gotten with the blog posts. All comments and suggestions are welcome. I'm currently thinking I will try for posts on the cross product, the stringify method, testing it, and the overall philosophy of how this class might be used. And I'm thinking that if I have time, I may go on to try using it to define a non-uniform B-spline class...

Wednesday, August 12, 2009

Vector: Dot Product

Let me get right to the first exciting portion of the Vector class. In order of creation, that is. I jumped at the chance to define a proper dot product operator using Unicode. Here is the result:

So: defining an operator is easy in Perl 6, and you can grab a new Unicode symbol for the operator. Dot product is just the sum of the pairwise product of the vectors' coordinates -- that's trivial to express using the hyper multiply and sum operators.
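A hedged sketch of what such an operator can look like in present-day Raku. The choice of the ⋅ character (U+22C5) and the cut-down Vector class are assumptions for illustration:

```raku
class Vector {
    has @.coordinates;
}

# Unicode operator: hyper-multiply the coordinates pairwise, then sum.
multi sub infix:<⋅>(Vector $a, Vector $b) {
    [+] ($a.coordinates »*« $b.coordinates);
}

# ASCII ("Texas") spelling that simply calls the Unicode version.
multi sub infix:<dot>(Vector $a, Vector $b) { $a ⋅ $b }

my $v1 = Vector.new(coordinates => [1, 2, 3]);
my $v2 = Vector.new(coordinates => [4, 5, 6]);

say $v1 ⋅ $v2;      # 32
say $v1 dot $v2;    # 32
```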

Let's step back and think about that for a moment. Except for the grouping parentheses, and final semicolon, there is nothing extraneous in the function body definition. It exactly implements the mathematical definition of dot product in code. Yet for about the same conceptual complexity as a strictly 3D, strictly working on doubles implementation in C++ or C#, the Perl 6 code is supremely flexible. It works on Vectors of any dimension. And it works on any pair of data types which, when multiplied, yield something that can be added. I believe it also implies that there is no significance to the ordering of the multiplications, meaning that a smart future Perl 6 can automatically parallelize them if the Vector is of large enough dimension to make it worthwhile.

As you can see from the code above, I have chosen to create a "Texas" version of the operator as well, spelling out "dot", for people who may have issues with Unicode. It is my strong belief that this should always be done. I have a great deal of sympathy for people whose favorite editor doesn't handle Unicode gracefully -- that was me until last fall. Even now, it's still easier to just type straightforward ASCII. I implemented the "dot" operator to call the Unicode dot operator, following the DRY principle, even though it is probably less efficient than repeating the first function body again. (Is there a way to just alias a function/operator name to another operator?)

Let me go through my other stylistic decisions. I originally wrote the body with a return statement. That matches my C++ experience, of course, but it is strictly optional in Perl 6. After thinking about it some (including how frequently I write closures where a return would seem bizarre), I decided that for one line functions like this, I would leave the return statement off. I think it is significantly more elegant this way.

I also initially left the "multi" off, just because I didn't realize it was ever needed. As I understand it, if I leave it off, I am defining THE dot product operator -- the only one it would try to call for any data type. That just plain seems rude to me. If nothing else, consider what would happen if code had to glue together two different vector types. (That may sound unlikely to you, but my work code frequently has to glue together three different vector types!) I can't think of a reason why library API functions would ever be anything but multi.

I am a bit bemused that I went with $a and $b for the parameter names. In C++ I would traditionally have chosen lhs and rhs, and my C# vector uses v1 and v2. But $a and $b feels Perlish to me -- think of your classic sort code block.

A word on error handling -- obviously there isn't any! If there is some sort of type error, Perl 6 will just report it when it happens. For lots of things, it will just DWIM no matter what strange types are thrown at it. The only error that worries me is if the Vectors do not have the same dimension. In that case the mathematical dot product is undefined, and Perl 6 will just merrily make up numbers to make the dimensions even. (I believe it will extend the array by repeating the last element in it.) I think that's fine for a toy system, but for a production system I'd want to signal an error in that case. I suspect the best way to do this is to add a where clause to the signature that checks that the second Vector has the same dimension as the first...

Perl 6 Class: Vector

A few days ago I set out to create my first real Perl 6 class. Actually, that's not strictly accurate: it was actually my first real Perl class for any variety of Perl. I have a long background in C++ OO, but Perl 5's object syntax always scared me off. (Yes, I know there is Moose these days, but I didn't really learn about Moose's existence until after I was sold on Perl 6.)

At any rate, most of the Perl 6 class examples I have seen are actually class hierarchy examples: all that nonsense about how dogs, trees, and ducks fit together. My first attempt, on the other hand, is a pure math class, Vector, intended to be a useful component in building up more complex geometry objects rather than be something to be derived from.

I've put what I have up now at github. (Geez, I just typed "gitbug" -- hope that wasn't a significant Freudian slip!) I'll try to talk about it in detail over the next few days (and to keep on improving it, too). For now, let me give my quick impressions of working on a numeric class in Perl 6.

Most importantly: I love this! Except for new, the syntax is perfectly natural for me. (And new is easy enough to get by with.) Hyperoperators made it easier to define operators that work on Vectors of arbitrary dimension than it is to do strictly 3D vectors in C++ or C#. Unicode lets me define operators that have always had to be functions in C++. Perl's flexibility means Vectors can be defined over any class with vaguely numeric properties, yet it's easier than even a non-templated version would be in C++. Etc, etc.

The only drawback I have found so far is a significant Rakudobug -- you cannot define an operator which uses any version of that operator internally. So you cannot define operator + on Vector in terms of operator + on the Vector's coordinates. This is really a significant drawback for doing this sort of thing; I have worked around it by coming up with fanciful Unicode characters instead of + and -.

But except for that bug, working with this class was a dream. Everything is easy and straightforward, and amazing things can be accomplished with very little work.

Monday, August 10, 2009

Rakudo to the rescue!

I've had a lot of Rakudo-frustration due to its slowness this week, so it was nice to find an application where Rakudo was well-suited already. I just wrote a script for checking the validity of one of my software releases... more than fifty files uploaded (lots of different configurations), each encoded with its own password. It was easy enough to write a Perl 6 script to scan the batch file used to create the release to get the filenames and passwords, then download each file using curl and use unzip to test its validity. And the download times dominate the running time of the script, so Perl 5 wouldn't have been appreciably faster.

qqx is very nice for running external programs and capturing the results. I'm still in love with using when in for loops -- I realize basically the same thing could be done even in Perl 5.8 using if, but somehow when just feels right to me. (Oh wait -- the difference is that after when triggers, it goes to the next iteration of the loop, right? Maybe that's why it seems so nice.)
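To make the when behavior concrete, here is a small sketch in current Raku syntax. The sample lines and the @seen array are made up for illustration; they are not from the release-checking script.

```raku
# Made-up sample input: a comment, two "file password" lines, and a blank line.
my @lines = '# header', 'alpha.zip secret1', '', 'beta.zip secret2';
my @seen;

for @lines {
    when /^ '#'/               { @seen.push('comment') }
    when /^ (\S+) \s+ (\S+) $/ { @seen.push("file $0 with password $1") }
    # Only reached when no when matched: a successful when block
    # skips the rest of the loop body and moves on to the next item.
    @seen.push('unmatched');
}

.say for @seen;
```

A chain of plain if statements would need an explicit next (or else branches) to get the same effect, which is probably why when reads so cleanly here.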

The only real trip-up I had writing this script was once again forgetting that whitespace is ignored in Perl 6 regexes. I had several minutes of panic, thinking that somehow I wasn't getting the correct result from qqx, or that something special needed to be done to match against strings with embedded newlines. Then I realized I just needed to add :s to both regexes. I understand the logic of doing it that way, but I know I will trip up on that more times in the future...
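A quick sketch of the trap, in current Raku syntax. The $output string is an invented sample, not real output from the script.

```raku
# Invented sample text with an embedded newline.
my $output = "testing: archive.zip\nNo errors detected";

# The space in the pattern is ignored, so this looks for "Noerrors".
my $plain    = so $output ~~ /No errors/;
# With :s (sigspace), whitespace in the pattern matches whitespace in the text.
my $sigspace = so $output ~~ m:s/No errors/;
# Alternatively, spell the whitespace out explicitly.
my $explicit = so $output ~~ /No \s+ errors/;

say "plain: $plain, sigspace: $sigspace, explicit: $explicit";
```

Recent Rakudo releases may also print a "space is not significant here" warning for the first pattern, which would have saved me those minutes of panic.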

Thursday, August 6, 2009

Padre on Windows

So, after my fail getting Padre up and running on my MacBook Pro, I figured the least I could do was take a few minutes to try installing the new Almost Six package on my 64-bit Vista box.

Quickly: Install was straightforward and fast. It created a Padre folder in the Start menu. Padre started smoothly. I quickly loaded one of my Perl 6 scripts. After fiddling around a minute, I got the Perl 6 plug-in activated, and the syntax highlighting looks good. Using the Run menu constantly ran it against Perl 5.10 -- I assume that means Strawberry Perl is now installed on my machine, as I'm pretty sure my ActiveState Perl was at 5.8.x. I wasn't able to figure out how to run my script in Rakudo from Padre. But I assume that's just a matter of playing around a few more minutes to get it working.

Stumbling with Rakudo

As has become my habit, yesterday when I needed a small Perl script for work, I tried coding it in Perl 6. IMO, the resulting script is beautiful. I think it's more a shifting of what I consider the common idiom than an actual difference in the languages, but using when instead of the single-line modifier form of if really makes me happy. (Yes, I know when is available in 5.10, but that's not the idiom I'm comfortable using in Perl 5.)

There's only one problem: this isn't actually usable in Rakudo yet. I might have the $*IN part incorrect, but I changed it to use a filehandle for input and print out each line of the file it read during the for loop. And when I ran it on the 10 meg data file I needed to analyze, it just sat there. I believe it was probably trying to read the entire file before doing anything with it, as laziness isn't implemented yet. Between that and Rakudo's lack of speed, the script was a no-go, and I ended up rewriting the script in Perl 5.

PS Before actually posting this, I actually ran the Perl 6 version to completion and timed it. 88 minutes. That's versus less than a second for the Perl 5 version. In fact, I actually wrote and executed the Perl 5 version in considerably less time than it would have taken to just run the Perl 6 version. Also (as sharp eyes may have noticed), I forgot that spaces in regexes are ignored in Perl 6, so the Perl 6 version was in fact incorrect. Getting a working version would have required at least another 88 minutes of sitting there....

Wednesday, August 5, 2009

Project Euler #17, Take Three

So, this version starts with the last version and applies a bit of smarts. Essentially, I said to myself, "Hey! The number of letters in the numbers 100 through 199 is the same as the number of letters in 1 through 99 plus 100 times the number of letters in 'one hundred'." I reworked the code to take advantage of that. Then I realized that the NumberLength function only needed to work on numbers between 1 and 99 inclusive, and reworked that as a special case. And as soon as I did that, it was obvious that NumberLength only had two cases, one for numbers between 1 and 20, one for 21 to 99, so I just made two loops, each of which had the appropriate piece of NumberLength code inside.
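The shortcut can be sketched roughly like this in current Raku syntax. This is not the script from the post: the names @ones, @tens and letters-below-hundred are made up, and it splits the cases at 20 slightly differently. Like the original, it ignores the British "and".

```raku
my @ones = <one two three four five six seven eight nine ten eleven twelve
            thirteen fourteen fifteen sixteen seventeen eighteen nineteen>;
my @tens = <twenty thirty forty fifty sixty seventy eighty ninety>;

# Letters in the name of a number from 1 to 99 (hyphens and spaces not counted).
sub letters-below-hundred(Int $n) {
    return @ones[$n - 1].chars if $n < 20;
    my $count = @tens[$n div 10 - 2].chars;
    $count += @ones[$n % 10 - 1].chars if $n % 10;
    $count;
}

my $below-hundred = (1..99).map(&letters-below-hundred).sum;

my $total = $below-hundred;
for 1..9 -> $h {
    # Each block of a hundred repeats "<h> hundred" 100 times,
    # followed by nothing (the exact hundred) or one of the 1..99 names.
    $total += 100 * (@ones[$h - 1].chars + 'hundred'.chars) + $below-hundred;
}
$total += 'onethousand'.chars;

say $total;    # 18451 when "and" is ignored
```

The per-number work now only happens for 1 through 99; everything above that is arithmetic.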

The resulting code runs in just a shade under two seconds on my MBP. That's a solid 90x improvement over my initial code. But Rakudo's overhead for starting the script is over one second. If you subtract that from all runs, this current version is under a second, and very close to being 200x faster than the original script.

Tuesday, August 4, 2009

Project Euler #17, Take Two

So, here's the completed version of the changes I made using the script from my last post.

It works directly with the number string length rather than the strings themselves, and optimizes the inner loop to only scan the list of numbers with known names once for each call to NumberLength.

The resulting script executes in ten seconds on my MBP, making it about 18 times faster than my first attempt.

I think that's about as far as it is reasonable to take this approach to solving the problem. But that's because this approach, when you get right down to it, is a brute force approach. I suspect there is another huge performance increase to be had by being smarter about the problem. But that is for another day...

Monday, August 3, 2009

Project Euler #17, Optimization Script

So, as I said in my last post, one of the first obvious optimizations to make is to do away with the number name strings altogether, and just count the number of letters in each number name directly. Being lazy, I didn't want to go through my first script and do that by hand -- time-consuming and error-prone. So I wrote a new Perl 6 script to do it for me:

Note that the central loop is just a big series of when statements. I suspect quite a few of my old Perl 5 scripts have inner loops which would be most naturally expressed this way. The first when is just to automatically pass on lines with comments, none of which I wanted to mess with. The second just transforms lines with strings by applying the letter counting subst from the previous script. The third just copies anything else over.
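For readers without the embedded script handy, the three cases might look something like this in current Raku syntax. This is a reconstruction under assumptions, not the original: the sample @input lines, the patterns, and the pointy-block replacement are all invented for illustration.

```raku
# Invented sample of the kind of source lines being rewritten.
my @input =
    '# names for the small numbers',
    'my @names = "one", "two", "three";',
    'say @names.elems;';

my @output;
for @input {
    # 1. Comment lines pass through untouched.
    when /^ \s* '#'/ { @output.push($_) }
    # 2. Lines with quoted strings: replace each string by its letter count.
    when /'"'/ {
        @output.push: .subst(/ '"' (<-["]>*) '"' /,
                             -> $m { $m[0].Str.comb(/<alpha>/).elems }, :g);
    }
    # 3. Everything else is copied over unchanged.
    default { @output.push($_) }
}

.say for @output;
```

On the sample input the middle line comes out as `my @names = 3, 3, 5;`, which is the kind of grunt work the script was meant to take off my hands.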

This doesn't result in a working Euler script -- I still needed to go through and change it from working with the now missing strings to the new numbers. But it did all the grunt work for the change.

This makes the Euler #17 code run about twice as fast. Next up is getting that sort statement out of the inner loop...

Sunday, August 2, 2009

Project Euler #17

Project Euler #17: "If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are 3 + 3 + 5 + 4 + 4 = 19 letters used in total.

"If all the numbers from 1 to 1000 (one thousand) inclusive were written out in words, how many letters would be used?"

Their notes say something about "and" in numbers British-style, but I have ignored this as I have no idea where the ands would go.

This is a slow, stupid way to solve this problem. (Though I'm reasonably happy with it as an example of Perl 6 programming.) On the other hand, this way of doing it makes it much easier to check for correctness. (Though I discovered I'd misspelled one of the numbers in the process of posting this. Sigh.) This runs in just over 3 minutes on my MBP. I will be very disappointed if I can't get that time below 20 seconds. But that is a matter for a future post.

PS I would really like to use Yuval Kogman's system for cleaning up / speeding up Gist embedding in blog posts. Unfortunately, I can't quite make sense of what he is actually doing. Afraid I have fallen a bit behind the technological curve here...