Thursday, October 29, 2009

It's Alive!

Following the #perl6 IRC channel today, I couldn't help but get the feeling that things are really bubbling up in the world of Rakudo and Perl 6. masak started things off by posting a link to one of his older blog posts inspired by Knuth. Later he promised a month of blog posts on a secret project.

Meanwhile jnthn added nested signatures to Rakudo, producing this beautiful implementation of Quicksort. moritz posted on recent progress -- the last two months have been very productive! He also started another chapter in the Perl 6 book.

All of this is accompanied by a near constant background drone of pmichaud checking in changes to his nqp-rx project. I must admit I don't understand all the details of what he is doing, but I believe the gist of it is replacing a key component of Rakudo with a new, considerably smarter version -- in particular, one capable of handling the STD.pm grammar that Larry Wall has been working on. It's shocking how much progress pmichaud is making every day. It sounds like it will go into Rakudo sometime in November, which is terrific news, as there have many things blocking on Rakudo's current parser.

(Meanwhile, of course, I haven't found time to work on my own humble posts. I've got some code, but I'm pretty sure there's something I'm doing wrong, and I can't test it because another bit relies on a NYI feature. Sigh...)

P.S. While I was writing this post, carlin posted a link to a new bot written in Perl 6. Fun times!

Monday, October 26, 2009

Spelling Corrector Thoughts

Last night while I couldn't sleep I was pondering the spelling corrector code. One thing I thought was an advantage over the Python version is that my Perl 6 version used <alpha> to match words, rather than [a-z]. My understanding (though I'm having a hard time finding it in the spec at the moment) is that <alpha> is Unicode-aware, which seems like a big win in terms of power.

Unfortunately, my code, following Norvig's version, uses just a to z when it wants to substitute or add an arbitrary letter. I have no idea how to do a Unicode equivalent of that -- and even if I did, the combinatorial problem which already cripples Rakudo's ability to handle the two-edit case would just get horribly worse. How to handle that mismatch, I wondered?

That's when it occurred to me: instead of constructing a candidate list of psuedo-words that might match real words in our dictionary, construct a psuedo-word regular expression and grep it against the list of words in the dictionary. I'm guessing it will be slower, but it will be every bit as Unicode-aware as <alpha> is, and I'm pretty sure it can be done with only a few more lines of code.

Okay, now that I've thought of it, it's time to actually try to write it...

Friday, October 23, 2009

Spelling Corrector

I keep on seeing these marvelous Python scripts by Peter Norvig, which always make me want to translate them to Perl 6. Today it was his spelling corrector in 21 lines of code. It's a lovely piece of code; mine is a fairly direct Perl 6 version. I fudged a little by dropping functions that seemed fairly pointless, and I don't recommend uncommenting the second @candidates = line unless you have a near infinite amount of patience. I haven't tested it exhaustively, but it does work for my simple tests.

Even with two closing braces getting a line of their own, it's shorter than the Python version. My feeling is it is uglier than his version. On the other hand, I'm sure it's not the best possible Perl 6 version. I encourage people to take a stab at improving it!

It feels to me like this code would be much better if it directly implemented the modifications to the string using left-hand side substr rather than Norvig's trick of splitting the word and working with the split versions. Unfortunately as far as I can tell this is NYI in Rakudo. Also substr seems awkward next to the Python equivalent using [x:y].

I felt the interactions between the functional style code and the hash code were uncomfortable. I suspect there is a Perl 6 idiom here that I just don't know yet; the language has to be more elegant than my clumsy attempts here. I'm not wildly happy about the entire correct function, either.

So, does anyone have suggested improvements?

Tuesday, October 20, 2009

Polynomial & Vector: Together At Last

After a weekend spent studiously ignoring Perl 6, I've finally dug back in and sorted out the problems in the KnotVector / Nubs implementation. At least, every test is back to passing, and I've added a number of harder tests which pass as well. The good news is I like the code I have now a lot better than the code I thought I was ready to post last time around. (Still needs some work, though, IMO.)

Without further ado, here's the heart of what I've been trying to get at.

multi method Evaluate($t, KnotBasisDirection $direction = Left)
{
my $n0 = $.knot_vector.N0_index($.degree, $t, $direction);
return [+] ($.knot_vector.N_local($n0, $.degree, $t)
>>*<< @.control_points[$n0 .. ($n0 + $.degree)]);
}

multi method Evaluate($base_t, $actual_t, KnotBasisDirection $direction = Left)
{
my $n0 = $.knot_vector.N0_index($.degree, $base_t, $direction);
return [+] ($.knot_vector.N_local($n0, $.degree, $actual_t)
>>*<< @.control_points[$n0 .. ($n0 + $.degree)]);
}

Notice these two routines are exactly the same, except where the second uses $base_t and $actual_t, the first uses just $t both times. This allows us to always set $base_t with an actual number (so that it can be used to calculate $n0), but pass anything that logically works for $actual_t. If $actual_t is a number, then Evaluate will calculate the curve's value at that parameter value. If it is a Polynomial representing a single variable (call it what you will, I keep switching between u, t, and x in my thinking), then Evaluate will return the Polynomial representing the section of the curve specified by $base_t. If (as I keep toying around with), you passed a class representing a closure you can do math on, then you'll get a fancy closure that can evaluate that section of the curve. And so on and so forth. It's not tied to a particular implementation of Polynomial; all that's required is something that can handle a few basic math functions.

Friday, October 16, 2009

Humbled by (Poor) Testing

I thought I was all ready to finish up with the Nubs / Polynomial thing last night. I started a blog post, then I looked at the code I wanted to show off. Well, it didn't seem clear enough to me to make me happy. Wait, I said to myself, this would make more sense as a Nubs member function. And that rewrite was so nice, I told myself it would be even better if I rewrote Nubs.Evaluate in the same fashion, because then the real beauty of this approach would shine out.

And then the excrement hit the fan, because after that perfectly straightforward rewrite, tests started failing left and right, with really weird errors. It took me about a half hour to figure out that this wasn't some strange Rakudo issue, it was a basic failure of my algorithm. For several weeks now, the code had been broken. But my tests were too clumsy to detect the failures until I stretched what I was doing really far out there.

Basically, the testing issue was this. The basis vector for the very first point of the NUBS curve should look like (1 0 0 0 ...). That means the very first point is just the very first control point. However, if in all your tests that first control point is some variant of (0, 0, 0, ...) (ie the origin), then a result of (0, 0, 0, ...) for the first point only shows you that the basis vector looks like (x 0 0 0 ...), where x can be anything at all. Whoops.

So it's back to the drawing board. Clearly I need more tests, and then I need to puzzle out where my algorithms have gone wrong. At least I'm pretty confident that I've got the Perl 6 issues worked out; now it's just a matter of fixing the math.

Monday, October 12, 2009

Insanity of Anti-If Campaign

I noticed the Anti-If Campaign on Hacker News this morning. This is other one of my pet peeves about received OO wisdom. Essentially, a lot of these people take the extremely useful virtual function approach to handling object logic and try to stretch it far too far.

So on the one hand, we have something like Perl 6's .Str function. Trying to implement that any way other than a virtual method calling mechanism would be insanity, because you want code to be able to call that function from anywhere and get the proper result no matter what kind of object you are calling it on. You want to be able to say "The answer is $a" without using some obscene string of ifs based on the type of $a. That's a no brainer.

But on the other hand, consider converting a surface object from a nice surface class hierarchy into another surface format which the original class hierarchy can't know anything about. "Aha," the anti-if zealot says, "an opportunity for the Visitor pattern!"

I went through a phase of heavy Visitor pattern usage back around 1997 or so. And in retrospect, I'm pretty sure nine out of ten times I used it, the code would have been better had I just used an if statement; frequently much better.

Here's a quick example: suppose I'm processing a $surface, and I'd like to use exact calculations if it's a (dead simple) plane surface, and approximate calculations otherwise. So I'm looking to set $use_exact_calculations as appropriate. Here's my best attempt (untested!) to capture the Visitor pattern in Perl 6:

If you think this is wordy, you should see the C++ version! Perl 6 is very good at this sort of thing. But still you're creating an entire class just for one usage, and in the process moving critical logic away from the place the logic is actually used.

But would you really rather have that in your code just so you can get rid of this:

Okay, technically that doesn't actually have an if statement, but it does have the dreaded direct examination of the type of $surface, which is what they would like to eliminate the ifs to avoid.

Okay, now. Does anyone seriously think the former code is better than the latter?

Polynomial & Vector: Debugging Update

I've tracked down the next bug mentioned at the end of my last post, and again it comes as a side effect of the hack to make it easier to code Polynomial addition. In that hack, we extend the Polynomial coefficients array with 0. But if the coefficients are Vectors, the actual extension value should be Vector.new(0, 0, 0). Just plain 0 means we eventually try to add 0 to a Vector, and since Vector doesn't support addition with a scalar, it falls back to a normal addition operator, which tries to convert Vector to a Num.

Thoughts: I don't see any obvious way to figure out the correct "zero" to extend the coefficients with. Nor do I see any obvious way to allow you to add 0 to a Vector. (At least without opening up general vector plus scalar math, which is a bad idea IMO.)

It also suggests that it may be a very good idea to go beyond implementing the operators that work to implement dying versions of the operators that shouldn't be allowed, because allowing Perl 6 to fall back to its own operators will produce less useful error messages. (Or worse, they might even "work".)

I can see one more potential error like this lurking in the Polynomial.prefix<-> code...

Sunday, October 11, 2009

Polynomial & Vector: Debugging

So, I was just going to post on working around the bug mentioned in my last post, when I noticed Moritz++ had commented with a suggested debugging approach. And we're off and running!

multi method Num()
{
die "Cannot call Num on Vector!";
}

It's not the most elegant error message, but it will do. Adding that switches from the fairly useless Method 'Num' not found for invocant of class 'Vector'
in Main (file src/gen_setting.pm, line 206)
message to the extremely interesting

Cannot call Num on Vector!
in method Vector::Num (file lib/Vector.pm, line 24)
called from method Polynomial::new (file , line )
called from sub infix:* (file lib/Polynomial.pm, line 106)
called from Main (file , line )

Aha! Not a Rakudo bug at all, this is a legit issue. When I added the fix to remove trailing zero coefficients, I wrote @x[*-1] == 0 to check for them. Odds seem very good that == calls .Num. So... a change to .abs here is probably more realistic, anyway. (Maybe? I need to ponder that.)

Hmmm... that moves the error elsewhere, to (according to the backtrace) infix:+ (file , line ). I need to go to bed now, so the final debugging will have to wait. But at a minimum, Moritz's suggestion of adding a .Num that dies appears to be a valuable Perl 6 debugging trick.

Friday, October 9, 2009

Polynomial & Vector: A Fly in the Ointment

This evening I was building my test case to make sure the Polynomial I am generating from the KnotVector basis actually yields the same results as the standard Nubs. Given an array of polynomials and an array of vectors, I wanted to multiply them with each other and sum the bunch. I wrote my $poly = [+] (@polys >>*<< @control_points);. When I ran it, I got back:

Ambiguous dispatch to multi 'infix:*'. Ambiguous candidates had signatures:
:(Polynomial $a, Any $b)
:(Any $a, Vector $b)

That's a beautiful error message, and completely appropriate. Both functions could do the job, each giving you their own answer. That is, (Polynomial $a, Any $b) where Any is a Vector would generate a Polynomial with Vector coefficients; (Any $a, Vector $b) where Any is a Polynomial would generate a Vector with Polynomial values. Both make sense, but for our purposes the former is probably best.

How to make that happen? The Polynomial and Vector classes are both completely independent of each other. Other than this potential intersection, they don't need to know about each other. After toying around a bit, I believe "is default" on the Polynomial multiply is the way to go.

Unfortunately, that gets me the error Method 'Num' not found for invocant of class 'Vector'. Possibly the problem is nested hyper multiply operators? Even when I map that operation, there are still two nested hyper multiply operators (both Polynomial and Vector multiply have them). Yet once again beautiful Perl 6 runs up against the awkward (and ever-improving) reality of Rakudo.

Wednesday, October 7, 2009

Testing and Solutions

Turns out submitting the bug I had that was messing up the search test was the right thing to do, as Moritz++ quickly pointed out it wasn't a bug at all. [<=] is a listop, so where I thought I was sending two arguments to ok, [<=] @array and "array is sorted properly", actually I was sending just one, the equivalent of [<=] @array, "array is sorted properly", and it is certainly not true that 8 <= "array is sorted properly". Problem solved!

Meanwhile, Hacker News pointed me to a very nice article by Peter Seibel on TDD, "Unit testing in Coders at Work". What's great is that it points out both cases where unit tests were absolutely crucial AND it points out great work done without them.

It also points out a hilarious example where Ron Jeffries goes completely off the rails trying to write a Suduko solver using TDD. Peter Seibel even coins a name for this "pattern": "Going in Circles Means You Don’t Know What You’re Doing".

Seriously, it's worth looking at Peter Norvig's solver to get an idea what a working Suduko solver looks like, and then check out those five posts of Jeffries'. I don't think I can say it better than Norvig did: "I think test-driven design is great. I do that a lot more than I used to do. But you can test all you want and if you don’t know how to approach the problem, you’re not going to get a solution."

BTW, Norvig's approach looks like it would be an absolute blast to code in Perl 6. If no one else has done it, I may take a stab once I've gotten a bit further on my current project...

Tuesday, October 6, 2009

Vector: Quick update

I meant to post on my polynomial trick, which now works, but instead got into a long run of small changes this evening. So let me just say that abs has been moved to the setting, so tonight I wrote Vector.abs, which makes Test.pm's is_approx work to compare two Vectors. It feels like Perl 6 is becoming more useful every day.

I also put in a hack workaround for the rakudobug (reported this evening) turned up in t/05-search.t, which means all the Vector tests pass again.

I'm also starting to ponder using SVG to output some NURBS (well, NUBS for now) curves to provide a graphic illustration of what my new code here is doing...

Sunday, October 4, 2009

Testing

I intended to talk about testing a lot more on this blog when I started out, before I got thoroughly sidetracked by the coolness of Perl 6. However, seeing another furor about it on the various programming sites last week, and having varied experiences with it in my own work in the same time, made me think it was time to revisit it.

First the good. A lot of the Perl 6 work I've been doing is perfectly suited to TDD. The tests are easy to write and nicely concise, with the median test just a single line long, and the longest maybe six lines. The tests run pretty quickly, even as slow as Rakudo is today. They provide direct and useful feedback about the code, and make it easy to make larger changes to the code with the confidence that the tests will find any problems you create that way.

I guess even here I wouldn't be following the rigorous ways of TDD. Typically I would code a little bit of library first, to get a feel for where it is going, then write tests for what I have done, and more tests that occur to me. Then make those tests work, rinse, and repeat. Sometimes the code would lead the test, sometimes the other way around.

But as I said, I find this really effective for this sort of work.

My problem with advocates of TDD, then, is that a lot of them seem to imagine that this sort of work is the only wort of work. But it isn't! Testing to see that two vectors are approximately equal (within a tolerance) is trivial. Testing to see that two B-reps are approximately equal (within a tolerance) is monstrously hard. Seriously, I've been doing professional work with B-reps for fifteen years now, and I have no idea how I would practically go about such a thing. (If I had code to fill the B-rep with cubes of varying sizes, you could then compare the cubes, to make sure that all the cube vertices of one B-rep were inside the cubes of the other, and all the voids were likewise empty. But that's a pair of complicated O(N^3) algorithms, and it still wouldn't handle a vast horde of common cases (like NMT B-reps and open shells).)

Putting this in concrete: last week I was working on a bug involving the orientation of B-rep faces on a simple box model. So I wrote up a test to look them. It took me several hours to write approximately 150 lines of code for the test, and it was fairly hard work. At the end of that, the test ran -- and confirmed that the model was correct as far as it could tell. (Admittedly the tests would have been easier to write if I could have written them in Perl 6 (with a well-written B-rep library) rather than C++. A lot of common B-rep operations are terribly verbose in C++.)

The end result is that I write a good many unit tests, but they are testing around the edges of things. So you can have unit tests for an assembly structure, but the assembly components are straight lines rather than B-reps. The tests can be quite helpful, but they are hardly conclusive proof your code is working.

A third example is a project I considered when I was dreaming of buying an iPhone. Wouldn't it be great, I thought, to have a little button accordion app, so you could fire up an accordion at any moment? The program would obviously consist of two main units: the user interface that allows you to press the buttons, and the music-making engine.

How would you go about unit testing something like that? One of the major components requires having someone pressing buttons on the iPhone to test it properly. The other component is generating audio. You could mock both components, but that would only let you test the interface between the two, but that's the trivial part of the program.

And if the app was good, both components would need to be finely tuned to get the proper feel. It seems to me this would require hours of sitting there playing the thing, and unit testing would help little, if at all...