Wednesday, December 16, 2009

Getting Started with ABC

So, in classic style, I've started off this project by just flailing around a bit. My first thought was to try TDD. But I very quickly realized I didn't have the first clue what the interface being tested should look like. So I dashed off the roughest idea for some classes. But I don't know how they are best filled. So I am reversing course, and will start by playing around with regexes a bit, trying to learn how the new facilities in Perl 6 work.

So my real starting point is pulling up The Book and reading up on regexes. No pretense that I understand what's going on; let me just document what happens as I go along. BTW, this project is on GitHub.

So, my first attempt looked like this:
use v6;

my $abc = q«X:64
T:Cuckold Come Out o' the Amrey
S:Northumbrian Minstrelsy
M:4/4
L:1/8
K:D
A/B/c/A/ +trill+c>d e>deg | GG +trill+B>c d/B/A/G/ B/c/d/B/ |
A/B/c/A/ c>d e>deg | dB/A/ gB +trill+A2 +trill+e2 ::
g>ecg ec e/f/g/e/ | d/c/B/A/ Gd BG B/c/d/B/ |
g/f/e/d/ c/d/e/f/ gc e/f/g/e/ | dB/A/ gB +trill+A2 +trill+e2 :|
»;

regex abc-header-field { ^^ \w ':' .* $$ }
regex abc-header { <abc-header-field>+ }

if $abc ~~ m/ <abc-header-field> / {
    say $<abc-header-field>;
}

Note how I'm trying to get with the program and use dashes in my Perl 6 identifiers! Unfortunately, this doesn't actually work in this case:

Confused at line 15, near "abc-header"
in Main (file , line )

I played around a bit, and finally ended up changing the dashes to underscores. Changed that way it runs, but say $<abc_header_field>; prints out the entire string rather than just the first header field I had expected.

Aha! Apparently the .* $$ sequence in abc_header_field doesn't give you everything up to the end of the line, as I'd expected. Instead it matches all characters (including newlines) up until it hits the end of the file, then backtracks until it finds the last end-of-line. Good to know. Changing it to \N* $$ works nicely.
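To see the difference in isolation, here is a minimal sketch run against a hypothetical two-line string rather than the full tune. The names greedy_field and line_field are made up for illustration, and the regexes are declared with my, which is an assumption about what current Raku (the later name for Perl 6) compilers expect outside a grammar; the code in this post predates that.

```raku
use v6;

# Hypothetical two-line sample, not the tune from the post.
my $text = "X:1\nT:Example\n";

# Same shape as abc_header_field, once with .* and once with \N*.
my regex greedy_field { ^^ \w ':' .*  $$ }
my regex line_field   { ^^ \w ':' \N* $$ }

my $greedy = '';
my $line   = '';

# . matches newlines too, so .* runs past the end of the first line.
if $text ~~ / <greedy_field> / { $greedy = ~$<greedy_field> }

# \N never matches a newline, so \N* stops at the end of the first line.
if $text ~~ / <line_field> / { $line = ~$<line_field> }

say "greedy match spans { $greedy.lines.elems } lines";
say "line match: $line";
```

The first match covers both header lines, while the second stops at X:1, which is the behavior described above.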

Now I tried making it match abc_header instead of abc_header_field. No joy, we still just get the first header line. Why? Because we aren't matching the newlines between header lines! So tweak abc_header to account for the newlines, and bingo!
regex abc_header_field { ^^ \w ':' \N* $$ }
regex abc_header { [<abc_header_field> \n]+ }

if $abc ~~ m/ <abc_header> / {
    for $<abc_header><abc_header_field> -> $line {
        say "header: $line";
    }
}

And here are the new results:

header: X:64
header: T:Cuckold Come Out o' the Amrey
header: S:Northumbrian Minstrelsy
header: M:4/4
header: L:1/8
header: K:D

Wow. After a couple of false starts, I've got some simple, powerful code there. I think I like these regexes...
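For anyone who wants to try this on a current compiler, here is a self-contained sketch of the same approach. It rests on a few assumptions: the tune is a shortened, hypothetical sample, the @fields array is only there to collect the results, and the regexes are declared with my, which as far as I can tell is what current Raku expects for a regex declared outside a grammar.

```raku
use v6;

# Shortened, hypothetical sample: three header lines, then one line of music.
my $abc = "X:64\nT:Example Tune\nK:D\nA/B/c/A/ c>d e>deg |\n";

# One header field is a single word character, a colon, and the rest of the line.
my regex abc_header_field { ^^ \w ':' \N* $$ }

# The header is one or more fields, each followed by its newline.
my regex abc_header { [ <abc_header_field> \n ]+ }

my @fields;
if $abc ~~ m/ <abc_header> / {
    # The quantified group yields a list of matches; keep their text.
    @fields = $<abc_header><abc_header_field>.map(*.Str);
}

say "header: $_" for @fields;
```

The music line is not picked up because its second character is a slash rather than a colon, so the header match ends after K:D.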

Updated: Ugh, some weird formatting issues with the Perl 6 syntax highlighter. Too tired to mess around with it now. Rest assured that each code example really does end with a closing curly bracket.
