Wednesday, December 30, 2009

Key signatures

After making my last post, I was brimming over with confidence. Obviously I'd done all the hard bits, and it was just a matter of putting the pieces together, right?

Wrong. Perl 6 makes parsing the ABC data so darned easy that the next bit seems completely unreasonable. At least, I haven't figured out a way to make it elegant yet. Figuring out which sharps or flats are in a key signature is tricky stuff, and as far as I can figure, Perl 6 doesn't really have any clever tools to make it easier.
    sub key_signature($key_signature_name)
    {
        # Major keys mapped to their number of sharps (positive) or flats (negative).
        my %keys = (
            'C' => 0,
            'G' => 1,
            'D' => 2,
            'A' => 3,
            'E' => 4,
            'B' => 5,
            'F#' => 6,
            'C#' => 7,
            'F' => -1,
            'Bb' => -2,
            'Eb' => -3,
            'Ab' => -4,
            'Db' => -5,
            'Gb' => -6,
            'Cb' => -7
        );

        # Split the name into tonic, optional sharp or flat, and optional mode.
        my $match = $key_signature_name ~~ m/ <ABC::basenote> ('#' | 'b')? \h* (\w*) /;
        die "Illegal key signature\n" unless $match ~~ Match;
        my $lookup = [~] $match<ABC::basenote>.uc, $match[0];
        my $sharps = %keys{$lookup};

        # Each mode shifts the major-key count by a fixed amount.
        if ($match[1].defined) {
            given ~($match[1]) {
                when "" { }
                when /^maj/ { }
                when /^ion/ { }
                when /^mix/ { $sharps -= 1; }
                when /^dor/ { $sharps -= 2; }
                when /^m/ { $sharps -= 3; }
                when /^aeo/ { $sharps -= 3; }
                when /^phr/ { $sharps -= 4; }
                when /^loc/ { $sharps -= 5; }
                when /^lyd/ { $sharps += 1; }
                default { die "Unknown mode {$match[1]} requested"; }
            }
        }

        # Sharps are added in this order; flats are added in the reverse order.
        my @sharp_notes = <F C G D A E B>;
        my %hash;

        given $sharps {
            when 1..7 { for ^$sharps -> $i { %hash{@sharp_notes[$i]} = "^"; } }
            when -7..-1 { for ^(-$sharps) -> $i { %hash{@sharp_notes[6-$i]} = "_"; } }
        }

        return %hash;
    }

Basically, we do a match to rip the key signature name into its component pieces. The classic Highland piping key of "Amix", for instance, needs to be recognized as "A", no sharp or flat, "mix"olydian. We use the first two bits to look up the corresponding major key signature, then use the last bit as a modifier. When we're done, we have the number of sharps or (if negative) flats. We then use that count to figure out which notes need to be sharp or flat.
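
To make the arithmetic concrete, here is a minimal standalone sketch of the same counting scheme for "Amix". It skips the grammar entirely and uses cut-down tables; the helper names sharps-for and accidentals-for are made up for this illustration, and the code is written for a current Rakudo rather than the 2009 build.

```raku
# Hypothetical helpers; only a subset of the tables from key_signature.
my %major-sharps = C => 0, G => 1, D => 2, A => 3, E => 4, F => -1, Bb => -2;
my %mode-offset  = maj => 0, mix => -1, dor => -2, min => -3;

# Number of sharps (positive) or flats (negative) for a tonic and mode.
sub sharps-for(Str $tonic, Str $mode) {
    %major-sharps{$tonic} + %mode-offset{$mode};
}

# Map each altered note to its ABC accidental: "^" for sharp, "_" for flat.
sub accidentals-for(Int $sharps) {
    my @order = <F C G D A E B>;    # order in which sharps are added
    my %hash;
    if $sharps > 0 { %hash{@order[$_]}     = '^' for ^$sharps }
    if $sharps < 0 { %hash{@order[6 - $_]} = '_' for ^(-$sharps) }
    %hash;
}

my $amix = sharps-for('A', 'mix');    # A major has 3 sharps, mixolydian removes 1
say $amix;
say accidentals-for($amix).sort;      # F and C come out sharp
```

So A mixolydian ends up with the same two sharps as D major, which is what a piper would expect.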

It doesn't sound that bad, but it was pretty tricky to implement. It also (likely) has some holes in it. For example, the ugly key signature of C-flat minor (4 flats and 3 double flats!) will fail. Of course, any sane person would write that as B minor (two sharps). Also, the ABC spec allows you to explicitly specify exceptions to the normal key signature rules. I haven't even tried to implement that yet.
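
Reading the code again, C-flat minor works out to -7 - 3 = -10, which matches neither range in the final given block, so the sub would quietly return an empty hash. A possible guard, sketched here with a made-up helper name and written for a current Rakudo, would be to check the range explicitly:

```raku
# Hypothetical guard: True only when the count fits an ordinary key signature.
sub is-plain-signature(Int $sharps --> Bool) {
    $sharps ~~ (-7..7);
}

say is-plain-signature(-7 - 3);    # C-flat minor: ten flats, out of range
say is-plain-signature(5 - 3);     # B minor: two sharps, in range
```

Whether the right response to an out-of-range count is to die or to respell the key is a separate decision this sketch doesn't make.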

This is definitely one of those cases where any suggested improvements will be very welcome.

Monday, December 28, 2009

Getting all the notes

I hope everyone had lovely holidays! I've spent a bit of time while away fiddling with my idea for a script to determine if a given tune can be played on a D/G button accordion. I've hit a bit of a snag, as the code to determine which sharps or flats are in a given key signature is more complicated than I expected. It will probably take me a few more days to sort that out, but in the meantime, here is my first stab at extracting all the pitches from an ABC tune:
    my $match = $abc ~~ m/ <ABC::tune> /;

    die "Tune not matched\n" unless $match ~~ Match;

    # Walk lines, then bars, then elements, collecting every note in order.
    my @notes = gather for $match<ABC::tune><music><line_of_music> -> $line
    {
        for $line<bar> -> $bar
        {
            for $bar<element>
            {
                when .<broken_rhythm> { take .<broken_rhythm><note>[0]; take .<broken_rhythm><note>[1]; }
                when .<note> { take .<note>; }
            }
        }
    }

    @notes.map({.<pitch>.say});

I suspect there is a lovely Perl 6 idiom I don't quite have yet that would squish all the for statements into a series of maps, but this works fine for the time being.
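
For what it's worth, here is a rough sketch of what a map-based version might look like. It assumes a current Rakudo and uses made-up nested hashes in place of the real match objects, so it shows only the shape of the idea and ignores broken rhythms:

```raku
# Made-up stand-in data shaped like the parse: lines hold bars, bars hold elements.
my @lines =
    { bar => [ { element => [ { note => 'A' }, { note => 'B' } ] },
               { element => [ { rest => 'z' }, { note => 'c' } ] } ] },
    { bar => [ { element => [ { note => 'd' }, { note => 'e' } ] },
               { element => [ { note => 'f' }, { rest => 'z' } ] } ] };

# Each map flattens one level of nesting by returning a Slip.
my @bars     = @lines.map({ $_<bar>.Slip });
my @elements = @bars.map({ $_<element>.Slip });

# Keep only the elements that are notes, in their original order.
my @notes = @elements.grep({ $_<note>:exists }).map({ $_<note> });

say @notes;
```

The nested for loops arguably read more clearly once broken rhythms have to contribute two notes each, so this is more a curiosity than an improvement.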

Sunday, December 20, 2009

Got It!

I've now got a simple working ABC tune parser. (Well, working at least insofar as it passes all the tests I've thought up so far.) Once I cracked the whitespace issue, the remaining pieces fell into place quickly. Figuring out that \h and \v match horizontal and vertical whitespace also helped. Here's the grammar I'm using now:
    use v6;

    grammar ABC
    {
        regex header_field_name { \w }
        regex header_field_data { \N* }
        regex header_field { ^^ <header_field_name> ':' \s* <header_field_data> $$ }
        regex header { [<header_field> \v]+ }

        regex basenote { <[a..g]+[A..G]> }
        regex octave { \'+ | \,+ }
        regex accidental { '^' | '^^' | '_' | '__' | '=' }
        regex pitch { <accidental>? <basenote> <octave>? }

        regex tie { '-' }
        regex note_length { [\d* ['/' \d*]? ] | '/' }
        regex note { <pitch> <note_length>? <tie>? }

        regex rest_type { <[x..z]> }
        regex rest { <rest_type> <note_length>? }

        regex gracing { '+' <alpha>+ '+' }

        regex spacing { \h+ }

        regex broken_rhythm_bracket { ['<'+ | '>'+] }
        regex broken_rhythm { <note> <g1=gracing>* <broken_rhythm_bracket> <g2=gracing>* <note> }

        regex element { <broken_rhythm> | <note> | <rest> | <gracing> | <spacing> }

        regex barline { ':|:' | '|:' | '|' | ':|' | '::' }

        regex bar { <element>+ <barline>? }

        regex line_of_music { <barline>? <bar>+ }

        regex music { [<line_of_music> \s*\v?]+ }

        regex tune { <header> <music> }
    }

This will successfully parse the sample ABC I gave in my first post on this topic.
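
As a small illustration of the \h and \v distinction, here is a standalone sketch for a current Rakudo. The header text is made up for the example (it is not the sample tune), and the regex is a simplified cousin of header_field:

```raku
# \h matches horizontal whitespace only; \v matches vertical whitespace only.
say so " "  ~~ / ^ \h $ /;    # space is horizontal
say so "\t" ~~ / ^ \h $ /;    # so is tab
say so "\n" ~~ / ^ \h $ /;    # newline is not
say so "\n" ~~ / ^ \v $ /;    # newline is vertical

# Made-up header text; each field sits on its own line.
my $text   = "X:1\nT:Example tune\nK:Amix\n";
my @fields = $text.comb(/ ^^ \w ':' \h* \N* $$ /);
.say for @fields;
```

Because \N refuses to cross a newline and \h refuses to match one, each field stays on its own line, which is the property the header rule relies on.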

Where to go from here?

1) I'd like to write a simple script that can put the grammar to work as is. I'm thinking maybe a script which can test if a tune is playable on a D/G accordion.

2) So far, the grammar only supports a subset of ABC. Off the top of my head, it doesn't support chords (either as labels above the music or as more than one notehead at once) and it doesn't support in-line key changes or time changes. I'm guessing these will be fairly easy to add.

3) I'd like to be able to map the parse to actual Perl 6 classes to make the ABCs easier to manipulate.

4) I'd like to be able to use those classes to write out ABCs.

So much to do!

Infinite Loops

Well, I tried reformulating the line of music regex, and have what I think is a nice version, with one major catch. Here are the new regexes:
    regex element { <broken_rhythm> | <note> | <rest> | <gracing> }

    regex barline { ':|:' | '|:' | '|' | ':|' | '::' }

    regex bar { <element>+ <barline>? }

    regex line_of_music { <barline> $$ | <barline>? <element>+ $$ }

As is, this passes without issues:
    {
        my $match = "g>ecgece/f/g/e/|" ~~ m/ <ABC::bar> /;
        isa_ok $match, Match, 'bar recognized';
        is $match<ABC::bar>, "g>ecgece/f/g/e/|", "Entire bar was matched";
        is $match<ABC::bar><element>.map(~*), "g>e c g e c e/ f/ g/ e/", "Each element was matched";
        is $match<ABC::bar><barline>, "|", "Barline was matched";
    }

I think that's a suitably elegant test.

There's only one problem here. Real ABC allows for spaces between elements, which makes the ABC format much more legible. (I would like this code in the long run to be able to take an ABC tune, process it, and write it back out again, so ideally the code needs to carefully note the whitespace so it can be output again later.) But when I try to add a <ws>+ alternative to the element regex and then parse a bar of music with a space or two in it, it appears to just hang.

This one has me utterly stumped; do any of the Perl 6 regex experts out there have a clue what could be going on?

Update: Still unable to make <ws> work for me, but had great luck with just using good old \s. So the code works now, but I'd still love an explanation...

Update to the Update: Argh, bar works but line_of_music still hangs. Dang it, I feel like I'm sooooooo close to getting this thing working....

Update^3: Ah, I simply had a completely broken line_of_music. I've got a simpler one working now.
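
One possible explanation for the <ws> trouble, offered as a guess rather than a diagnosis: <ws> is allowed to succeed without consuming any characters, so a <ws>+ alternative inside <element>+ can keep "matching" at the same position. The sketch below only demonstrates the zero-width behaviour on a current Rakudo; it does not reproduce the hang.

```raku
# <ws> can succeed with nothing to consume; \s+ needs at least one character.
say so 'a|' ~~ / a <ws> '|' /;    # matches even though there is no whitespace
say so 'a|' ~~ / a \s+ '|' /;     # does not match

# The captured <ws> really is empty.
my $m = 'a|' ~~ / a <ws> /;
say $m<ws>.chars;
```

That would also fit with plain \s behaving itself, since it always has to consume one character.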

Friday, December 18, 2009

Moritz++

Going over the #perl6 backlog today, I noticed that Moritz had two suggestions for me based on my last post. The first was a nice suggestion to use given in my tests:
    {
        my $match = "d'+p+<<<+accent+_B" ~~ m/ <ABC::broken_rhythm> /;
        isa_ok $match, Match, '"d\'+p+<<<+accent+_B" is a broken rhythm';
        given $match<ABC::broken_rhythm>
        {
            is .<note>[0]<pitch><basenote>, "d", 'first note is d';
            is .<note>[0]<pitch><octave>, "'", 'first note has an octave tick';
            is .<note>[0]<pitch><accidental>, "", 'first note has no accidental';
            is .<note>[0]<note_length>, "", 'first note has no length';
            is .<g1>[0], "+p+", 'first gracing is +p+';
            is .<broken_rhythm_bracket>, "<<<", 'angle is <<<';
            is .<g2>[0], "+accent+", 'second gracing is +accent+';
            is .<note>[1]<pitch><basenote>, "B", 'second note is B';
            is .<note>[1]<pitch><octave>, "", 'second note has no octave';
            is .<note>[1]<pitch><accidental>, "_", 'second note is flat';
            is .<note>[1]<note_length>, "", 'second note has no length';
        }
    }

This is far from the amazing new testing form I was hoping to find, but it sure is a big improvement over what I had. If I don't think of something better, I will go back and redo all of the longer test cases this way.
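
The trick is simply that given sets the topic, so a leading dot means "on the topic". A stripped-down sketch, using a made-up hash in place of the match object and assuming a current Rakudo:

```raku
# Made-up stand-in for the match; the real thing is a Match, not a Hash.
my %match = broken_rhythm => { bracket => '<<<', note => ['d', 'B'] };

my ($bracket, $second);
given %match<broken_rhythm> {
    $bracket = .<bracket>;    # same as $_<bracket>
    $second  = .<note>[1];    # same as $_<note>[1]
}

say "$bracket $second";
```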

He also said about the barline regex, "Looks like it would need LTM to work." I believe he's talking about longest-token matching, but I don't fully understand the issues, I fear. I did finally write tests for this case:
    for ':|:', '|:', '|', ':|', '::'
    {
        my $match = $_ ~~ m/ <ABC::barline> /;
        isa_ok $match, Match, "barline $_ recognized";
        is $match<ABC::barline>, $_, "barline $_ is correct";
    }

And they didn't work -- sometimes it would match the first barline it recognized rather than the longest. Moritz also suggested, "Reordering the longest alternatives to the front would help," so I did, and then all the tests passed. I hope that means the problems are actually gone, and not just that I've managed to hide them for the moment.
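
Here is a small sketch of the ordering issue, written for a current Rakudo. There, || takes the first alternative that matches, while | is specified to take the longest; the behaviour I saw with | in my 2009 build looks like what || does here.

```raku
# || tries alternatives left to right and stops at the first success.
my $first-wins   = ~(':|:' ~~ / ':|' || ':|:' /);
my $longer-first = ~(':|:' ~~ / ':|:' || ':|' /);

# | prefers the longest alternative, whatever order they are written in.
my $longest      = ~(':|:' ~~ / '|' | ':|' | ':|:' /);

say $first-wins;      # :|
say $longer-first;    # :|:
say $longest;         # :|:
```

So putting the longest alternatives first is a fix that works under either rule, which is reassuring.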

Anyway, big kudos for Moritz! Thanks to him, it definitely feels like I am getting closer here. Now if I can just figure out the proper way to ask for an entire line of ABC music in Perl 6, I will be there!

Thursday, December 17, 2009

Now With Grammar And Tests

I've made a huge amount of progress with the ABC project in the last 36 hours. At this point I think I've just got a few more rules to write and debug before we are able to completely parse the sample ABC tune I posted several days ago. (Naturally they'll be the trickiest, I imagine.)
    grammar ABC
    {
        regex header_field_name { \w }
        regex header_field_data { \N* }
        regex header_field { ^^ <header_field_name> ':' \s* <header_field_data> $$ }
        regex header { [<header_field> \n]+ }

        regex basenote { <[a..g]+[A..G]> }
        regex octave { \'+ | \,+ }
        regex accidental { '^' | '^^' | '_' | '__' | '=' }
        regex pitch { <accidental>? <basenote> <octave>? }

        regex tie { '-' }
        regex note_length { [\d* ['/' \d*]? ] | '/' }
        regex note { <pitch> <note_length>? <tie>? }

        regex rest_type { <[x..z]> }
        regex rest { <rest_type> <note_length>? }

        regex gracing { '+' <alpha>+ '+' }

        regex broken_rhythm_bracket { ['<'+ | '>'+] }
        regex broken_rhythm { <note> <g1=gracing>* <broken_rhythm_bracket> <g2=gracing>* <note> }

        regex element { <note> | <broken_rhythm> | <rest> | <gracing> }

        regex barline { '|' | ':|' | '|:' | ':|:' | '::' }

        regex line_of_music { <barline> | [<barline>? <element>+ [<barline> <element>+]* <barline>?] }
    }

Much, much nicer than just having "abc_" at the beginning of every regex name. And wow, compared to any other parsing tool I've ever used, this is really, really easy. This comes very close to matching the ABC BNF, though I've simplified a lot, and changed !trill! to +trill+ (etc) to match the version of ABC present in this file.

So far the only downside I've found is that it is ugly to test:
    {
        my $match = "d'+p+<<<+accent+_B" ~~ m/ <ABC::broken_rhythm> /;
        isa_ok $match, Match, '"d\'+p+<<<+accent+_B" is a broken rhythm';
        is $match<ABC::broken_rhythm><note>[0]<pitch><basenote>, "d", 'first note is d';
        is $match<ABC::broken_rhythm><note>[0]<pitch><octave>, "'", 'first note has an octave tick';
        is $match<ABC::broken_rhythm><note>[0]<pitch><accidental>, "", 'first note has no accidental';
        is $match<ABC::broken_rhythm><note>[0]<note_length>, "", 'first note has no length';
        is $match<ABC::broken_rhythm><g1>[0], "+p+", 'first gracing is +p+';
        is $match<ABC::broken_rhythm><broken_rhythm_bracket>, "<<<", 'angle is <<<';
        is $match<ABC::broken_rhythm><g2>[0], "+accent+", 'second gracing is +accent+';
        is $match<ABC::broken_rhythm><note>[1]<pitch><basenote>, "B", 'second note is B';
        is $match<ABC::broken_rhythm><note>[1]<pitch><octave>, "", 'second note has no octave';
        is $match<ABC::broken_rhythm><note>[1]<pitch><accidental>, "_", 'second note is flat';
        is $match<ABC::broken_rhythm><note>[1]<note_length>, "", 'second note has no length';
    }

On the plus side, this does show how to get at the parsed bits. On the downside, it's not really good at testing what is not present in the match, and it seems like any refactoring to the grammar will lead to massive changes in the tests. I'm guessing there will be a better way of testing this in the future... or there already is and I just don't know about it.

At this point, it seems to me the biggest obstacle is figuring out how to formulate line_of_music so that it actually returns its results in a usable manner. The thing is, the interleaved order of the barlines and the elements is very important to make sense of the music. The way I'm doing it now will return an array of barlines and an array of elements, with no idea how those two arrays interact....
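
One way to keep the interleaving might be to group each run of elements with the barline that closes it under an intermediate rule, so that every group carries its own elements and barline. The toy grammar below is only a sketch of that idea for a current Rakudo: MiniABC and its one-letter element rule are made up for the example and are far simpler than the real grammar.

```raku
# Toy grammar: each bar keeps its own elements and its own optional barline.
grammar MiniABC {
    regex TOP     { <bar>+ }
    regex bar     { <element>+ <barline>? }
    regex element { <[a..g A..G]> }
    regex barline { '|' }
}

my $match = MiniABC.parse('ab|cd|e');

# Walk the bars in order; the barline stays attached to the elements before it.
my @bars = $match<bar>.map({
    $_<element>.join(' ') ~ ' ' ~ ($_<barline> // '(none)')
});

.say for @bars;
```

With the input above this yields one entry per bar, in order, and the last bar reports that it has no closing barline.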

Ack: Forgot to include mention of the word Perl here so it would get picked up by Ironman.