Wednesday, December 16, 2009

Regexes are Wow

I've just spent the last little bit playing around with adding more regexes, and I think I'm in love. Perl 6 regexes feel both easier to use than Perl 5 regular expressions and vastly more powerful. Here's the latest batch I just added:

regex abc_basenote { <[a..g]+[A..G]> }
regex abc_octave { \'+ | \,+ }
regex abc_accidental { '^' | '^^' | '_' | '__' | '=' }
regex abc_pitch { <abc_accidental>? <abc_basenote> <abc_octave>? }

regex abc_tie { '-' }
regex abc_note_length { [\d* ['/' \d*]? ] | '/' }
regex abc_note { <abc_pitch> <abc_note_length>? <abc_tie>? }

The only difficulty I had here was not realizing at first that abc_basenote needed angle brackets around the character class specification. (In Perl 6, bare square brackets are just a non-capturing group; a character class is written <[...]>.) That cost me a few minutes of confusion: I had jumped straight to (informally) testing the more complex regexes built on top of it, so it wasn't immediately obvious that the base note regex itself was simply not working.

But hey, I know how that works now, and wow! The rest of it was dead easy and expressive to boot. And using .perl to dump the match structure is awesome.
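
For anyone who wants to try the .perl trick, here is a minimal sketch of dumping a match. It assumes a current Rakudo, where a named regex outside a grammar has to be declared with my. The shortened names and the input string are made up for illustration, and newer releases also spell .perl as .raku.

```raku
# Cut-down copies of the pitch regexes, declared lexically with `my`
# (names shortened for this sketch only).
my regex basenote   { <[a..g]+[A..G]> }
my regex accidental { '^' | '^^' | '_' | '__' | '=' }
my regex pitch      { <accidental>? <basenote> }

# Anchor at both ends so the whole string has to be a pitch.
if "^c" ~~ / ^ <pitch> $ / {
    say $<pitch>.perl;             # dump the nested match structure
    say ~$<pitch><accidental>;     # stringify one named sub-match
    say ~$<pitch><basenote>;
}
```

The named sub-matches nest the same way the regexes do, which is why the dump is so handy for spotting which piece failed to match.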

There are two things to ponder here, however. First, having abc_ at the beginning of every regex is an obvious wart. I think I can get around that by putting these in a grammar? Must investigate.
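
A minimal sketch of what the grammar version could look like, assuming a current Rakudo: inside a grammar the regexes get their own namespace, so the prefix can go. The grammar name ABC and the sample input are placeholders, and only the pitch-related regexes are shown; the rest would presumably move across the same way.

```raku
# Hypothetical grammar name; the regexes are the pitch ones from above
# with the abc_ prefix dropped.
grammar ABC {
    regex basenote   { <[a..g]+[A..G]> }
    regex octave     { \'+ | \,+ }
    regex accidental { '^' | '^^' | '_' | '__' | '=' }
    regex pitch      { <accidental>? <basenote> <octave>? }
}

# .parse has to match the whole string; :rule picks the starting regex.
my $match = ABC.parse("_B,", :rule<pitch>);
say $match ?? "parsed: $match" !! "no match";
say ~$match<basenote> if $match;
```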

The second is that at this point, I've got enough structure to what I'm doing that I really need a test suite. I suppose if I package the regexes in a grammar, then I can use it from another file, and test that? Must experiment.
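
As a rough idea of what such a test file might contain, here is a sketch using the standard Test module on a current Rakudo. The grammar is defined inline only to keep the example self-contained; in a real layout it would sit in its own module file and be loaded with use. The grammar name ABC and the test strings are assumptions, not anything from the project.

```raku
use Test;

# Hypothetical stand-in for a grammar that would normally be loaded
# from a separate module file.
grammar ABC {
    regex basenote   { <[a..g]+[A..G]> }
    regex octave     { \'+ | \,+ }
    regex accidental { '^' | '^^' | '_' | '__' | '=' }
    regex pitch      { <accidental>? <basenote> <octave>? }
}

# A pitch with every optional part present.
my $match = ABC.parse("^^G,,", :rule<pitch>);
ok $match, 'double-sharp G with two octave marks parses as a pitch';
is ~$match<accidental>, '^^', 'accidental captured';
is ~$match<basenote>,   'G',  'base note captured';
is ~$match<octave>,     ',,', 'octave marks captured';

# A string that should be rejected outright.
nok ABC.parse("h", :rule<pitch>), '"h" is not a valid base note';

done-testing;
```

Stringifying each sub-match with ~ before comparing keeps the assertions short, at the cost of checking only the matched text rather than the full match structure.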


  1. Yes, you should put them in a grammar. Yes, you should write tests. (Preferably first, but it's up to you.) Yes, you'll have an easier time testing the regexes if you package things in a grammar.

    By the way, if regexes and grammars interest you, I hope you'll like my blog post series "The 7 wonders of the ancient grammar engine", which I'll be starting to post soonish.

  2. But without knowing anything about how grammars work, it's impossible to formulate tests!

    I've done it in the opposite order; I've got a simple grammar up and running now, and will start adding tests next time I get tuits to work on the project.

    Though any hints you have on testing grammars would certainly be welcome...

  3. masak: I've pushed the first simple tests to github. If you have any suggestions on how to improve these, I'd love to know. Testing matches seems infinitely more clunky than the matches themselves...