Oniguruma is a C-based regular expression engine that is starting to get
some attention. Its key selling points are its speed and its ability to
match string content in arbitrary encodings. It will be the default
regex engine in Ruby 1.9.

JRuby 1.1 will ship with a port of Oniguruma dubbed "Joni". For us, the
benefit is that we'll finally have a fast regex engine that works
directly on Ruby's encoding-free, byte[]-based strings, where before we
had to convert to and from char[] for every regex engine we used. We
expect to see great gains in regex performance in JRuby 1.1 when we
release the final version in the Decemberish timeframe.

But it has occurred to me that there could be an even more interesting
use of Joni: as a regex engine that accepts NIO ByteBuffers directly.
Because it just walks byte[], no decoding is necessary. Because it's
encoding-agnostic, arbitrary byte content can be matched. So in theory
it could easily be adapted into a fast regex engine for NIO ByteBuffers.
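
To make the idea a bit more concrete, here is a rough sketch of what
matching over a ByteBuffer without decoding could look like. It is not
Joni's API and it is not a regex engine: it only finds a literal byte
sequence, and the class and method names (ByteBufferScan, indexOf) are
made up for illustration. The point is just that the scan reads raw
bytes with absolute get(int) calls, so it never builds a char[] or a
String and works on direct buffers too.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; this is not Joni's API.
final class ByteBufferScan {

    // Returns the absolute index of the first occurrence of pattern
    // between buf.position() and buf.limit(), or -1 if there is none.
    // Only absolute get(int) is used, so the buffer's position and
    // limit are left untouched and no bytes are decoded or copied.
    static int indexOf(ByteBuffer buf, byte[] pattern) {
        int last = buf.limit() - pattern.length;
        for (int i = buf.position(); i <= last; i++) {
            int j = 0;
            while (j < pattern.length && buf.get(i + j) == pattern[j]) {
                j++;
            }
            if (j == pattern.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // A direct buffer, as a stand-in for bytes read from a channel.
        ByteBuffer buf = ByteBuffer.allocateDirect(64);
        buf.put("GET /index.html HTTP/1.1".getBytes(StandardCharsets.US_ASCII));
        buf.flip();

        byte[] pattern = "HTTP".getBytes(StandardCharsets.US_ASCII);
        System.out.println(indexOf(buf, pattern)); // prints 16
    }
}
```

A real adaptation would swap the literal comparison for the regex
engine's own matching loop; the sketch only shows the buffer access
pattern I have in mind.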

Would there be interest in such a thing? I'm sure there are other
NIO-related lists that would be appropriate, but Grizzly is the first
actual project that springs to mind when I think of NIO, so I thought
I'd toss it out there.

- Charlie