Paul Sandoz wrote:
> There are definitely some bugs (well, missing features really) with
> respect to the Encoder class because it does not support full UTF-8
> encoding of all possible code points (especially high and low
> surrogates). It may be best to use NIO here, or we can copy the code from
> FI, which I optimized specifically for UTF-8. I can fix this if you want.
Ah, that's right. Surrogates.
I used to think that I was pretty familiar with all the nitty-gritty
details of encodings, charsets, and all that stuff. And now look at me...
I thought the encoding code was one of the hotspots, so I assumed
inlining it manually would be worthwhile (as opposed to using the NIO
encoder). JIBX also had similar code inlined, so that was another
motivation.
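
To make the surrogate issue concrete, here is a minimal sketch of what a
hand-inlined UTF-8 encoder needs to do to cover surrogate pairs. This is
not the actual Encoder class or the FI code; the class name Utf8Sketch
and the choice to throw on an unpaired surrogate are assumptions made
purely for illustration.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not the real Encoder or the FI code.
public class Utf8Sketch {
    /** Encodes UTF-16 text as UTF-8, combining surrogate pairs into 4-byte sequences. */
    public static byte[] encode(CharSequence s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                // 1 byte: U+0000..U+007F
                out.write(c);
            } else if (c < 0x800) {
                // 2 bytes: U+0080..U+07FF
                out.write(0xC0 | (c >> 6));
                out.write(0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    throw new IllegalArgumentException("unpaired high surrogate at index " + i);
                }
                // Combine the pair into one code point (U+10000..U+10FFFF): 4 bytes.
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is malformed.
                throw new IllegalArgumentException("unpaired low surrogate at index " + i);
            } else {
                // 3 bytes: the rest of the BMP, U+0800..U+FFFF minus surrogates
                out.write(0xE0 | (c >> 12));
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            }
        }
        return out.toByteArray();
    }
}
```

For instance, Utf8Sketch.encode("\uD83D\uDE00") (U+1F600) yields the four
bytes F0 9F 98 80, whereas an encoder that treats each char on its own
would emit two 3-byte sequences, which is not valid UTF-8.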
If there's something you can copy over very quickly, that would be great.
Otherwise I can fix it myself.
--
Kohsuke Kawaguchi
Sun Microsystems kohsuke.kawaguchi_at_sun.com