users@jersey.java.net

Re: [Jersey] FormDataMultiPart UTF-8 issue

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Wed, 20 Jan 2010 12:10:20 +0000

Hi Geoff,

What version of Jersey are you using?

I cannot reproduce this. Can you provide the exact code you are using or a
reproducible test case?

Paul.

On Jan 20, 2010, at 7:07 AM, geoffrey hendrey wrote:

> I am using FormDataMultiPart to post a String. The string consists
> of 3 hiragana characters for the word てすと (te-su-to)
>
> However, when I call
> FormDataMultiPart.get("theWord").getValue().toString().length(), the
> result is 9 when I expect the result to be 3 (because there are
> three characters).
>
> The UTF-8 encoding of these 3 characters is 9 bytes long. The
> observed length (9) would be explained if Jersey were decoding each
> byte as a separate character, instead of interpreting each 3-byte
> UTF-8 sequence as one character.
>
> Anyone know how to properly receive UTF-8 characters in Jersey, from
> FormDataMultiPart?
>
> -geoff
>
> --
> http://nextdb.net - RESTful Relational Database
> http://www.nextdb.net/wiki/en/REST
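
The 9-versus-3 discrepancy described in the quoted message can be
illustrated without Jersey at all. The following is only a minimal sketch
in plain Java (the class name Utf8LengthDemo and the helper decodedLength
are made up for illustration, and it assumes Java 7 or later for
StandardCharsets). It does not show what Jersey does internally; it only
shows that decoding the same 9 bytes with a single-byte charset such as
ISO-8859-1 gives a String of length 9, while decoding them as UTF-8 gives
length 3.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Utf8LengthDemo {

    // Hypothetical helper: decode the bytes with the given charset and
    // return the length of the resulting String in UTF-16 code units.
    static int decodedLength(byte[] bytes, Charset charset) {
        return new String(bytes, charset).length();
    }

    public static void main(String[] args) {
        // The word from the original post, written with Unicode escapes
        // so the source file's own encoding cannot affect the result.
        String word = "\u3066\u3059\u3068"; // te-su-to

        // Each of these characters takes 3 bytes in UTF-8.
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        System.out.println("bytes in UTF-8:        " + utf8.length);
        System.out.println("decoded as UTF-8:      "
                + decodedLength(utf8, StandardCharsets.UTF_8));
        System.out.println("decoded as ISO-8859-1: "
                + decodedLength(utf8, StandardCharsets.ISO_8859_1));
    }
}
```

If the length seen on the server matches the ISO-8859-1 case, one thing
worth checking is whether the body part's Content-Type carries a charset
parameter, since a part sent without one may be decoded with a default
charset rather than UTF-8.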