users@jersey.java.net

[Jersey] Charset question

From: Simon Roberts <simon_at_dancingcloudphotography.com>
Date: Thu, 2 Apr 2015 08:26:38 -0600

Many thanks Craig and Mark, that's very helpful. (And apologies for the
double post: I first posted from the wrong, i.e. non-registered, email
address and thought it bounced. Turns out the kind moderators must have
seen a "relevant" question and forwarded it on anyway.)

I guess this leaves one more follow-up, which in a sense doesn't belong on
this forum, but perhaps someone can answer it.

On Mark's prompting, I went and looked at the specification for JSON:

http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-404.pdf

and it calls out:

  "...JSON text is a sequence of Unicode code points..."

then gives a link to the Unicode standard. I poked around a bit, trying to
find out if this is the same thing as UTF-8, and that's not clear to me.
The best I have found so far is a Wikipedia entry that says:

   "UTF-8 ... is a character encoding capable of encoding all possible
characters (called code points) in Unicode."

and goes on to suggest there are others. So I suspect that I should do
what Craig indicates (i.e. state explicitly that I'm using UTF-8 for my
Unicode encoding), but should I also take it that, for example, ISO-8859-4
would not be a valid transfer format?
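
To make the code point versus encoding distinction concrete, here is a
minimal sketch using only the Java standard library. The class and helper
names (EncodingSketch, encodedLength, roundTrips) and the sample JSON
string are made up for illustration, and it assumes the JDK in use ships
the ISO-8859-4 charset. It encodes the same sequence of code points three
ways: the two UTF encodings preserve it, while ISO-8859-4 cannot represent
the non-Latin characters and substitutes them on encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper: number of bytes the text occupies in a given encoding.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Hypothetical helper: true if encoding and then decoding returns the same text.
    static boolean roundTrips(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset).equals(text);
    }

    public static void main(String[] args) {
        // Sample JSON text whose value holds two CJK characters (written as escapes
        // so the source file itself stays plain ASCII).
        String json = "{\"name\":\"\u65e5\u672c\"}";

        // Assumption: ISO-8859-4 is available in this JDK; forName throws otherwise.
        Charset latin4 = Charset.forName("ISO-8859-4");

        Charset[] charsets = { StandardCharsets.UTF_8, StandardCharsets.UTF_16BE, latin4 };
        for (Charset cs : charsets) {
            System.out.println(cs.name()
                    + ": " + encodedLength(json, cs) + " bytes, round trip ok = "
                    + roundTrips(json, cs));
        }
    }
}
```

The same code points come out as different byte sequences depending on the
charset, which is why stating the charset explicitly in the Content-Type
removes the guesswork for the receiver.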

My guess, especially since I now see that UTF-8 is claimed to account for
more than about 65% of all web traffic anyway, is that this isn't likely to
cause any problems, but if anyone has more precise info w.r.t.
specifications, I'd still be pleased to be fully informed.

Thanks again Craig and Mark,
Cheers,
Simon


>
> From: Mark Thornton
>
[...]


> My understanding is that JSON is always UTF-8 by definition.
>


> On Wednesday, 1 April 2015, Craig McClanahan
>


> What I do for this is use a produces annotation like this on the server:
>>
>> @Produces("application/json;charset=UTF-8")
>>
>