users@jersey.java.net

Re: [Jersey] aleluia ! charset problem solved

From: Felipe Gaúcho <fgaucho_at_gmail.com>
Date: Tue, 1 Sep 2009 21:18:42 +0200

just a detail:

I am using JPA, but I am also using a script to populate MySQL
with my test data.. and in that script I was forced to set utf8..
just that..

* I suppose if I insert data using the JPA entities, I would not
suffer such problems.. but I still need to confirm that..

from now on I prefer to always force utf8 and get rid of any
problems.. (now wondering about the poor guys speaking languages not
fully covered by UTF-8.. they will face the same Jersey problems I was
trying to fix...)
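
To make the mismatch concrete, here is a minimal Java sketch of what I
believe happened to my data. It is only an illustration, not what MySQL
or Jersey do internally: the class and method names are made up, and
ISO-8859-1 stands in for MySQL's "latin1".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetMismatch {

    // Hypothetical helper: encode text with one charset and decode the
    // bytes with another, the way two sides of a connection disagree
    // when one assumes UTF-8 and the other latin1.
    static String transcode(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "Gaúcho", written with an escape so the source file encoding
        // does not matter.
        String name = "Ga\u00facho";

        // Same charset on both sides: the round trip is lossless.
        String ok = transcode(name,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // UTF-8 bytes read back as latin1: the two bytes that encode
        // U+00FA are decoded as two separate characters.
        String garbled = transcode(name,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println(ok.equals(name));      // true
        System.out.println(garbled.equals(name)); // false
        System.out.println(garbled.length());     // 7 instead of 6
    }
}
```

Plain ASCII text survives this unchanged, which is probably why the
problem only shows up with accented characters.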



On Tue, Sep 1, 2009 at 8:39 PM, Alex Sherwin<alex.sherwin_at_acadiasoft.com> wrote:
> You may have noticed that with the default MySQL encoding and collation,
> string comparisons are case-insensitive.  The default collation for the
> MySQL latin1 encoding is latin1_swedish_ci (ci = case insensitive).  You can
> see all the collations MySQL supports with "show collation;".
> If you stick with latin1 you can use latin1_general_cs.  Unfortunately,
> though, if you run:
>
> show collation where collation like '%utf8%';
>
> you will see (assuming you are using a pre-compiled MySQL distribution)
> that there are no case-sensitive utf8 collations compiled into MySQL by
> default.  However, you can use utf8_bin, which goes a step further than
> simply enforcing case: it does a full binary comparison of the
> characters (as opposed to a "_cs" or "_ci" collation, which may treat
> distinct characters with the same meaning, such as accented variants,
> as equal).
>
> Just some food for thought for you, while you're re-defining your character
> sets in the DB
>
>
> Tatu Saloranta wrote:
>>
>> 2009/9/1 Felipe Gaúcho <fgaucho_at_gmail.com>:
>>
>>>
>>> Finally, after half an hour digging through the MySQL manuals, I
>>> figured out my local problem..
>>>
>>> I am using a script to populate the database, and by default MySQL
>>> applies "latin1" as charset .. :(
>>>
>>> since all my Jersey code is UTF-8 - and Jersey Responses assume UTF-8
>>> by default,
>>>
>>> utf-8 != latin1 :(((
>>>
>>
>> Glad you resolved this. However, I'm not sure I see why this was a
>> problem -- if you used Java strings in between, transcoding should
>> work correctly. That is, transfer encoding (for Jersey
>> requests/responses) should not directly affect persistence format
>> encoding (for DB).
>> Or are you directly storing binary response/request objects somehow?
>>
>> But then again, DBs are notoriously vague with their encoding
>> definitions, using defaults that are not quite obvious from outside;
>> and have been known to cause problems in unexpected places. :-)
>>
>> -+ Tatu +-
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe_at_jersey.dev.java.net
>> For additional commands, e-mail: users-help_at_jersey.dev.java.net
>>
>>
>>
>>
>



-- 
Looking for a client application for this service:
http://fgaucho.dyndns.org:8080/footprint-service/wadl