users@jersey.java.net

RE: [Jersey] Multipart Form And Encoding

From: Jordi Domingo <noseya_at_hotmail.com>
Date: Mon, 2 Nov 2009 15:51:26 +0100

I investigated a little further and can confirm that the browser is correctly encoding in UTF-8.

The whole problem is on the server side. Maybe multipart is not calling request.setCharacterEncoding?

Thanks,

Jordi

From: noseya_at_hotmail.com
To: users_at_jersey.dev.java.net
Date: Mon, 2 Nov 2009 15:25:21 +0100
Subject: RE: [Jersey] Multipart Form And Encoding








To be honest, I don't know where this comes from:

 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

My page is fully UTF-8

HTTP/1.1 200 OK
X-Powered-By: JSP/2.1
Server: GlassFish v3
Pragma: No-cache
Cache-Control: no-cache
Expires: Thu, 01 Jan 1970 01:00:00 CET
Content-Type: text/html;charset=UTF-8
Content-Language: ca
Date: Mon, 02 Nov 2009 14:21:42 GMT
Content-Length: 5753


<!doctype html>
<html>
    <head>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

....
<form action="/projects/1/vulnerabilities/29/evidences/9/attachments" class="formulari" id="formulari" method="post" enctype="multipart/form-data" accept-charset="UTF-8">


I hate browsers :S
Date: Mon, 2 Nov 2009 14:47:56 +0100
From: Paul.Sandoz_at_Sun.COM
To: users_at_jersey.dev.java.net
Subject: Re: [Jersey] Multipart Form And Encoding


On Nov 2, 2009, at 2:31 PM, Jordi Domingo wrote:

Hi Paul,

My clients are ordinary browsers right now. How can I make them declare the Content-Type for each part?

I do not know. If the Content-Type is absent then it means a content type of "text/plain". But under those conditions (unless out of band information is utilized) one has to assume the content is encoded in UTF-8.
The only clue that ISO-8859-1 is utilized is the following in the logged output of the POST request:
  Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
:-(
Paul.
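
The fallback rule described above (use the charset parameter of the part's Content-Type if present, otherwise assume UTF-8) can be sketched in plain Java. This is only an illustration under stated assumptions, not Jersey's actual code: the class name PartCharset and the methods resolve and decode are hypothetical, and the parameter parsing is deliberately simplistic.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class PartCharset {
    // Hypothetical helper: choose the charset for a body part from its
    // Content-Type header value. Falls back to UTF-8 when the header or
    // its charset parameter is absent. Charset.forName throws for names
    // the JVM does not know.
    public static Charset resolve(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            // Case-insensitive match on the "charset=" prefix.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                return Charset.forName(name);
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Decode the raw bytes of a part using the resolved charset.
    public static String decode(byte[] body, String contentType) {
        return new String(body, resolve(contentType));
    }

    public static void main(String[] args) {
        // The characters from the original question, written as escapes so
        // the result does not depend on the source file encoding.
        String text = "\u00e0\u00e1\u00e8\u00e9\u00ec\u00ed";
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        // No Content-Type on the part: UTF-8 is assumed.
        System.out.println(text.equals(decode(utf8, null)));
        // Charset declared on the part: the declared charset is used.
        System.out.println(text.equals(decode(latin1, "text/plain;charset=ISO-8859-1")));
        // ISO-8859-1 bytes without a declared charset are decoded wrongly.
        System.out.println(text.equals(decode(latin1, "text/plain")));
    }
}
```

Under these assumptions the program prints true, true, false: an undeclared ISO-8859-1 part cannot be recovered by the UTF-8 fallback, which is why the charset has to be declared on the part itself.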

Thanks,

Jordi

> Date: Mon, 2 Nov 2009 13:40:38 +0100
> From: Paul.Sandoz_at_Sun.COM
> To: users_at_jersey.dev.java.net
> Subject: Re: [Jersey] Multipart Form And Encoding
>
>
> On Nov 2, 2009, at 12:04 PM, Paul Sandoz wrote:
>
> > Hi Jordi,
> >
> > I have verified there is a bug in the FormDataBodyPart.getValue
> > method. It is ignoring the charset parameter (if present) on the
> > media type.
> >
> > In fact this method is doing its own string parsing when it should
> > be deferring to the string reader provided by the message body reader.
> >
>
> I fixed the above.
>
> So if your client is sending multipart/form-data content with the
> ISO-8859-1 charset declared appropriately in the Content-Type of
> the relevant body parts, it should work.
>
> If you are using Jersey's FormDataMultiPart on the client side you
> will have to do something like:
>
> new FormDataMultiPart().field("name", "value",
> MediaType.valueOf("text/plain;charset=ISO-8859-1"));
>
> otherwise if you use the "field" method with two String parameters
> then UTF-8 will be used to encode the String characters. And note that
> if the charset parameter is absent when decoding then UTF-8 will be
> used to decode to String characters.
>
> Paul.
>
> > Can you:
> >
> > 1) verify there is a charset parameter present on the Content-Type
> > of the body part identified as "name"; and
> >
> > 2) try doing the following and seeing if that works (which depends
> > on a charset parameter with a value of "ISO-8859-1" being present)
> >
> > multiPart.getField("name").getValueAs(String.class);
> >
> > Paul.
> >
> >
> >
> > On Nov 2, 2009, at 11:49 AM, Jordi Domingo wrote:
> >
> >> Hi,
> >>
> >> I'm having some problems with character encoding using the multipart API. I've
> >> been able to work around this on my PC using:
> >>
> >> String name = multiPart.getField("name").getValue();
> >> name = new String(name.getBytes("ISO-8859-1"), "UTF-8");
> >>
> >> But in production it does not work. Characters like àáèéìí ... are
> >> shown as ??
> >>
> >> Any help is appreciated. Thanks,
> >>
> >> Jordi
> >>
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe_at_jersey.dev.java.net
> > For additional commands, e-mail: users-help_at_jersey.dev.java.net
> >
>
>
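
A plain-Java sketch of why the re-encoding workaround from the original question can succeed on one machine and fail on another. It rests on an assumption the thread does not confirm: that the failing environment first decoded the bytes with a charset that cannot represent them (US-ASCII is used here purely as a stand-in). The class name RoundTrip and the method misdecodeThenRepair are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RoundTrip {
    // Decode UTF-8 bytes with the wrong charset, then apply the workaround
    // from the thread: re-encode as ISO-8859-1 and decode again as UTF-8.
    public static String misdecodeThenRepair(byte[] utf8Bytes, Charset wrong) {
        String garbled = new String(utf8Bytes, wrong);
        byte[] reencoded = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(reencoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The characters from the question, written as escapes so the
        // result does not depend on the source file encoding.
        String original = "\u00e0\u00e1\u00e8\u00e9\u00ec\u00ed";
        byte[] sent = original.getBytes(StandardCharsets.UTF_8);

        // ISO-8859-1 maps every byte to a character, so the original bytes
        // survive the round trip and the workaround recovers the text.
        String viaLatin1 = misdecodeThenRepair(sent, StandardCharsets.ISO_8859_1);
        System.out.println(original.equals(viaLatin1));

        // US-ASCII cannot represent bytes above 0x7F. They become U+FFFD
        // when decoded and '?' when re-encoded, so the text is lost.
        String viaAscii = misdecodeThenRepair(sent, StandardCharsets.US_ASCII);
        System.out.println(original.equals(viaAscii));
    }
}
```

This prints true and then false. In the second case every two-byte UTF-8 character ends up as two question marks, which is at least consistent with the "??" symptom reported above. The workaround is therefore fragile: it only helps when the earlier, wrong decoding step happened to be lossless.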
