users@jersey.java.net

[Jersey] Re: Jersey truncating the slashes from the uploaded file name

From: Paul Sandoz <Paul.Sandoz_at_oracle.com>
Date: Thu, 3 Feb 2011 13:12:35 +0100

On Feb 3, 2011, at 12:51 PM, ManiKanta G wrote:

>
> Hi,
>
> Below is the Fiddler log for the upload request:
>
> POST http://localhost:9998/ex HTTP/1.1
> Accept: image/gif, image/jpeg, image/pjpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/xaml+xml, application/x-ms-xbap, application/x-ms-application, */*
> Accept-Language: en-us
> User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET4.0C; .NET4.0E)
> Content-Type: multipart/form-data; boundary=---------------------------7db3c8365097e
> Accept-Encoding: gzip, deflate
> Connection: Keep-Alive
> Content-Length: 522
> Host: localhost:9998
> Pragma: no-cache
>
> -----------------------------7db3c8365097e
> Content-Disposition: form-data; name="file"; filename="C:\Documents and Settings\User1\Desktop\TODO.txt"
> Content-Type: text/plain
>
> <File content goes here...>
> -----------------------------7db3c8365097e--
>
> But printing disposition.getFileName() in the resource gives:
>
> C:Documents and SettingsUser1DesktopTODO.txt
>
> I've tested in both IE8 & IE7 and the same behavior is observed.
>

Thanks, that is much clearer.

I am reusing the HTTP header parsing code to parse Content-Disposition
headers. The '\' character has special meaning within a quoted-string:

   http://greenbytes.de/tech/webdav/rfc2616.html#basic.rules.quoted-string

     A string of text is parsed as a single word if it is quoted
     using double-quote marks.

         quoted-string = ( <"> *(qdtext | quoted-pair ) <"> )
         qdtext = <any TEXT except <">>

     The backslash character ("\") MAY be used as a single-character
     quoting mechanism only within quoted-string and comment constructs.

         quoted-pair = "\" CHAR

and that meaning is the same in RFC 822, whose notation RFC 2183
(Content-Disposition) uses:

   http://www.ietf.org/rfc/rfc2183.txt
   http://www.ietf.org/rfc/rfc822.txt

     3.4.4. DELIMITING AND QUOTING CHARACTERS

         The quote character (backslash) and characters that delimit
         syntactic units are not, generally, to be taken as data that
         are part of the delimited or quoted unit(s). In particular,
         the quotation-marks that define a quoted-string, the
         parentheses that define a comment and the backslash that
         quotes a following character are NOT part of the quoted-
         string, comment or quoted character. A quotation-mark that is
         to be part of a quoted-string, a parenthesis that is to be
         part of a comment and a backslash that is to be part of either
         must each be preceded by the quote-character backslash ("\").
         Note that the syntax allows any character to be quoted within
         a quoted-string or comment; however only certain characters
         MUST be quoted to be included as data. These characters are
         the ones that are not part of the alternate text group (i.e.,
         ctext or qtext).
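
To make the effect of these rules concrete, here is a minimal sketch of
quoted-pair handling. It is not the actual Jersey parsing code; the class
QuotedStringDemo and the method unquote are hypothetical names used only for
illustration. It strips the surrounding double quotes and replaces each
"\" CHAR pair with CHAR, which is why single backslashes disappear from the
reported file name.

```java
public class QuotedStringDemo {

    // Hypothetical helper (not Jersey's parser): removes the surrounding
    // double quotes and resolves every quoted-pair ("\" CHAR) to CHAR.
    static String unquote(String quoted) {
        int last = quoted.length() - 1;
        if (last < 1 || quoted.charAt(0) != '"' || quoted.charAt(last) != '"') {
            throw new IllegalArgumentException("not a quoted-string: " + quoted);
        }
        StringBuilder out = new StringBuilder();
        for (int i = 1; i < last; i++) {
            char c = quoted.charAt(i);
            if (c == '\\' && i + 1 < last) {
                // quoted-pair: drop the backslash, keep the next character
                c = quoted.charAt(++i);
            }
            out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The filename parameter value as IE sends it (single backslashes).
        String sent = "\"C:\\Documents and Settings\\User1\\Desktop\\TODO.txt\"";
        System.out.println(unquote(sent));
    }
}
```

Run against the value from the Fiddler log, each backslash is consumed as the
start of a quoted-pair, so only the character after it survives.
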

This means IE7 and IE8 do not conform to the quoting rules of RFC 822:
a backslash that is part of the data in a quoted-string must itself be
preceded by a backslash. The Content-Disposition header should look
like this:

   Content-Disposition: form-data; name="file"; filename="C:\\Documents and Settings\\User1\\Desktop\\TODO.txt"
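
For a client that builds the header itself, the escaping could be sketched
roughly as follows. This is only an illustration under the quoting rules
quoted above; the class QuotedStringEscape and the method quote are
hypothetical names and not part of Jersey or any browser.

```java
public class QuotedStringEscape {

    // Hypothetical client-side helper: wraps the value in double quotes and
    // escapes each backslash and double quote as a quoted-pair.
    static String quote(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            if (c == '\\' || c == '"') {
                out.append('\\');
            }
            out.append(c);
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        String path = "C:\\Documents and Settings\\User1\\Desktop\\TODO.txt";
        System.out.println("Content-Disposition: form-data; name=\"file\"; filename="
                + quote(path));
    }
}
```

A receiver that applies the quoted-pair rule to this value would get the
original path back with its backslashes intact.
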

Paul.