dev@glassfish.java.net

Re: REST API and slashes in resource names

From: Ken Paulsen <ken.paulsen_at_oracle.com>
Date: Wed, 02 Jun 2010 13:08:27 -0700

On 06/02/2010 12:43 PM, Tom Mueller wrote:
> On 6/2/2010 2:29 PM, Ken Paulsen wrote:
>>
>>
>> On 06/02/2010 11:59 AM, Bill Shannon wrote:
>>> Ken Paulsen wrote on 06/ 2/10 10:49 AM:
>>>>
>>>> I think the reason it is decoded is this: The Container receives a
>>>> request like "/foo/at%40test.html" which then must be decoded to serve
>>>> up a file named "/foo/at@test.html". '@' and many other characters not
>>>> allowed in URLs must be encoded when they're used in a file name, so
>>>> the container must decode them *before attempting to resolve them*.
>>>
>>> I guess the issue is whether that decoding should be done before
>>> interpreting the string to choose the destination servlet or
>>> whether the destination servlet (e.g., the default file handling
>>> servlet) should do the decoding before using the value.
>>>
>>> I would expect the Servlet spec to say something about this. If
>>> it requires the decoding to be done before passing the value to the
>>> Servlet, there may be little we can do.
>>>
>>> If there's a mapping for /foo/bar/*, should a URL of the form
>>> /foo%2Fbar/index.html match it?
>>
>> Because it's %2F, I would think not. But, what about a mapping for
>> /foo@bar/*, should /foo%40bar/index.html match it? I think yes
>> (assuming @ is legal... if not, replace with some valid unicode
>> character). The % encoding is required in the URL, so it must be
>> decoded before attempting to match the path. That's just my 2 cents
>> though... and yes, we should see what the spec says.
> Chapter 12 of the Servlet 3.0 spec talks about Mapping Requests to
> Servlets. Unfortunately, it doesn't say anything about these cases.
> It does say that a '/' is used for path mapping and that other
> characters must match exactly.
>
> Regarding your second case, the spec says that URLs in the deployment
> descriptor are assumed to be in URL-decoded form (page 102). But
> given this, maybe the proper way for the servlet container to match
> URLs is to encode the URL from the deployment descriptor and match
> that with the URL rather than decoding the URL and matching it with
> the deployment descriptor.

I like that idea! Although I wonder if the encoded form is
unambiguous. For example, is "hello world" a) "hello+world" or b)
"hello%20world"? Also, some characters are passed unencoded by the
browser (client) but might be encoded by the Java implementation of
urlencode (or whatever utility we use). So again we might have trouble
getting an exact match.... :|
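
A rough sketch of the ambiguity, in Java. The class and method names are made up for illustration, and this is not what the container actually does. It only shows that two standard JDK encoders disagree on the same input: java.net.URLEncoder does form-style encoding, while the multi-argument java.net.URI constructor quotes only characters that are illegal in a path.

```java
import java.net.URI;
import java.net.URLEncoder;

public class EncodingAmbiguity {
    // Form-style encoding (application/x-www-form-urlencoded):
    // space becomes '+', and '@' and '/' are percent-encoded.
    static String formEncode(String s) throws Exception {
        return URLEncoder.encode(s, "UTF-8");
    }

    // Path-style encoding: the multi-argument URI constructor quotes only
    // characters that are not legal in a path, so space becomes "%20"
    // while '@' and '/' are left alone.
    static String pathEncode(String path) throws Exception {
        return new URI(null, null, path, null).getRawPath();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(formEncode("hello world"));  // hello+world
        System.out.println(pathEncode("/hello world")); // /hello%20world
        System.out.println(formEncode("foo@bar"));      // foo%40bar
        System.out.println(pathEncode("/foo@bar"));     // /foo@bar
    }
}
```

So a descriptor pattern encoded one way would not compare equal to a request URL that the client encoded the other way, unless both sides are first normalized to the same form.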


>
> Tom
>
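
For the %2F vs. %40 cases discussed earlier in the thread, here is a minimal sketch of the two matching orders. The class and method names are hypothetical, the prefix stands in for a "/foo/bar/*" style mapping, and none of this is taken from the Servlet spec or from the GlassFish code. One caveat: URLDecoder is a form decoder, so it also turns '+' into a space, which a real path decoder should not do.

```java
import java.net.URLDecoder;

public class SlashMatching {
    // Decode the whole path first, then do the prefix match.
    // After decoding, "%2F" can no longer be told apart from a real '/'.
    static boolean matchesAfterDecode(String prefix, String rawPath) throws Exception {
        String decoded = URLDecoder.decode(rawPath, "UTF-8");
        return decoded.equals(prefix) || decoded.startsWith(prefix + "/");
    }

    // Split the raw path on literal '/' first, then decode each segment
    // and compare it to the corresponding segment of the (decoded) prefix.
    static boolean matchesPerSegment(String prefix, String rawPath) throws Exception {
        String[] want = prefix.split("/");
        String[] have = rawPath.split("/");
        if (have.length < want.length) {
            return false;
        }
        for (int i = 0; i < want.length; i++) {
            if (!want[i].equals(URLDecoder.decode(have[i], "UTF-8"))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        // Mapping /foo/bar/* against /foo%2Fbar/index.html
        System.out.println(matchesAfterDecode("/foo/bar", "/foo%2Fbar/index.html")); // true
        System.out.println(matchesPerSegment("/foo/bar", "/foo%2Fbar/index.html"));  // false
        // Mapping /foo@bar/* against /foo%40bar/index.html
        System.out.println(matchesPerSegment("/foo@bar", "/foo%40bar/index.html"));  // true
    }
}
```

Splitting before decoding keeps an encoded slash inside its segment, so the %2F URL does not match while the %40 URL still does.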