Re: JAX-RS: UriBuilder encoding

From: Stephan Koops <Stephan.Koops_at_web.de>
Date: Wed, 23 Jul 2008 21:06:45 +0200

Hi,

I think the current approach with the encode attribute is very easy to
use. Your proposal will produce boilerplate code, because you have to
manually call the encode methods. This will not improve the readability
of resource method code, especially if you get data from somewhere
(e.g. a database) and want to use it for building URIs.

So I think -1,

best regards
Stephan

Marc Hadley wrote:
> OK, I'm convinced. Here's what I propose we do:
>
> (a) Remove the encode and isEncode methods; all methods that add URI
> components will perform contextual encoding of characters that are not
> allowed in the relevant URI component with the following exceptions: {
> and }. % chars followed by two hex digits (the rfc pct-encoded
> production) will not be encoded, other % chars will.
>
> (b) Add a static method that will encode any characters not part of
> the rfc 3986 unreserved production.
>
> (c) Similar to (a), the build method will encode characters that are
> not allowed in the relevant URI components. I.e. any embedded { or }
> will be encoded unlike when adding URI components in (a).
>
> The above will allow creation of any valid URI. The only case that
> developers will have to be careful with is when an input string
> contains a literal % character coincidentally followed by two hex
> digits. The method added by (b) can be used to fix this although it
> won't work if the same string also contains pct-encoded chars. I
> don't think this is a big issue, since any string obtained from @*Param
> is either encoded or not; you won't get a mixture.
>
> Marc.
>
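The rules in (a) to (c) can be sketched in plain Java roughly as follows.
This is a minimal illustration under stated assumptions, not the JAX-RS
implementation: the class UriEncodingSketch and its methods encodePath and
encodeUnreserved are hypothetical names, only the path component is
handled, and the set of characters allowed in a path is assumed to be the
RFC 3986 pchar production plus '/'.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of proposals (a), (b) and (c); not part of JAX-RS.
final class UriEncodingSketch {

    // RFC 3986 unreserved: ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
                || (b >= '0' && b <= '9') || "-._~".indexOf(b) >= 0;
    }

    // Assumption: legal path characters are unreserved, sub-delims, ':', '@' and '/'.
    static boolean isPathChar(int b) {
        return isUnreserved(b) || "!$&'()*+,;=:@/".indexOf(b) >= 0;
    }

    static boolean isHex(int b) {
        return (b >= '0' && b <= '9') || (b >= 'A' && b <= 'F') || (b >= 'a' && b <= 'f');
    }

    // (a) when keepBraces is true, (c) when it is false: encode what is not
    // allowed in a path, leave "%XX" triplets alone, encode any other '%'.
    static String encodePath(String in, boolean keepBraces) {
        byte[] bytes = in.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF;
            boolean pctEncoded = b == '%' && i + 2 < bytes.length
                    && isHex(bytes[i + 1]) && isHex(bytes[i + 2]);
            boolean brace = keepBraces && (b == '{' || b == '}');
            if (pctEncoded || brace || isPathChar(b)) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b));
            }
        }
        return out.toString();
    }

    // (b) encode every byte that is not an unreserved character, including '%'.
    static String encodeUnreserved(String in) {
        StringBuilder out = new StringBuilder();
        for (byte raw : in.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (isUnreserved(b)) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b));
            }
        }
        return out.toString();
    }
}
```

Under these assumptions, encodePath("50% off", true) gives 50%25%20off,
while 100%AB is left unchanged because %AB looks like a pct-encoded
triplet. Passing that string through encodeUnreserved first gives
100%25AB, which is the workaround described for (b).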
> On Jul 16, 2008, at 10:42 PM, Manger, James H wrote:
>
>> Use cases for an encoding mode like encoding=true, but where percent
>> chars are NOT escaped (nicknamed “true-%”).
>>
>> Consider http://samplemerchant.info/uri/a%23b/résumé.html
>>
>> This email, current browsers (Safari, Internet Explorer, Firefox 3…),
>> and the HTML source for that web page all display this web address in
>> the same way – including the non-URI character é and the %23 escape
>> sequence (escaping a ‘#’ so it can appear in the path).
>>
>> A cut-n-paste of this address (or just its path) from any of these
>> sources should be accepted by JAX-RS. In particular, it should be
>> accepted by UriBuilder path(…) and as @Path values.
>>
>> This is an example of a string that is NOT “either completely encoded
>> or not encoded at all”. This situation will be increasingly common.
>>
>> With the current spec:
>>
>> UriBuilder.fromPath("/uri/a%23b/résumé.html", false) ->
>> IllegalArgumentException
>>
>> UriBuilder.fromPath("/uri/a%23b/résumé.html", true).build() ->
>> "/uri/a%2523b/r%C3%A9sum%C3%A9.html" -> 404 NOT FOUND
>>
>> In my suggested true-% mode
>>
>> UriBuilder.fromPath("/uri/a%23b/résumé.html").build() ->
>> "/uri/a%23b/r%C3%A9sum%C3%A9.html" -> 200 OK -> @PathParam ->
>> "/uri/a#b/résumé.html"
>>
>>
>>
>> Other use cases:
>>
>> Use case 2: Any use case for false mode is also a use case for true-%
>> mode as every string that is valid in false mode (ie does not trigger
>> an IllegalArgumentException) builds exactly the same URI in true-% mode.
>>
>> Use case 3: Almost any use case for true+% mode is also a use case
>> for true-% mode as every string without a percent char builds exactly
>> the same URI in true+% and true-% modes.
>>
>> James Manger
>>
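The three modes compared above (false, true+% and true-%) can be put side
by side in a small plain-Java sketch. Everything here is an assumption
made for illustration: EncodeModesSketch, its Mode enum and the path
method are hypothetical names, not JAX-RS API, and the set of legal path
characters is assumed to be the RFC 3986 pchar production plus '/'.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical comparison of the three encoding modes; not part of JAX-RS.
final class EncodeModesSketch {

    enum Mode { FALSE, TRUE_PLUS_PCT, TRUE_MINUS_PCT }

    static boolean isHex(int b) {
        return (b >= '0' && b <= '9') || (b >= 'A' && b <= 'F') || (b >= 'a' && b <= 'f');
    }

    // Assumption: unreserved, sub-delims, ':', '@' and '/' are legal in a path.
    static boolean isPathChar(int b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
                || (b >= '0' && b <= '9') || "-._~!$&'()*+,;=:@/".indexOf(b) >= 0;
    }

    static String path(String in, Mode mode) {
        byte[] bytes = in.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            int b = bytes[i] & 0xFF;
            boolean triplet = b == '%' && i + 2 < bytes.length
                    && isHex(bytes[i + 1]) && isHex(bytes[i + 2]);
            // true+% never keeps a '%'; the other two modes keep "%XX" triplets.
            boolean keep = isPathChar(b) || (triplet && mode != Mode.TRUE_PLUS_PCT);
            if (keep) {
                out.append((char) b);
            } else if (mode == Mode.FALSE) {
                // false mode expects input that is already a valid path.
                throw new IllegalArgumentException("illegal character at byte " + i);
            } else {
                out.append(String.format("%%%02X", b));
            }
        }
        return out.toString();
    }
}
```

In this sketch, a string that false mode accepts contains only legal
characters and "%XX" triplets, so true-% mode returns it unchanged (use
case 2), and a string without any percent character takes the same branch
in true+% and true-% mode (use case 3).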
>> _____________________________________________
>> From: Marc.Hadley_at_Sun.COM [mailto:Marc.Hadley_at_Sun.COM]
>> Sent: Thursday, 17 July 2008 2:27 AM
>> To: users_at_jsr311.dev.java.net
>>
>> Yes, with encode=true, the intent was that '%' would be encoded to %25.
>>
>> I kind of imagined that uncontrolled input would be inserted into URI
>> as the values of URI template variables rather than directly as URI
>> components. If this is true then the presence of {} is unlikely to
>> cause an issue. The same applies to % since all three chars would be
>> encoded if encode=true. If you wanted to allow uncontrolled input to
>> include pct-escaped chars as well as other chars that aren't legal
>> then you would have to do some manual processing but I don't see that
>> as a common use case - it seems more common that strings are either
>> completely encoded or not encoded at all. Could you suggest some use
>> cases where the change you suggest would improve the developer
>> experience?
>>
>> Thanks,
>
> ---
> Marc Hadley <marc.hadley at sun.com>
> CTO Office, Sun Microsystems.
>