users@jsr311.java.net

RE: JAX-RS: UriBuilder encoding

From: Manger, James H <James.H.Manger_at_team.telstra.com>
Date: Wed, 23 Jul 2008 12:15:00 +1000

> (a) Remove the encode and isEncode methods

Great.
The @Path encode attribute should similarly be removed (but not @Encoded of course).

> (c) Similar to (a), the build method will encode characters that are not allowed in the relevant URI components

Will ‘/’ be %-escaped for limited placeholders?
When is the “relevant URI component” a segment (‘/’ not allowed so %-escaped) and when is it a path (‘/’ allowed so not escaped)?

Consider 4 lines of code:

 1. UriBuilder.fromResource( @Path("home/{userid}/friends") )
 2. ub.path("home/{userid}/friends")
 3. ub.path("home", "{userid}", "friends")
 4. ub.path("home").path("{userid}").path("friends")

The “relevant URI component” needs to be a segment for #1 (to be compatible with the @Path matching behaviour). It would cause the least surprise if #2 behaved like #1 (same value passed via @Path or directly is handled the same way). Reading #3 I would strongly guess that the author intended {userid} to be a single segment. #4… less sure.

UriBuilder#path(String… segments), however, says existing ‘/’ chars are preserved (not %-escaped). This implies that for #2, #3 and #4 the “relevant URI component” is a path – not a segment – so ‘/’ would never be %-escaped. Consequently, a userid value of “../../stuff” or “fred/enemies” is likely to cause results that the developers did not intend.
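
To make the concern concrete, here is a rough Java sketch of the two candidate rules. It does not use the real UriBuilder: the class PlaceholderEncodingSketch and its methods asPath, asSegment and fill are hypothetical helpers that only mimic "treat the value as a path" versus "treat the value as a single segment", using the RFC 3986 character classes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JAX-RS. It only imitates the two
// candidate encoding rules for a value substituted into a template.
class PlaceholderEncodingSketch {

    // RFC 3986 "unreserved" characters.
    private static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Additional characters RFC 3986 allows unescaped inside a path segment:
    // the sub-delims plus ':' and '@'.
    private static final String SEGMENT_EXTRA = "!$&'()*+,;=:@";

    // Percent-encode every UTF-8 byte whose character is not in 'allowed'.
    static String encode(String value, String allowed) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            if (b >= 0 && allowed.indexOf((char) b) >= 0) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Value is a single segment: '/' is not allowed, so it becomes %2F.
    static String asSegment(String value) {
        return encode(value, UNRESERVED + SEGMENT_EXTRA);
    }

    // Value is a path: '/' is allowed and passes through untouched.
    static String asPath(String value) {
        return encode(value, UNRESERVED + SEGMENT_EXTRA + "/");
    }

    // Substitute an already-encoded value for {name} in the template.
    static String fill(String template, String name, String encodedValue) {
        return template.replace("{" + name + "}", encodedValue);
    }

    public static void main(String[] args) {
        String template = "home/{userid}/friends";
        // Path rule: prints home/../../stuff/friends
        System.out.println(fill(template, "userid", asPath("../../stuff")));
        // Segment rule: prints home/..%2F..%2Fstuff/friends
        System.out.println(fill(template, "userid", asSegment("../../stuff")));
        // Segment rule: prints home/fred%2Fenemies/friends
        System.out.println(fill(template, "userid", asSegment("fred/enemies")));
    }
}
```

Under the path rule the first result resolves outside "home" once dot segments are removed, which is the unintended outcome described above; under the segment rule the value stays inside one segment.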


Possible solutions:

 1. %-escape all non-unreserved chars in values passed to build(…) – but then supporting unlimited placeholders is not trivial.
 2. Rename path(String… segments) to segment(String… segments) and state that ‘/’ chars are %-escaped. A static fromSegment(String) method might also be needed. This would make it clear what the “relevant URI component” is.
 3. Nothing (other than documenting that the relevant URI component is a segment for limited @Path placeholders, a path for unlimited @Path placeholders, and a path for fromPath(String) and path(String…) methods). Programmers need to explicitly call quote(String) (or replaceAll("/", "%2F")) when they want a single segment and when filling a Map to pass to build() (though they won’t always bother).

How about replacing the existing fromResource, fromPath and 4 path methods with:
  static UriBuilder fromResource(Class);
  UriBuilder resource(Class);
  UriBuilder resource(Class, String method);
  UriBuilder resource(Method…);

  static UriBuilder fromPath(String);
  UriBuilder path(String);

  static UriBuilder fromSegment(String…);
  UriBuilder segment(String…);

Those make it clearer what the relevant URI component is, for the arguments and for placeholders they contain.


_____________________________________________
From: Marc.Hadley_at_Sun.COM [mailto:Marc.Hadley_at_Sun.COM]
Sent: Wednesday, 23 July 2008 1:58 AM
To: users_at_jsr311.dev.java.net

OK, I'm convinced. Here's what I propose we do:

(a) Remove the encode and isEncode methods. All methods that add URI components will perform contextual encoding of characters that are not allowed in the relevant URI component, with the following exceptions: { and } will not be encoded, and % chars followed by two hex digits (the RFC 3986 pct-encoded production) will not be encoded; other % chars will.

(b) Add a static method that will encode any characters not part of the RFC 3986 unreserved production.

(c) Similar to (a), the build method will encode characters that are not allowed in the relevant URI components. Unlike when adding URI components in (a), any { or } embedded in a value will also be encoded.

The above will allow creation of any valid URI. The only case that developers will have to be careful with is when an input string contains a literal % character coincidentally followed by two hex digits. The method added by (b) can be used to fix this, although it won't work if the same string also contains pct-encoded chars. I don't think this is a big issue, since any string obtained from @*Param is either encoded or not; you won't get a mixture.
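
As an illustration only, rules (a) and (b) and the literal-% caveat could be sketched in Java roughly as follows. The class ProposalEncodingSketch and the method names encodeComponent and encodeUnreserved are made up for this sketch and are not the proposed API; the set of characters allowed in a component is passed in by the caller as an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed rules; names are illustrative only.
class ProposalEncodingSketch {

    // RFC 3986 "unreserved" characters.
    static final String UNRESERVED =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";
    // Assumed allowed set for a path: unreserved, sub-delims, ':', '@' and '/'.
    static final String PATH_ALLOWED = UNRESERVED + "!$&'()*+,;=:@/";

    static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    // Append the UTF-8 bytes of one code point as %XX triplets.
    static void appendPct(StringBuilder out, int codePoint) {
        byte[] bytes = new String(Character.toChars(codePoint)).getBytes(StandardCharsets.UTF_8);
        for (byte b : bytes) {
            out.append(String.format("%%%02X", b & 0xFF));
        }
    }

    // Rule (a): encode what the component does not allow, but leave
    // '{', '}' and existing %XX triplets alone.
    static String encodeComponent(String s, String allowed) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean triplet = c == '%' && i + 2 < s.length()
                && isHex(s.charAt(i + 1)) && isHex(s.charAt(i + 2));
            if (c == '{' || c == '}' || allowed.indexOf(c) >= 0) {
                out.append(c);
                i++;
            } else if (triplet) {
                out.append(s, i, i + 3); // keep the pct-encoded triplet as is
                i += 3;
            } else {
                int cp = s.codePointAt(i);
                appendPct(out, cp);
                i += Character.charCount(cp);
            }
        }
        return out.toString();
    }

    // Rule (b): encode everything outside the unreserved set, '%' included.
    static String encodeUnreserved(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            if (cp < 128 && UNRESERVED.indexOf(cp) >= 0) {
                out.append((char) cp);
            } else {
                appendPct(out, cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeComponent("{id}/a b", PATH_ALLOWED)); // {id}/a%20b
        System.out.println(encodeComponent("50% off", PATH_ALLOWED));  // 50%25%20off
        // Literal "%41" is mistaken for a pct-encoded 'A' under rule (a):
        System.out.println(encodeComponent("100%41", PATH_ALLOWED));   // 100%41
        // Pre-encoding with rule (b) keeps the '%' literal:
        System.out.println(encodeUnreserved("100%41"));                // 100%2541
    }
}
```

The last two lines show the one awkward case: a literal % that happens to precede two hex digits survives rule (a) unchanged, so the caller has to run the value through the rule (b) method first.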