jsr339-experts@jax-rs-spec.java.net

[jsr339-experts] Re: [jax-rs-spec users] Re: Re: Re: Re: UriBuilder.segment and build with encodePathSlash

From: Marek Potociar <marek.potociar_at_oracle.com>
Date: Tue, 23 Oct 2012 20:52:46 +0200

On Oct 23, 2012, at 5:31 PM, Sergey Beryozkin <sberyozkin_at_talend.com> wrote:

> Hi Marek
> On 23/10/12 15:49, Marek Potociar wrote:
>> Sergey,
>>
>> Please do not take me wrong, but frankly, I'm getting tired discussing
>> this at length over and over after I have spent a lot of time discussing
>> and thinking about this several months ago. It took me too quite some
>> time to accept the fact that the current solution is actually most
>> correct one, so I'm ready to repeatedly explain how the things are meant
>> to work, but as for the solution itself, a decision was made and I'm not
>> going to reopen it, neither I want to get involved in a discussion about
>> a different solution once again. Which is why I'm not going to reply to
>> your email remarks inline. But I will try one more time to explain the
>> present solution.
>>
>
> I don't take you wrong, nor I'm expecting any changes at this stage (as I indicated), feel free to ignore the following, but I'm still allowed to talk about it (I guess ?), so this is what I'm doing below, and if you want you can stop reading right here :-)
>
>> As you have demonstrated, there are some use-cases in which the current
>> solution arguably looks seemingly inconsistent especially if you put
>> those use cases on subsequent lines in your code. Still, we're talking
>> about marginal use cases. Also similar inconsistent use cases could be
>> produced for any other solution, for that I am sure. Our main driver
>> behind the decision was to aligned make behavior of URI template
>> variables completion consistent across the whole JAX-RS API, and whether
>> you like it or not, this particular use case was an outlier that
>> differed from other parts of the API.
>>
>> To demonstrate, suppose you have 2 HTTP requests:
>>
>> 1. GET "/foo/bar"
>> 2. GET "/foo%2Fbar"
>>
>> And a resource:
>>
>> @Path("{a}")
>> public class FooBar {
>> @PathParam("a") String fooBar;
>> ...
>> }
>>
>> Which of the 2 requests would you expect to be mapped to the resource?
>> Now I presume your answer would be that #2. And what would you expect to
>> be the value of fooBar field? Again, I presume your answer would be
>> "foo/bar".
>
> This is an orthogonal thing, because by default @PathParam() does capture a segment, no disagreement here, while we've been talking about building things

That's not orthogonal I'm afraid. If you used UriBuilder to produce a URI based on an matching a single request against your resource hierarchy using the values of @Path annotation, you MUST (IMO) be able to re-construct the original request using the values injected into @PathParam fields. E.g.

@Path("root/{a}")
class Resource {
  @Context UriInfo uriInfo;

  @GET
  @Path("{b}")
  public boolean get(@PathParam("a") String a, @PathParam("b") String b) {
    return UriBuilder.fromPath("root/{a}").path("{b}").build(a, b).toString().equals(uriInfo.getPath());
  }
}

This must IMO return true also for requests like:

GET /root/foo%2Fbar/baz

Note that in the above, the value of injected parameter a will be "foo/bar". Maybe you see it now. The above should also work for path(Class) and path(Method), of course. And obviously this behavior is also anticipated and reflected in the Client API...

>
>> But then why would you expect a different behavior on the
>> opposite side of the story in the UriBuilder? With the example above it
>> simply makes more sense to expect that assertion below passes:
>>
>> assertEquals("/foo%2Fbar", UriBuilder.fromPath("{a}").build("foo/bar"));
>>
> I don't expect the different output. This code simply demonstrates what I've been saying, aka, "/" is encoded by default, despite the rules of the path scheme.
>
> assertEquals("/foo%2Fbar", UriBuilder.fromPath("").segment("{a}").build("foo/bar"));
>
> or, if we had fromSegment()
>
> assertEquals("/foo%2Fbar", UriBuilder.fromSegment("{a}").build("foo/bar"));
>
>
> would indeed be consistent.

Again, a value of a template variable is everywhere in JAX-RS by default considered to come from a single path segment. The UriBuilder.segment() method is just a convenience for properly encoding literal value of a newly attached path segment. But ultimately, it does not affect the underlying URI template being built. IOW following must work:

UriBuilder.fromPath("foo").segment("{a}").toTemplate().equals(UriBuilder.fromPath("foo").path("{a}").toTemplate())

The URI template is not contextual wrt. template variables based on the methods of URI builder used to build the tempate. It is contextual based on the URI components in which the variables reside. Now I almost hear you saying that in that case we should not encode "/" if the template variable resides in the context of a URI path. Still, since everywhere else the template variable is understood to be part of a single path segment by default, it makes a lot of sense to apply the same understanding consistently also in case of URI templates produced by a UriBuilder.
>
>> And it really doesn't matter how you arrived to your final URI template
>> in terms of which UriBuilder methods did you use.
>>
>> To sum up, in the majority of cases, in JAX-RS a URI template variable
>> placed in a path component is processed as a (part of a ) single path
>> segment. Thus it made most sense to fix the inconsistent behavior
>> detected in the reference implementation of the UriBuilder.
>>
>> Sometimes things do not work as we would expect, that however doesn't
>> mean they do not work correctly...
>>
>
> That sounds quite philosophical :-)

Indeed. I came up with that one myself and I'm particularly proud of it... ;)

Marek

>
> Sergey
>
>
>> Marek
>>
>> On Oct 23, 2012, at 11:10 AM, Sergey Beryozkin <sberyozkin_at_talend.com
>> <mailto:sberyozkin_at_talend.com>> wrote:
>>
>>> Marek, others,
>>>
>>> please see comments below,
>>>
>>> On 22/10/12 21:48, Sergey Beryozkin wrote:
>>>> On 22/10/12 17:42, Marek Potociar wrote:
>>>>>
>>>>> On Oct 22, 2012, at 5:59 PM, Sergey Beryozkin<sberyozkin_at_talend.com
>>>>> <mailto:sberyozkin_at_talend.com>>
>>>>> wrote:
>>>>>
>>>>>> On 22/10/12 16:37, Marek Potociar wrote:
>>>>>>>
>>>>>>> On Oct 22, 2012, at 5:13 PM, Sergey
>>>>>>> Beryozkin<sberyozkin_at_talend.com <mailto:sberyozkin_at_talend.com>
>>>>>>> <mailto:sberyozkin_at_talend.com>> wrote:
>>>>>>>
>>>>>>>> On 22/10/12 16:07, Sergey Beryozkin wrote:
>>>>>>>>> On 22/10/12 13:43, Marek Potociar wrote:
>>>>>>>>>>
>>>>>>>>>> On Oct 22, 2012, at 1:26 PM, Sergey
>>>>>>>>>> Beryozkin<sberyozkin_at_talend.com <mailto:sberyozkin_at_talend.com>
>>>>>>>>>> <mailto:sberyozkin_at_talend.com>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I know that I'm the last one who still does not get it :-).
>>>>>>>>>>>
>>>>>>>>>>> Consider the following:
>>>>>>>>>>>
>>>>>>>>>>> uriBuilder.fromPath("http://localhost").segment("{a}").build("a/b",
>>>>>>>>>>> false);
>>>>>>>>>>>
>>>>>>>>>>> UriBuilder.segment() states that "/" have to be encoded.
>>>>>>>>>>
>>>>>>>>>> The javadoc of segment() applies to the value passed to the
>>>>>>>>>> segment().
>>>>>>>>>> Not to a template that was added via segment(). IOW, if you passed
>>>>>>>>>> "a/b" into segment() directly, it would be encoded.
>>>>>>>>>
>>>>>>>>> So, to be absolutely clear.
>>>>>>>>>
>>>>>>>>> By default, irrespectively of whether a given template var was
>>>>>>>>> introduced as part of path() or segment() call, the "/" (inside of
>>>>>>>>> this
>>>>>>>>> var's content) is never supposed to be encoded.
>>>>>>>>>
>>>>>>>>> The only way to override it, when working with build(), is to use
>>>>>>>>> a new
>>>>>>>>> flag.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Ouch, it is the other way around,
>>>>>>>>
>>>>>>>> "NOTE: By default all '/' characters in the stringified values
>>>>>>>> will be
>>>>>>>> encoded in path templates, i.e. the result is identical to invoking
>>>>>>>> build(Object[], boolean) build(values, true)}. To override this
>>>>>>>> behavior use build(values, false) instead."
>>>>>>>>
>>>>>>>> Why it is a default, I'm not sure, that definitely breaks one of my
>>>>>>>> tests - but whatever...
>>>>>>>>
>>>>>>>> Just need a confirmation please that this rule applies irrespectively
>>>>>>>> of the "origin" (ex, a var came from path() or segment())
>>>>>>>
>>>>>>> Now, I'm not sure what do you want to confirm? I did confirm that "/"
>>>>>>> encoding in segment(...) does not apply to template variables and does
>>>>>>> apply to string literals in my previous reply, didn't I? Also you
>>>>>>> quote
>>>>>>> javadoc that IMO clearly states the behavior of build(...) method.
>>>>>>> Additionally, the build(...) method javadoc also states that:
>>>>>>>
>>>>>>> "/Values are converted to {_at_code String} using their {_at_code
>>>>>>> toString()}
>>>>>>> method and are then encoded to match the rules of the URI component to
>>>>>>> which they pertain/"
>>>>>>>
>>>>>>> So if you have the same template variable use in different URI
>>>>>>> components, it's value will be encoded separately for each component
>>>>>>> using the encoding rules of the respective URI components.
>>>>>>
>>>>>> Stay with with me please on this one.
>>>>>>
>>>>>> First, I assume this was basically "yes" to my question regarding the
>>>>>> origin of the template variable - I've always thought that the rule
>>>>>> of the method (ex, that of segment() further qualifies the rules of
>>>>>> the encoding of "/").
>>>>>>
>>>>>> So that is not the case, that is fine,
>>>>>>
>>>>>> Now given your quote re the rules of the URI component, tell me
>>>>>> please (and I'm sorry if that was conveyed earlier):
>>>>>>
>>>>>> - why the note above states that by default "/" is supposed to be
>>>>>> encoded where as the rules of URI Path component let "/" stay as is.
>>>>>
>>>>> This was done as a fix of a bug filed against JAX-RS specification -
>>>>> Markus can provide more details as he filed the bug. In summary, this
>>>>> way it is more in sync with what you would expect in @Path annotation,
>>>>> were a template variable resolves by default to a single segment. So
>>>>> it makes more sense to consider a single template in a path to be part
>>>>> of a single segment by default and thus encode the "/" by default.
>>>>>
>>>>> Makes sense?
>>>>
>>>> I'm cautiously optimistic that it does :-)
>>>>
>>>
>>> I'm still excited from the mere fact I believe I've finally understood
>>> the current mechanics of the way "/" gets encoded, thanks Marek for
>>> the patience.
>>>
>>> However, while thinking about it again, I can't help pointing out that
>>> the solution is, to put it mildly, is controversial, not consistent
>>> and does not help the UriBuilder 'case'.
>>>
>>> Here is a short analysis, as short as I can make it.
>>>
>>> 1. assertEquals("a/b", uriBuilder.fromPath("").path("a/b").build());
>>> 2. assertEquals("a%2Fb",
>>> uriBuilder.fromPath("").path("{a}").build("a/b"));
>>>
>>> 3. assertEquals("a%2Fb", uriBuilder.fromPath("").segment("a/b").build());
>>> 4. assertEquals("a%2Fb",
>>> uriBuilder.fromPath("").segment("{a}").build("a/b"));
>>>
>>>
>>> Note 3 and 4 produce the consistent output - it is easy to see why -
>>> the literal value as well as the "{a}" substitution are 'segments'
>>>
>>> 1 and 2 produce the inconsistent results, why, because Markus thought
>>> it was logical to get "/" encoded by default, despite UriBuilder
>>> saying loudly the encoding rules are scoped by the specific URI scheme
>>> rules.
>>> This breaks 2 of CXF tests and I wonder are we breaking the
>>> expectations of the users migrating from 1.1 or not (the fact TCK is
>>> silent on that does not count IMHO).
>>>
>>> Let me step back and review the actual original issue.
>>> We have '?' (meaning unclear) in
>>>
>>> assertEquals(?, uriBuilder.fromPath("").path("{a}").build("a/b"));
>>>
>>> Should "/" in "a/b" be encoded or left as is, who knows ? What if we
>>> have "build(readFromDB())" - where "a/b" is read from the db, etc.
>>>
>>> How we solve this issue ?
>>>
>>> We introduce 4 new methods (2 from the build and 2 from
>>> replaceTemplate families) with this extra flag dedicated to dealing
>>> with "/", a single symbol which happens to be important for the URI
>>> path component.
>>>
>>> We have 1. and 2. above producing inconsistent output where UriBuilder
>>> package-level documentation conflicts with the special rules for
>>> encoding "/".
>>>
>>> We have 4 extra methods which don't make sense with a dangling boolean
>>> flag to be used when no template vars are found in the path component
>>> as opposed to the regular build functions which always make sense.
>>>
>>> So what could've been done instead, to address a tricky issue of
>>>
>>> uriBuilder.fromPath("").path("{a}").build(readFromDb()),
>>> where a return value may have contain "/" which have to be encoded ?
>>>
>>> Well, recommend users, when they face the issue like this, to use
>>>
>>> uriBuilder.fromPath("").segment("{a}").build(readFromDb())
>>>
>>> and get the implementation to remember the 'origin' of the template
>>> var, i.e., whether it came from path() or segment(), we would then see
>>> consistently:
>>>
>>>
>>> 1. assertEquals("a/b", uriBuilder.fromPath("").path("a/b").build());
>>> 2. assertEquals("a/b", uriBuilder.fromPath("").path("{a}").build("a/b"));
>>>
>>> 3. assertEquals("a%2Fb", uriBuilder.fromPath("").segment("a/b").build());
>>> 4. assertEquals("a%2Fb",
>>> uriBuilder.fromPath("").segment("{a}").build("a/b"));
>>>
>>>
>>> Well, I've updated CXF code to support the current approach at the
>>> cost of getting the two failing tests 'fixed', but I felt I had to
>>> type the above
>>>
>>>
>>> Sergey
>>>
>>>>> Marek
>>>>>
>>