jsr339-experts@jax-rs-spec.java.net

[jsr339-experts] Re: [jax-rs-spec users] Re: Re: UriBuilder.segment and build with encodePathSlash

From: Sergey Beryozkin <sberyozkin_at_talend.com>
Date: Tue, 23 Oct 2012 10:10:41 +0100

Marek, others,

please see comments below,

On 22/10/12 21:48, Sergey Beryozkin wrote:
> On 22/10/12 17:42, Marek Potociar wrote:
>>
>> On Oct 22, 2012, at 5:59 PM, Sergey Beryozkin<sberyozkin_at_talend.com>
>> wrote:
>>
>>> On 22/10/12 16:37, Marek Potociar wrote:
>>>>
>>>> On Oct 22, 2012, at 5:13 PM, Sergey Beryozkin<sberyozkin_at_talend.com
>>>> <mailto:sberyozkin_at_talend.com>> wrote:
>>>>
>>>>> On 22/10/12 16:07, Sergey Beryozkin wrote:
>>>>>> On 22/10/12 13:43, Marek Potociar wrote:
>>>>>>>
>>>>>>> On Oct 22, 2012, at 1:26 PM, Sergey Beryozkin<sberyozkin_at_talend.com
>>>>>>> <mailto:sberyozkin_at_talend.com>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I know that I'm the last one who still does not get it :-).
>>>>>>>>
>>>>>>>> Consider the following:
>>>>>>>>
>>>>>>>> uriBuilder.fromPath("http://localhost").segment("{a}").build("a/b",
>>>>>>>> false);
>>>>>>>>
>>>>>>>> UriBuilder.segment() states that "/" have to be encoded.
>>>>>>>
>>>>>>> The javadoc of segment() applies to the value passed to the
>>>>>>> segment().
>>>>>>> Not to a template that was added via segment(). IOW, if you passed
>>>>>>> "a/b" into segment() directly, it would be encoded.
>>>>>>
>>>>>> So, to be absolutely clear.
>>>>>>
>>>>>> By default, irrespectively of whether a given template var was
>>>>>> introduced as part of path() or segment() call, the "/" (inside of
>>>>>> this
>>>>>> var's content) is never supposed to be encoded.
>>>>>>
>>>>>> The only way to override it, when working with build(), is to use
>>>>>> a new
>>>>>> flag.
>>>>>>
>>>>>
>>>>> Ouch, it is the other way around,
>>>>>
>>>>> "NOTE: By default all '/' characters in the stringified values will be
>>>>> encoded in path templates, i.e. the result is identical to invoking
>>>>> build(Object[], boolean) build(values, true)}. To override this
>>>>> behavior use build(values, false) instead."
>>>>>
>>>>> Why it is a default, I'm not sure, that definitely breaks one of my
>>>>> tests - but whatever...
>>>>>
>>>>> Just need a confirmation please that this rule applies irrespectively
>>>>> of the "origin" (ex, a var came from path() or segment())
>>>>
>>>> Now, I'm not sure what do you want to confirm? I did confirm that "/"
>>>> encoding in segment(...) does not apply to template variables and does
>>>> apply to string literals in my previous reply, didn't I? Also you quote
>>>> javadoc that IMO clearly states the behavior of build(...) method.
>>>> Additionally, the build(...) method javadoc also states that:
>>>>
>>>> "/Values are converted to {_at_code String} using their {_at_code toString()}
>>>> method and are then encoded to match the rules of the URI component to
>>>> which they pertain/"
>>>>
>>>> So if you have the same template variable use in different URI
>>>> components, it's value will be encoded separately for each component
>>>> using the encoding rules of the respective URI components.
>>>
>>> Stay with with me please on this one.
>>>
>>> First, I assume this was basically "yes" to my question regarding the
>>> origin of the template variable - I've always thought that the rule
>>> of the method (ex, that of segment() further qualifies the rules of
>>> the encoding of "/").
>>>
>>> So that is not the case, that is fine,
>>>
>>> Now given your quote re the rules of the URI component, tell me
>>> please (and I'm sorry if that was conveyed earlier):
>>>
>>> - why the note above states that by default "/" is supposed to be
>>> encoded where as the rules of URI Path component let "/" stay as is.
>>
>> This was done as a fix of a bug filed against JAX-RS specification -
>> Markus can provide more details as he filed the bug. In summary, this
>> way it is more in sync with what you would expect in @Path annotation,
>> were a template variable resolves by default to a single segment. So
>> it makes more sense to consider a single template in a path to be part
>> of a single segment by default and thus encode the "/" by default.
>>
>> Makes sense?
>
> I'm cautiously optimistic that it does :-)
>

I'm still excited from the mere fact I believe I've finally understood
the current mechanics of the way "/" gets encoded, thanks Marek for the
patience.

However, while thinking about it again, I can't help pointing out that
the solution is, to put it mildly, is controversial, not consistent and
does not help the UriBuilder 'case'.

Here is a short analysis, as short as I can make it.

1. assertEquals("a/b", uriBuilder.fromPath("").path("a/b").build());
2. assertEquals("a%2Fb", uriBuilder.fromPath("").path("{a}").build("a/b"));

3. assertEquals("a%2Fb", uriBuilder.fromPath("").segment("a/b").build());
4. assertEquals("a%2Fb",
uriBuilder.fromPath("").segment("{a}").build("a/b"));


Note 3 and 4 produce the consistent output - it is easy to see why - the
literal value as well as the "{a}" substitution are 'segments'

1 and 2 produce the inconsistent results, why, because Markus thought it
was logical to get "/" encoded by default, despite UriBuilder saying
loudly the encoding rules are scoped by the specific URI scheme rules.
This breaks 2 of CXF tests and I wonder are we breaking the expectations
of the users migrating from 1.1 or not (the fact TCK is silent on that
does not count IMHO).

Let me step back and review the actual original issue.
We have '?' (meaning unclear) in

assertEquals(?, uriBuilder.fromPath("").path("{a}").build("a/b"));

Should "/" in "a/b" be encoded or left as is, who knows ? What if we
have "build(readFromDB())" - where "a/b" is read from the db, etc.

How we solve this issue ?

We introduce 4 new methods (2 from the build and 2 from replaceTemplate
families) with this extra flag dedicated to dealing with "/", a single
symbol which happens to be important for the URI path component.

We have 1. and 2. above producing inconsistent output where UriBuilder
package-level documentation conflicts with the special rules for
encoding "/".

We have 4 extra methods which don't make sense with a dangling boolean
flag to be used when no template vars are found in the path component as
opposed to the regular build functions which always make sense.

So what could've been done instead, to address a tricky issue of

uriBuilder.fromPath("").path("{a}").build(readFromDb()),
where a return value may have contain "/" which have to be encoded ?

Well, recommend users, when they face the issue like this, to use

uriBuilder.fromPath("").segment("{a}").build(readFromDb())

and get the implementation to remember the 'origin' of the template var,
i.e., whether it came from path() or segment(), we would then see
consistently:


1. assertEquals("a/b", uriBuilder.fromPath("").path("a/b").build());
2. assertEquals("a/b", uriBuilder.fromPath("").path("{a}").build("a/b"));

3. assertEquals("a%2Fb", uriBuilder.fromPath("").segment("a/b").build());
4. assertEquals("a%2Fb",
uriBuilder.fromPath("").segment("{a}").build("a/b"));


Well, I've updated CXF code to support the current approach at the cost
of getting the two failing tests 'fixed', but I felt I had to type the above


Sergey

>> Marek
>>