[jax-rs-spec users] [jsr339-experts] Re: HEADS-UP: Encoding values of UriBuilder template parameters

From: Sergey Beryozkin <>
Date: Thu, 15 Dec 2011 17:20:06 +0000

On 15/12/11 16:54, Marek Potociar wrote:
> On 12/15/2011 04:30 PM, Sergey Beryozkin wrote:
>> On 15/12/11 15:07, Marek Potociar wrote:
>>> Hi Markus,
>>> I gave it more thought, trying to look at it from a different angle, and since the method javadoc does not explicitly
>>> talk about encoding slashes in path template values, this is what I think should be a resolution that you may hopefully
>>> find satisfying:
>>> 1. Add a new method to UriBuilder:
>>> public abstract URI build(Object... values, boolean encodeSlashInPath)
>>> throws IllegalArgumentException, UriBuilderException
>> so what will
>> uriBuilder.segment("a/b/c").build(false) mean? In other words, what takes precedence: the segment() processing
>> rules as per its javadocs, or the will of the user in build(false)?
> Sergey, judging from your question I don't think you are familiar with the issue. The code you wrote makes no sense...

Don't you see it's yet another example where in JAX-RS 2.0 we will just
say: no, just don't do it? This new method will make it possible, and
that is utterly wrong.
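
For readers following along, here is a minimal, self-contained sketch of what such a flag would control. It does not use the JAX-RS API: SlashEncodingSketch, encodeValue and build are hypothetical names, and the set of characters left unencoded is a simplifying assumption. Java requires a varargs parameter to come last, so the flag is placed first here rather than after Object... values as in the quoted signature.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the behaviour under discussion; NOT the JAX-RS UriBuilder.
final class SlashEncodingSketch {

    // Assumption: only letters, digits and "-._~" are left unencoded in a value.
    private static final String SAFE =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Percent-encodes a template value; '/' stays literal unless encodeSlash is true.
    static String encodeValue(String value, boolean encodeSlash) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            boolean keep = SAFE.indexOf(c) >= 0 || (c == '/' && !encodeSlash);
            if (keep) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Fills {name} templates in a path; literal template text is copied unchanged.
    // A name that occurs more than once reuses the value bound at its first occurrence.
    static String build(String pathTemplate, boolean encodeSlashInPath, Object... values) {
        Map<String, String> bound = new HashMap<>();
        StringBuilder out = new StringBuilder();
        int next = 0;
        int i = 0;
        while (i < pathTemplate.length()) {
            char c = pathTemplate.charAt(i);
            if (c != '{') {
                out.append(c);
                i++;
                continue;
            }
            int close = pathTemplate.indexOf('}', i);
            if (close < 0) {
                throw new IllegalArgumentException("unclosed template in: " + pathTemplate);
            }
            String name = pathTemplate.substring(i + 1, close);
            String encoded = bound.get(name);
            if (encoded == null) {
                if (next >= values.length) {
                    throw new IllegalArgumentException("no value for template: " + name);
                }
                encoded = encodeValue(String.valueOf(values[next++]), encodeSlashInPath);
                bound.put(name, encoded);
            }
            out.append(encoded);
            i = close + 1;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("x/{t}", true, "a/b"));   // slash in the value is encoded
        System.out.println(build("x/{t}", false, "a/b"));  // slash in the value is kept
    }
}
```

In this sketch the literal slash in "x/{t}" is never touched; only the slash inside the supplied value is affected by the flag.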

> Marek
>> Sergey
>>> 2. Update javadoc of the existing build(Object... values) to mention
>>> that the behavior corresponds to build(values, true).
>>> 3. Elaborate more on encoding template parameter values in the class
>>> level javadoc.
>>> In summary, the build(Object...) method will behave as you requested. At the same time, the new method will act as a
>>> fall-back for anyone who would depend on the current behavior.
>>> Let me know if this works.
>>> Marek
>>> On 12/08/2011 11:07 AM, Marek Potociar wrote:
>>>> On 12/07/2011 07:55 PM, Markus KARG wrote:
>>>>>>>>> Actually I have to disagree with your conclusion! I do not see any
>>>>>>>> difference in path parameters vs. segment parameters.
>>>>>>>> Path parameter is something that may contain a path (including
>>>>>> slash).
>>>>>>>> Unlike a path, a path segment, per the URI spec, is something that
>>>>>>>> cannot contain slash. IOW, path parameter, path segment parameter as
>>>>>>>> well as e.g. query or schema parameters have all different encoding
>>>>>>>> schemes as they all define different sets of allowed characters.
>>>>>>>> I think your mistake is that you think encoding is dependent on the
>>>>>>>> method, but it is solely dependent on the target.
>>>>>>> No, the problem is that the RFC does not name a segment as being a
>>>>>> standalone part of a URI, but says a segment is part of the path part.
>>>>>> There is nothing like a "segment" part of a URI. So I understand what
>>>>>> you want to say, but it is just wrong to say that it depends on
>>>>>> the component, as a segment IS PART OF THE PATH component. As a
>>>>>> conclusion, as there is no segment component, the sole difference is
>>>>>> THE METHOD named either path() or segment(), as segment() IS WRITING
>>>>>> THE COMPONENT NAMED "PATH". Got the point?
>>>>>> Why would we need to constrain parameter encoding to standalone spec
>>>>>> components only? We support segment as a separate parameterized target
>>>>>> as part of the URI builder API. It's IMHO only natural to expect that
>>>>>> as part of this support we will also support the proper encoding for
>>>>>> this parameterized target, right?
>>>>> Sorry, but it is just NOT natural. People are used to splitting URIs into components known as scheme, host, port, path
>>>>> etc. just as those are supported by Java's URI class. But people are not used to the term "segment" necessarily. So
>>>>> you technically can do that, but please do not expect anybody to understand that segment() will encode differently
>>>>> than path() unless you clearly and explicitly tell how that works in the JavaDocs. To support my thesis of
>>>>> "non-naturality" please check this excerpt from the URI JavaDocs:
>>>>> All told, then, a URI instance has the following nine components:
>>>>> Component             Type
>>>>> scheme                String
>>>>> scheme-specific-part  String
>>>>> authority             String
>>>>> user-info             String
>>>>> host                  String
>>>>> port                  int
>>>>> path                  String
>>>>> query                 String
>>>>> fragment              String
>>>>> I do not see something like "segment" here, so you HAVE to explicitly explain the way it works and what the
>>>>> difference to path() is.
>>>>>>>> Exactly. :) To me, path and path segment are different targets for
>>>>>>>> the reasons explained above.
>>>>>>> I understand how you interpret it, but the WORDS OF THE RFC do NOT
>>>>>> say that a segment is a standalone entity; it is PART OF THE PATH (what
>>>>>> exactly do you not understand about that?). So it is NOT a different
>>>>>> target, so we have to change the JavaDocs. Our docs are not bound to
>>>>>> your interpretation but have to match the clear words of the RFC.
>>>>>> I disagree. Words of spec do not say that path segment is a separate
>>>>>> URI component. But the spec clearly recognizes path segment as a
>>>>>> separate entity (even as part of the URI grammar):
>>>>>> <quote>
>>>>>> A path consists of a sequence of path segments separated by a slash
>>>>>> ("/") character.
>>>>>> </quote>
>>>>> Are you joking? A segment is an entity but not a component. RFC 2396 clearly names the set of components to be:
>>>>> <first>/<second>;<third>?<fourth>
>>>>> alternatively
>>>>> <scheme>://<authority><path>?<query>
>>>>> and its table of contents clearly lists the *components* as exactly the following (not including anything named "segment"):
>>>>> 3.1. Scheme Component
>>>>> 3.2. Authority Component
>>>>> 3.3. Path Component -- This chapter uses the term "segment" just to explain how a path can be split into
>>>>> logical pieces, but I doubt that everybody will see at first glance that
>>>>> this is the justification for different encoding behaviour of *parameter values* (in contrast to *parameter
>>>>> definitions*).
>>>>> 3.4. Query Component
>>>>>> I do understand your reasoning now, but to me, we need to primarily
>>>>>> consider what we support in our API. In our API we clearly recognize
>>>>>> segment as a separate entity. UriBuilder.path("{template}") and
>>>>>> UriBuilder.segment("{template}") simply mean different things IMO, just
>>>>>> as UriBuilder.path("a/b") and UriBuilder.segment("a/b") do.
>>>>> Right. I am just saying that you must explicitly and unambiguously explain in the JavaDocs, in one single place, how a
>>>>> *parameter value* will be encoded. I ask for nothing else. Replace the rather abstract phrase and write the exact
>>>>> resolution for each of those *parameter definition methods* (as it is *not* a URI component, remember?).
>>>>>>>> And the target of both, path and segment, is the URI's path -- so
>>>>>>>> there is no difference (while there is one for the query part,
>>>>>> obviously)!
>>>>>>>> I disagree. Look at the URI spec and check the definition of path
>>>>>> and
>>>>>>>> path segment.
>>>>>>> Sorry, but would YOU please look at the definition of which components a
>>>>>> URI breaks up into. It is clearly stated that there is NOTHING like a
>>>>>> "segment" component, but ONLY a path, which splits up into segments.
>>>>>> But a segment IS A PATH, not a standalone normative target.
>>>>>> Hmm, segment is most certainly NOT a path. Otherwise using this twisted
>>>>>> logic I would have to conclude that if segment is a PATH then segment
>>>>>> is a standalone URI component, which is an obvious fallacy.
>>>>> There is nothing twisted in this logic but it is the exact definition of RFC 2396. A standalone segment IS a path.
>>>>> See how this undermines your argument about encoding being bound to components? It is solely bound to the
>>>>> *parameter definition method*.
>>>>>> Similarly, segment is clearly defined in URI grammar. What makes you
>>>>>> think that it is not a valid target of a URI-related API? Is query
>>>>>> parameter not a valid part of the API too? It is certainly not part of
>>>>>> the SPEC, worse there is no grammar in the spec that would define a
>>>>>> query parameter...
>>>>> Almost anybody would expect that "target" means "component", and the "component" of a segment is PATH. It is a valid
>>>>> target in the technical sense but it is ambiguous to the reader. Yes, there is nothing like a query parameter, for
>>>>> good reason, so a query parameter cannot be a valid target. Maybe the query syntax is totally fancy. So a valid
>>>>> target can be *any* part of the query, like replacing anything right in the middle of a string. The assumption that
>>>>> a query has parameters is invalid.
>>>>>> I'd say that the argumentation based on supporting purely standalone
>>>>>> URI components as part of a URI-based API does not hold. Both - query
>>>>>> parameter as well as path segment - are useful and valid URI targets
>>>>>> for the purpose of UriBuilder API and as such we need to treat them as
>>>>>> first-class citizens, including support for proper independent template
>>>>>> value encoding.
>>>>> I never said it is not useful. I said you must clearly explain the encoding by giving examples for all of the
>>>>> parameter definition methods. Nothing else. You are mixing things up. I never asked to reduce the functionality. I only
>>>>> demand unambiguous JavaDocs. What is so hard to understand about that?
>>>>>> Also, please consider the following examples:
>>>>>> UriBuilder.from("").userInfo("a/b").path("a/b").segment("a/b").build();
>>>>>> UriBuilder.from("").userInfo("{t1}").path("{t2}").segment("{t3}").build("a/b", "a/b", "a/b");
>>>>>> UriBuilder.from("").userInfo("{t1}").path("{t1}").segment("{t1}").build("a/b");
>>>>>> IMHO they all mean the same thing and should produce the same URI:
>>>>> See, this is the difference in our view: Our team would expect DIFFERENT results, for good reason: there is a
>>>>> difference between parameter definition and parameter provision, as you are not building a pure string API.
>>>>> path("a/b").build("") and path("{t1}").build("a/b") shall behave differently. The first shall render as "a/b" (as the
>>>>> slash is in the parameter definition), the second shall render as "a%2Fb" (as the slash is in the parameter value).
>>>>> This is analogous to the JDBC API, which distinguishes between quotes in the parameter definition and quotes in the
>>>>> parameter value. In JDBC, quotes in the parameter definition are literals, but quotes in the parameter provision get
>>>>> escaped. This makes lots of sense, is totally logical, and should be applied to JAX-RS instead of your self-invented
>>>>> idea of additional URI parts. BTW, we do not actually need segment(): We can just use path().path().path() to
>>>>> build the same.
>>>>> Got my idea?
>>>> Yes. I think I understand your reasoning. (That does not mean I agree with it or support your proposal in JAX_RS_SPEC-70
>>>> though.)
>>>> Check the JAX-RS 1.x UriBuilder javadoc for path(String) and segment(String...):
>>>> It contains an explicit and unambiguous explanation of the difference between encoding segment and path template values as
>>>> you requested earlier in your email. What we should do in this space wrt issue JAX_RS_SPEC-70 is to improve the summary
>>>> information about encoding the template values in the class-level javadoc (once we reach an agreement), which is
>>>> currently incomplete and does not explicitly take path segments or matrix parameters into account even though they are
>>>> clearly supported by the API as independent contextually encoded entities.
>>>> I hope that from the referenced javadoc it is clear that in order to support your proposal in JAX_RS_SPEC-70 we would
>>>> need to break the UriBuilder javadoc, which means breaking BW compatibility of the API at the application level, which
>>>> we simply cannot do.
>>>> As I tried to outline earlier, what you want to achieve can be provided via UriBuilder.segment(...) method. This would
>>>> preserve BW-compatibility with the 1.x API.
>>>> One other option that comes to my mind is to add a new build(...) method that would take a flag indicating if the
>>>> slashes in values should be encoded for all templates in path component or not. Something like this:
>>>> public abstract URI build(Object... values, boolean encodeSlashInPath) throws IllegalArgumentException,
>>>> UriBuilderException
>>>> The default behavior of build(Object... values) would correspond to build(values, false). The behavior you request could
>>>> be achieved by build(values, true).
>>>> If you see another solution that would be BW-compatible, and still not too complicated let me know.
>>>> Marek
>>>>> Regards
>>>>> Markus
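
The path() versus segment() distinction argued over in the quoted thread can be illustrated with a small stand-alone sketch. MiniPathBuilder is a hypothetical class, not the JAX-RS UriBuilder, and as a simplifying assumption it only deals with the '/' character: path() keeps a slash as a separator between segments, while segment() percent-encodes it so that one argument always stays one segment.

```java
// Hypothetical mini builder for illustration only; NOT the JAX-RS UriBuilder.
final class MiniPathBuilder {
    private final StringBuilder path = new StringBuilder();

    // Appends a path; any '/' in the argument is kept as a segment separator.
    MiniPathBuilder path(String p) {
        append(p);
        return this;
    }

    // Appends one or more segments; a '/' inside a segment is percent-encoded.
    MiniPathBuilder segment(String... segments) {
        for (String s : segments) {
            append(s.replace("/", "%2F"));
        }
        return this;
    }

    // Joins the new piece to the existing path with exactly one '/' between them.
    private void append(String piece) {
        boolean endsWithSlash = path.length() > 0 && path.charAt(path.length() - 1) == '/';
        boolean startsWithSlash = piece.startsWith("/");
        if (endsWithSlash && startsWithSlash) {
            piece = piece.substring(1);
        } else if (path.length() > 0 && !endsWithSlash && !startsWithSlash) {
            path.append('/');
        }
        path.append(piece);
    }

    String build() {
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(new MiniPathBuilder().path("a/b").build());     // two segments
        System.out.println(new MiniPathBuilder().segment("a/b").build());  // one segment
    }
}
```

Under these assumptions, path("a/b") contributes two segments and segment("a/b") contributes one, which is the difference both sides of the thread want the javadoc to state explicitly.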

Sergey Beryozkin
Talend Community Coders