[jax-rs-spec users] [jsr339-experts] Re: HEADS-UP: Encoding values of UriBuilder template parameters

From: Marek Potociar <>
Date: Thu, 15 Dec 2011 16:07:29 +0100

Hi Markus,

I gave it some more thought, trying to look at it from a different angle. Since the method javadoc does not explicitly
talk about encoding slashes in path template values, this is what I think should be the resolution, which you will
hopefully find satisfying:

1. Add a new method to UriBuilder:
public abstract URI build(Object[] values, boolean encodeSlashInPath)
throws IllegalArgumentException, UriBuilderException

2. Update javadoc of the existing build(Object... values) to mention
that the behavior corresponds to build(values, true).

3. Elaborate more on encoding template parameter values in the class
level javadoc.

In summary, the build(Object...) method will behave as you requested. At the same time, the new method will act as a
fall-back for anyone who would depend on the current behavior.
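
To illustrate the intended semantics, here is a minimal, self-contained sketch of the flag's effect on template values.
It is not the UriBuilder implementation: the class PathTemplateSketch and its resolve and encodeValue helpers are
hypothetical names made up for this example, it handles only the path, and it takes the flag first so that the values
can stay a varargs parameter. Slashes written literally in the template are always kept; only slashes inside supplied
values are affected by the flag.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JAX-RS: resolves {name} templates in a path.
final class PathTemplateSketch {

    private static final Pattern TEMPLATE = Pattern.compile("\\{([^{}]+)\\}");

    // RFC 3986 "unreserved" characters are never percent-encoded here.
    private static boolean isUnreserved(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')
                || c == '-' || c == '.' || c == '_' || c == '~';
    }

    // Percent-encodes a template value; '/' is kept only when the flag is false.
    static String encodeValue(String value, boolean encodeSlashInPath) {
        StringBuilder out = new StringBuilder();
        for (byte raw : value.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (isUnreserved(b) || (b == '/' && !encodeSlashInPath)) {
                out.append((char) b);
            } else {
                out.append(String.format("%%%02X", b));
            }
        }
        return out.toString();
    }

    // Values are bound to template names in order of first appearance;
    // a repeated name reuses the value already bound to it.
    static String resolve(String template, boolean encodeSlashInPath, Object... values) {
        Map<String, String> bound = new HashMap<>();
        Matcher m = TEMPLATE.matcher(template);
        StringBuilder out = new StringBuilder();
        int nextValue = 0;
        int copiedUpTo = 0;
        while (m.find()) {
            String name = m.group(1);
            if (!bound.containsKey(name)) {
                if (nextValue >= values.length) {
                    throw new IllegalArgumentException("No value for template {" + name + "}");
                }
                bound.put(name, encodeValue(String.valueOf(values[nextValue++]), encodeSlashInPath));
            }
            out.append(template, copiedUpTo, m.start()); // literal text, copied untouched
            out.append(bound.get(name));
            copiedUpTo = m.end();
        }
        out.append(template, copiedUpTo, template.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("/docs/{name}", true, "a/b"));  // /docs/a%2Fb
        System.out.println(resolve("/docs/{name}", false, "a/b")); // /docs/a/b
    }
}
```

With the flag set, the slash in the value is escaped and the value stays a single segment; with it cleared, the same
value spans two segments.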

Let me know if this works.


On 12/08/2011 11:07 AM, Marek Potociar wrote:
> On 12/07/2011 07:55 PM, Markus KARG wrote:
>>>>>> Actually I have to disagree with your conclusion! I do not see any
>>>>> difference in path parameters vs. segment parameters.
>>>>> Path parameter is something that may contain a path (including
>>> slash).
>>>>> Unlike a path, a path segment, per the URI spec, is something that
>>>>> cannot contain a slash. IOW, path parameter, path segment parameter as
>>>>> well as e.g. query or schema parameters all have different encoding
>>>>> schemes as they all define different sets of allowed characters.
>>>>> I think your mistake is that you think encoding depends on the
>>>>> method, but it depends solely on the target.
>>>> No, the problem is that the RFC does not name a segment as being a
>>> standalone part of a URI, but says a segment is part of the path part.
>>> There is nothing like a "segment" part of a URI. So I understand what
>>> you would like to say, but it is just wrong to say that it depends on
>>> the component, as a segment IS PART OF THE PATH component. In
>>> conclusion, as there is no segment component, the sole difference is
>>> THE METHOD named either path() or segment(), as segment() IS WRITING
>>> THE COMPONENT NAMED "PATH". Got the point?
>>> Why would we need to constrain parameter encoding to standalone spec
>>> components only? We support segment as a separate parameterized target
>>> as part of the URI builder API. It's IMHO only natural to expect that
>>> as part of this support we will also support the proper encoding for
>>> this parameterized target, right?
>> Sorry but it is just NOT natural. People are used to splitting URIs into components known as scheme, host, port, path etc., just as those are supported by Java's URI class. But people are not necessarily used to the term "segment". So you technically can do that, but please do not expect anybody to understand that segment() will encode differently than path() unless you clearly and explicitly tell how that works in the JavaDocs. To support my thesis of "non-naturality" please check this excerpt from the URI JavaDocs:
>> All told, then, a URI instance has the following nine components:
>>   Component              Type
>>   scheme                 String
>>   scheme-specific-part   String
>>   authority              String
>>   user-info              String
>>   host                   String
>>   port                   int
>>   path                   String
>>   query                  String
>>   fragment               String
>> I do not see something like "segment" here, so you HAVE to explicitly explain the way it works and what the difference to path() is.
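
The component list quoted above maps directly onto accessors of java.net.URI, none of which exposes a segment. A
minimal sketch, assuming a made-up example URI and a hypothetical helper class UriComponentsSketch: segments have to be
derived by splitting the raw path, because getPath() already decodes %2F, after which an escaped slash can no longer be
told apart from a segment separator.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only.
final class UriComponentsSketch {

    // Splits the still-encoded path on '/', so an escaped slash (%2F) stays inside its segment.
    static List<String> rawSegments(URI uri) {
        List<String> segments = new ArrayList<>();
        String rawPath = uri.getRawPath();
        if (rawPath == null || rawPath.isEmpty()) {
            return segments;
        }
        String trimmed = rawPath.startsWith("/") ? rawPath.substring(1) : rawPath;
        for (String segment : trimmed.split("/", -1)) {
            segments.add(segment);
        }
        return segments;
    }

    public static void main(String[] args) {
        // Example URI invented for this sketch.
        URI uri = URI.create("http://user@example.com:8080/a/b%2Fc?q=1#top");

        System.out.println(uri.getScheme());
        System.out.println(uri.getSchemeSpecificPart());
        System.out.println(uri.getAuthority());
        System.out.println(uri.getUserInfo());
        System.out.println(uri.getHost());
        System.out.println(uri.getPort());
        System.out.println(uri.getPath());    // /a/b/c  (decoded: looks like three segments)
        System.out.println(uri.getQuery());
        System.out.println(uri.getFragment());

        System.out.println(uri.getRawPath()); // /a/b%2Fc
        System.out.println(rawSegments(uri)); // [a, b%2Fc]
    }
}
```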
>>>>> Exactly. :) To me, path and path segment are different targets for
>>>>> the reasons explained above.
>>>> I understand how you interpret it, but the WORDS OF THE RFC do NOT
>>> say that segment is a standalone entity but it is PART OF PATH (what
>>> exactly do you not understand about that?). So it is NOT a different
>>> target, so we have to change the JavaDocs. Our docs are not bound to
>>> your interpretation but have to match the clear words of the RFC.
>>> I disagree. The words of the spec do not say that path segment is a
>>> separate URI component. But the spec clearly recognizes path segment as a
>>> separate entity (even as part of the URI grammar):
>>> <quote>
>>> A path consists of a sequence of path segments separated by a slash
>>> ("/") character.
>>> </quote>
>> Are you joking? A segment is an entity but not a component. RFC 2396 clearly names the set of components to be:
>> <first>/<second>;<third>?<fourth>
>> alternatively
>> <scheme>://<authority><path>?<query>
>> and it lists the *components* clearly in the chapter list as exactly the following (not including anything named "segment"):
>> 3.1. Scheme Component
>> 3.2. Authority Component
>> 3.3. Path Component -- This chapter includes the term "segment" just to explain how a path can be split into logical pieces, but I doubt that all people will necessarily be so bright as to see at first glance that it is the justification for having different encoding behaviour of *parameter values* (in contrast to *parameter definitions*).
>> 3.4. Query Component
>>> I do understand your reasoning now, but to me, we need to primarily
>>> consider what we support in our API. In our API we clearly recognize
>>> segment as a separate entity. UriBuilder.path("{template}") and
>>> UriBuilder.segment("{template}") simply mean different things IMO, just
>>> as UriBuilder.path("a/b") and UriBuilder.segment("a/b") do.
>> Right. I just say that you must explicitly and unambiguously explain in the JavaDocs, in one single place, how a *parameter value* will be encoded. I ask for nothing else. Replace the rather abstract phrase and write the exact resolution for each of those *parameter definition methods* (as it is *not* a URI component, remember?).
>>>>> And the target of both, path and segment, is the URI's path -- so
>>>>> there is no difference (while there is one for the query part,
>>> obviously)!
>>>>> I disagree. Look at the URI spec and check the definition of path
>>> and
>>>>> path segment.
>>>> Sorry but, do YOU please look at the definition of which components a
>>> URI breaks up into. It is clearly stated that there is NOTHING like a
>>> "segment" component, but ONLY a path, which splits up into segments.
>>> But a segment IS A PATH, not a standalone normative target.
>>> Hmm, segment is most certainly NOT a path. Otherwise using this twisted
>>> logic I would have to conclude that if segment is a PATH then segment
>>> is a standalone URI component, which is an obvious fallacy.
>> There is nothing twisted in this logic but it is the exact definition of RFC 2396. A standalone segment IS a path. See how this screws your argumentation about encoding being bound to components? It is solely bound to the *parameter definition method*.
>>> Similarly, segment is clearly defined in URI grammar. What makes you
>>> think that it is not a valid target of a URI-related API? Is query
>>> parameter not a valid part of the API too? It is certainly not part of
>>> the SPEC, worse there is no grammar in the spec that would define a
>>> query parameter...
>> Almost anybody would expect that "target" means "component", and the "component" of a segment is PATH. It is a valid target in the technical sense but it is ambiguous to the reader. Yes, there is nothing like a query parameter, for good reason, so a query parameter cannot be a valid target. Maybe the query syntax is totally fancy. So a valid target can be *any* part of the query, like replacing anything right in the middle of a string. The assumption that a query has parameters is invalid.
>>> I'd say that the argumentation based on supporting purely standalone
>>> URI components as part of a URI-based API does not hold. Both - query
>>> parameter as well as path segment - are useful and valid URI targets
>>> for the purpose of UriBuilder API and as such we need to treat them as
>>> first-class citizens, including support for proper independent template
>>> value encoding.
>> I never said it is not useful. I said you must clearly explain the encoding by giving examples for all of the parameter definition methods. Nothing else. You mix up things. I never asked to reduce the functionality. I only demand unambiguous JavaDocs. What is so hard to understand with that?
>>> Also, please consider the following examples:
>>> UriBuilder.from("").userInfo("a/b").path("a/b").segment("a/b").build();
>>> UriBuilder.from("").userInfo("{t1}").path("{t2}").segment("{t3}").build("a/b", "a/b", "a/b");
>>> UriBuilder.from("").userInfo("{t1}").path("{t1}").segment("{t1}").build("a/b");
>>> IMHO they all mean the same thing and should produce the same URI:
>> See, this is the difference in our view: Our team would expect DIFFERENT results, for good reason. There is a difference between parameter definition and parameter provision, as you are not building a pure string API. path("a/b").build("") and path("{t1}").build("a/b") shall behave differently. The first shall render as "a/b" (as the slash is in the parameter definition), the second shall render as "a%2Fb" (as the slash is in the parameter value). This is analogous to how the JDBC API distinguishes between quotes in the parameter definition and quotes in the parameter value. In JDBC, quotes in the parameter definition are literals, but quotes in the parameter provision get escaped. This makes lots of sense, is totally logical, and should be applied to JAX-RS instead of your self-extended idea of additional URI parts. BTW, we do not actually need segment(): We can just use path().path().path() to build the same.
>> Got my idea?
> Yes. I think I understand your reasoning. (That does not mean I agree with it or support your proposal in JAX_RS_SPEC-70
> though.)
> Check the JAX-RS 1.x UriBuilder javadoc for path(String) and segment(String...):
> It contains explicit and unambiguous explanation of the difference between encoding segment and path template values as
> you requested earlier in your email. What we should do in this space wrt issue JAX_RS_SPEC-70 is to improve the summary
> information about encoding the template values in the class-level javadoc (once we reach an agreement), which is
> currently incomplete and does not explicitly take path segments or matrix parameters into account even though they are
> clearly supported by the API as independent contextually encoded entities.
> I hope that from the referenced javadoc it is clear that in order to support your proposal in JAX_RS_SPEC-70 we would
> need to break the UriBuilder javadoc, which means breaking BW compatibility of the API at the application level, which
> we simply cannot do.
> As I tried to outline earlier, what you want to achieve can be provided via UriBuilder.segment(...) method. This would
> preserve BW-compatibility with the 1.x API.
> One other option that comes to my mind is to add a new build(...) method that would take a flag indicating whether the
> slashes in values should be encoded for all templates in the path component or not. Something like this:
> public abstract URI build(Object[] values, boolean encodeSlashInPath) throws IllegalArgumentException, UriBuilderException
> The default behavior of build(Object... values) would correspond to build(values, false). The behavior you request could
> be achieved by build(values, true).
> If you see another solution that would be BW-compatible, and still not too complicated let me know.
> Marek
>> Regards
>> Markus