users@jax-rs-spec.java.net

[jax-rs-spec users] [jsr339-experts] Re: HEADS-UP: Encoding values of UriBuilder template parameters

From: Marek Potociar <marek.potociar_at_oracle.com>
Date: Thu, 15 Dec 2011 17:54:53 +0100

On 12/15/2011 04:30 PM, Sergey Beryozkin wrote:
> On 15/12/11 15:07, Marek Potociar wrote:
>> Hi Markus,
>>
>> I gave it some more thought, trying to look at it from a different angle, and since the method javadoc does not explicitly
>> talk about encoding slashes in path template values, this is what I think should be a resolution that you will hopefully
>> find satisfying:
>>
>> 1. Add a new method to UriBuilder:
>> public abstract URI build(Object[] values, boolean encodeSlashInPath)
>> throws IllegalArgumentException, UriBuilderException
>>
>
> so what will
>
> uriBuilder.segment("a/b/c").build(false) mean? In other words, what takes precedence: the segment() processing
> rules as per its javadocs, or the will of the user in build(false)?

Sergey, judging from your question I don't think you are familiar with the issue. The code you wrote makes no sense...

Marek

>
>
> Sergey
>
>
>> 2. Update javadoc of the existing build(Object... values) to mention
>> that the behavior corresponds to build(values, true).
>>
>> 3. Elaborate more on encoding template parameter values in the class
>> level javadoc.
>>
>> In summary, the build(Object...) method will behave as you requested. At the same time, the new method will act as a
>> fall-back for anyone who would depend on the current behavior.
>>
>> Let me know if this works.
>>
>> Marek
>>
>> On 12/08/2011 11:07 AM, Marek Potociar wrote:
>>>
>>>
>>> On 12/07/2011 07:55 PM, Markus KARG wrote:
>>>>>>>> Actually I have to disagree with your conclusion! I do not see any
>>>>>>>> difference in path parameters vs. segment parameters.
>>>>>>>
>>>>>>> Path parameter is something that may contain a path (including
>>>>>>> slash).
>>>>>>> Unlike a path, a path segment, per the URI spec, is something that
>>>>>>> cannot contain a slash. IOW, path parameter, path segment parameter as
>>>>>>> well as e.g. query or scheme parameters all have different encoding
>>>>>>> schemes as they all define different sets of allowed characters.
>>>>>>>
>>>>>>> I think your fault is that you think encoding is dependent on the
>>>>>>> method, but it is solely dependent on the target.
>>>>>>
>>>>>> No, the problem is that the RFC does not name a segment as being a
>>>>>> standalone part of a URI, but says a segment is part of the path part.
>>>>>> There is nothing like a "segment" part of a URI. So I understand what
>>>>>> you would like to say, but it is just wrong to say that it is dependent on
>>>>>> the component, as a segment IS PART OF THE PATH component. As a
>>>>>> conclusion, as there is no segment component, the sole difference is
>>>>>> THE METHOD named either path() or segment(), as segment() IS WRITING
>>>>>> THE COMPONENT NAMED "PATH". Got the point?
>>>>>
>>>>> Why would we need to constrain parameter encoding to standalone spec
>>>>> components only? We support segment as a separate parameterized target
>>>>> as part of the URI builder API. It's IMHO only natural to expect that
>>>>> as part of this support we will also support the proper encoding for
>>>>> this parameterized target, right?
>>>>
>>>> Sorry but it is just NOT natural. People are used to split URIs into components known as scheme, host, port, path
>>>> etc. just as those are supported by Java's URI class. But people are not used to the term "segment" necessarily. So
>>>> you technically can do that, but please do not expect anybody to understand that segment() will encode differently
>>>> than path() unless you clearly and explicitly tell how that works in the JavaDocs. To support my thesis of
>>>> "non-naturality" please check this excerpt from the URI JavaDocs:
>>>>
>>>> All told, then, a URI instance has the following nine components:
>>>>
>>>> Component              Type
>>>> scheme                 String
>>>> scheme-specific-part   String
>>>> authority              String
>>>> user-info              String
>>>> host                   String
>>>> port                   int
>>>> path                   String
>>>> query                  String
>>>> fragment               String
>>>>
>>>> I do not see something like "segment" here, so you HAVE to explicitly explain the way it works and what the
>>>> difference to path() is.
>>>>
>>>>>>> Exactly. :) To me, path and path segment are different targets for
>>>>>>> the reasons explained above.
>>>>>>
>>>>>> I understand how you interpret it, but the WORDS OF THE RFC do NOT
>>>>>> say that segment is a standalone entity but that it is PART OF PATH (what
>>>>>> actually do you not understand about that?). So it is NOT a different
>>>>>> target, so we have to change the JavaDocs. Our docs are not bound to
>>>>>> your interpretation but have to match the clear words of the RFC.
>>>>>
>>>>> I disagree. Words of spec do not say that path segment is a separate
>>>>> URI component. But the spec clearly recognizes path segment as a
>>>>> separate entity (even as part of the URI grammar):
>>>>>
>>>>> <quote>
>>>>> A path consists of a sequence of path segments separated by a slash
>>>>> ("/") character.
>>>>> </quote>
>>>>
>>>> Are you joking? A segment is an entity but not a component. RFC 2396 clearly names the set of components to be:
>>>>
>>>> <first>/<second>;<third>?<fourth>
>>>>
>>>> alternatively
>>>>
>>>> <scheme>://<authority><path>?<query>
>>>>
>>>> and it lists the *components* clearly in the chapters list as to be exactly (not including anything named "segment"):
>>>>
>>>> 3.1. Scheme Component
>>>>
>>>> 3.2. Authority Component
>>>>
>>>> 3.3. Path Component -- This chapter includes the term "segment" just to explain how a path can be split into
>>>> logical pieces, but I doubt that all people will necessarily be bright enough to see at first glance that
>>>> it is the justification for having different encoding behaviour of *parameter values* (in contrast to *parameter
>>>> definitions*).
>>>>
>>>> 3.4. Query Component
>>>>
>>>>> I do understand your reasoning now, but to me, we need to primarily
>>>>> consider what we support in our API. In our API we clearly recognize
>>>>> segment as a separate entity. UriBuilder.path("{template}") and
>>>>> UriBuilder.segment("{template}") simply mean different things IMO, just
>>>>> as UriBuilder.path("a/b") and UriBuilder.segment("a/b") do.
>>>>
>>>> Right. I just say that you must explicitly and unambiguously explain in the JavaDocs, in one single place, how a
>>>> *parameter value* will be encoded. I ask for nothing else. Replace the rather abstract phrase and write the exact
>>>> resolution for each of the *parameter definition methods* (as it is *not* a URI component, remember?).
>>>>
>>>>>>> And the target of both, path and segment, is the URI's path -- so
>>>>>>> there is no difference (while there is one for the query part,
>>>>>>> obviously)!
>>>>>>>
>>>>>>> I disagree. Look at the URI spec and check the definition of path
>>>>>>> and path segment.
>>>>>>
>>>>>> Sorry, but would YOU please look at the definition of which components a
>>>>>> URI breaks up into. It clearly says that there is NOTHING like a
>>>>>> "segment" component, but ONLY a path, which splits up into segments.
>>>>>> But a segment IS A PATH, not a standalone normative target.
>>>>>
>>>>> Hmm, segment is most certainly NOT a path. Otherwise using this twisted
>>>>> logic I would have to conclude that if segment is a PATH then segment
>>>>> is a standalone URI component, which is an obvious fallacy.
>>>>
>>>> There is nothing twisted in this logic but it is the exact definition of RFC 2396. A standalone segment IS a path.
>>>> See how this screws your argumentation about encoding being bound to components? It is solely bound to the
>>>> *parameter definition method*.
>>>>
>>>>> Similarly, segment is clearly defined in URI grammar. What makes you
>>>>> think that it is not a valid target of a URI-related API? Is query
>>>>> parameter not a valid part of the API too? It is certainly not part of
>>>>> the SPEC, worse there is no grammar in the spec that would define a
>>>>> query parameter...
>>>>
>>>> Almost anybody would expect that "target" means "component", and the "component" of a segment is PATH. It is a valid
>>>> target in the technical sense but it is ambiguous to the reader. Yes, there is nothing like a query parameter by
>>>> good reason, so a query parameter cannot be a valid target. Maybe the query syntax is totally fancy. So a valid
>>>> target can be *any* part of the query, like replacing anything right in the middle of a string. The assumption that
>>>> a query has parameters is invalid.
>>>>
>>>>> I'd say that the argumentation based on supporting purely standalone
>>>>> URI components as part of a URI-based API does not hold. Both - query
>>>>> parameter as well as path segment - are useful and valid URI targets
>>>>> for the purpose of UriBuilder API and as such we need to treat them as
>>>>> first-class citizens, including support for proper independent template
>>>>> value encoding.
>>>>
>>>> I never said it is not useful. I said you must clearly explain the encoding by giving examples for all of the
>>>> parameter definition methods. Nothing else. You mix up things. I never asked to reduce the functionality. I only
>>>> demand unambiguous JavaDocs. What is so hard to understand with that?
>>>>
>>>>>
>>>>> Also, please consider the following examples:
>>>>>
>>>>> UriBuilder.fromUri("http://example.com/").userInfo("a/b").path("a/b").segment("a/b").build();
>>>>> UriBuilder.fromUri("http://example.com/").userInfo("{t1}").path("{t2}").segment("{t3}").build("a/b", "a/b", "a/b");
>>>>> UriBuilder.fromUri("http://example.com/").userInfo("{t1}").path("{t1}").segment("{t1}").build("a/b");
>>>>>
>>>>> IMHO they all mean the same thing and should produce the same URI:
>>>>>
>>>>> http://a%2Fb@example.com/a/b/a%2Fb
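A minimal sketch of the reading described above, in which the encoding depends on the target (user info, path, or segment) rather than on whether the text came from a literal or a template value. This is not the JAX-RS implementation: the class, the Target enum, and both methods are hypothetical names, and the set of characters left unencoded (ASCII letters, digits, and "-._~") is a simplifying assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the JAX-RS UriBuilder implementation.
class ContextEncodingSketch {

    // Hypothetical targets, named after the UriBuilder methods used in the example.
    enum Target { USER_INFO, PATH, SEGMENT }

    // Percent-encodes every byte except ASCII letters, digits and "-._~".
    // A slash is left as-is only when the target is PATH.
    static String encode(String value, Target target) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || "-._~".indexOf(c) >= 0;
            if (unreserved || (c == '/' && target == Target.PATH)) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Assembles the example URI from three values, one per target.
    static String buildExample(String t1, String t2, String t3) {
        return "http://" + encode(t1, Target.USER_INFO) + "@example.com/"
                + encode(t2, Target.PATH) + "/" + encode(t3, Target.SEGMENT);
    }

    public static void main(String[] args) {
        // prints http://a%2Fb@example.com/a/b/a%2Fb
        System.out.println(buildExample("a/b", "a/b", "a/b"));
    }
}
```

Under these assumptions the same input "a/b" yields three different encodings depending only on where it lands in the URI.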
>>>>
>>>> See, this is the difference in our view: Our team would expect DIFFERENT results, for good reason: there is a
>>>> difference between parameter definition and parameter provision, as you are not building a pure string API.
>>>> path("a/b").build("") and path("{t1}").build("a/b") shall behave differently. The first shall render as "a/b" (as the
>>>> slash is in the parameter definition), the second shall render as "a%2Fb" (as the slash is in the parameter value).
>>>> This is analogous to how the JDBC API distinguishes between quotes in the parameter definition and quotes in the
>>>> parameter value. In JDBC, quotes in the parameter definition are literals, but quotes in the parameter provision get
>>>> escaped. This makes lots of sense, is totally logical, and should be applied to JAX-RS instead of your self-extended
>>>> idea of additional URI parts. BTW, we do not actually need segment(): We can just use path().path().path() to
>>>> build the same.
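The distinction drawn above between parameter definition and parameter value can be sketched as follows. This is only an illustration of that proposal, not the behavior of any JAX-RS release: TemplateValueSketch, encodeValue, and expand are hypothetical names, values are matched to templates by position (the real UriBuilder reuses one value for repeated names), and the sketch assumes at least one value is supplied per template.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "literals stay, values get encoded"; not a JAX-RS API.
class TemplateValueSketch {

    private static final Pattern PARAM = Pattern.compile("\\{\\w+\\}");

    // Encodes a supplied value, including any slash it contains.
    static String encodeValue(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || "-._~".indexOf(c) >= 0;
            if (unreserved) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Copies the literal parts of the definition verbatim and substitutes
    // each {name} with the next encoded value.
    static String expand(String definition, String... values) {
        Matcher m = PARAM.matcher(definition);
        StringBuilder out = new StringBuilder();
        int last = 0;
        int next = 0;
        while (m.find()) {
            out.append(definition, last, m.start());
            out.append(encodeValue(values[next++]));
            last = m.end();
        }
        out.append(definition.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("a/b"));          // slash in the definition: a/b
        System.out.println(expand("{t1}", "a/b"));  // slash in the value: a%2Fb
    }
}
```

As with a JDBC PreparedStatement, the definition is treated as trusted syntax and only the supplied values are escaped.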
>>>>
>>>> Got my idea?
>>>
>>> Yes. I think I understand your reasoning. (That does not mean I agree with it or support your proposal in JAX_RS_SPEC-70
>>> though.)
>>>
>>> Check the JAX-RS 1.x UriBuilder javadoc for path(String) and segment(String...):
>>>
>>> http://jsr311.java.net/nonav/releases/1.1/javax/ws/rs/core/UriBuilder.html#path%28java.lang.String%29
>>> http://jsr311.java.net/nonav/releases/1.1/javax/ws/rs/core/UriBuilder.html#segment%28java.lang.String...%29
>>>
>>> It contains an explicit and unambiguous explanation of the difference between encoding segment and path template values as
>>> you requested earlier in your email. What we should do in this space wrt issue JAX_RS_SPEC-70 is to improve the summary
>>> information about encoding the template values in the class-level javadoc (once we reach an agreement), which is
>>> currently incomplete and does not explicitly take path segments or matrix parameters into account even though they are
>>> clearly supported by the API as independent contextually encoded entities.
>>>
>>> I hope that from the referenced javadoc it is clear that in order to support your proposal in JAX_RS_SPEC-70 we would
>>> need to break the UriBuilder javadoc, which means breaking BW compatibility of the API at the application level, which
>>> we simply cannot do.
>>>
>>> As I tried to outline earlier, what you want to achieve can be provided via UriBuilder.segment(...) method. This would
>>> preserve BW-compatibility with the 1.x API.
>>>
>>> One other option that comes to my mind is to add a new build(...) method that would take a flag indicating if the
>>> slashes in values should be encoded for all templates in the path component or not. Something like this:
>>>
>>> public abstract URI build(Object[] values, boolean encodeSlashInPath) throws IllegalArgumentException,
>>> UriBuilderException
>>>
>>> The default behavior of build(Object... values) would correspond to build(values, false). The behavior you request could
>>> be achieved by build(values, true).
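A rough sketch of how such a flag could behave, under the semantics given here (false leaves slashes in path template values alone, true encodes them). Everything in it is hypothetical: EncodeSlashFlagSketch and its methods are illustration only, the result is a String rather than a URI, and templates are filled by position. Since Java requires a varargs parameter to be the last one, the sketch takes the values as Object[].

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed flag; not the UriBuilder implementation.
class EncodeSlashFlagSketch {

    // Encodes a path template value; '/' is encoded only if the flag is set.
    static String encode(String value, boolean encodeSlashInPath) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            boolean unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9') || "-._~".indexOf(c) >= 0;
            if (unreserved || (c == '/' && !encodeSlashInPath)) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Fills {name} templates in a path, in order, with the encoded values.
    static String build(String pathTemplate, Object[] values, boolean encodeSlashInPath) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Object v : values) {
            int open = pathTemplate.indexOf('{', pos);
            int close = pathTemplate.indexOf('}', open);
            if (open < 0 || close < 0) {
                break; // no template left; extra values are ignored
            }
            out.append(pathTemplate, pos, open);
            out.append(encode(String.valueOf(v), encodeSlashInPath));
            pos = close + 1;
        }
        out.append(pathTemplate.substring(pos));
        return out.toString();
    }

    public static void main(String[] args) {
        Object[] values = {"a/b"};
        System.out.println(build("/files/{path}", values, false)); // /files/a/b
        System.out.println(build("/files/{path}", values, true));  // /files/a%2Fb
    }
}
```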
>>>
>>> If you see another solution that would be BW-compatible, and still not too complicated let me know.
>>>
>>> Marek
>>>
>>>>
>>>> Regards
>>>> Markus
>>>>
>>>>
>
>