users@jax-rs-spec.java.net

[jax-rs-spec users] [jsr339-experts] Re: HEADS-UP: Encoding values of UriBuilder template parameters

From: Markus KARG <markus_at_headcrashing.eu>
Date: Wed, 7 Dec 2011 19:55:24 +0100

> >>> Actually I have to disagree with your conclusion! I do not see any
> >> difference in path parameters vs. segment parameters.
> >>
> >> Path parameter is something that may contain a path (including slash).
> >> Unlike path, a path segment, from the URI spec is something that
> >> cannot contain slash. IOW, path parameter, path segment parameter as
> >> well as e.g. query or schema parameters have all different encoding
> >> schemes as they all define different sets of allowed characters.
> >>
> >> I think your fault is that you think encoding is dependent on the
> >> method, but it is solely dependent on the target.
> >
> > No the problem is that the RFC does not name a segment as being a
> standalone part of an URI, but says a segment is part of the path part.
> There is nothing like a "segment" part of an URI. So I understand what
> you like to tell, but it is just wrong to say that it is dependent of
> the component, as a segment IS PART OF THE PATH component. As a
> conclusion, as there is no segment component, the sole difference is
> THE METHOD named either path() or segment(), as segment() IS WRITING
> THE COMPONENT NAMED "PATH". Got the point?
>
> Why would we need to constrain parameter encoding to standalone spec
> components only? We support segment as a separate parameterized target
> as part of the URI builder API. It's IMHO only natural to expect that
> as part of this support we will also support the proper encoding for
> this parameterized target, right?

Sorry, but it is just NOT natural. People are used to splitting URIs into components such as scheme, host, port, path, etc., just as Java's URI class supports them. But people are not necessarily familiar with the term "segment". So technically you can do that, but please do not expect anybody to understand that segment() encodes differently from path() unless the JavaDocs state clearly and explicitly how that works. To support my claim that this is not natural, please check this excerpt from the URI JavaDocs:

All told, then, a URI instance has the following nine components:

Component               Type
scheme                  String
scheme-specific-part    String
authority               String
user-info               String
host                    String
port                    int
path                    String
query                   String
fragment                String

I do not see anything like "segment" here, so you HAVE to explain explicitly how it works and how it differs from path().
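To illustrate the point about Java's URI class, here is a minimal sketch using only java.net.URI. It shows that the class has accessors for the components listed above but none for a segment, so segments have to be split out of the raw path by hand. The class name UriComponentsSketch, the helper rawSegments() and the example URI are assumptions made up for this illustration; they are not part of any API.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

class UriComponentsSketch {

    // Hypothetical helper: java.net.URI offers no segment accessor,
    // so the raw (still encoded) path is split at "/" manually.
    static List<String> rawSegments(URI uri) {
        List<String> segments = new ArrayList<>();
        String rawPath = uri.getRawPath();
        if (rawPath == null || rawPath.isEmpty()) {
            return segments; // opaque URI or empty path: no segments
        }
        // Drop the leading slash so it does not yield an empty first entry.
        String relative = rawPath.startsWith("/") ? rawPath.substring(1) : rawPath;
        for (String segment : relative.split("/", -1)) {
            segments.add(segment);
        }
        return segments;
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://user@example.com:8080/a%2Fb/c?x=1#top");

        // Accessors exist for the components named in the JavaDocs table.
        System.out.println("scheme    = " + uri.getScheme());
        System.out.println("authority = " + uri.getAuthority());
        System.out.println("user-info = " + uri.getUserInfo());
        System.out.println("host      = " + uri.getHost());
        System.out.println("port      = " + uri.getPort());
        System.out.println("path      = " + uri.getPath());    // decoded form
        System.out.println("raw path  = " + uri.getRawPath()); // encoded form
        System.out.println("query     = " + uri.getQuery());
        System.out.println("fragment  = " + uri.getFragment());

        // Segments only become visible after splitting the raw path.
        System.out.println("segments  = " + rawSegments(uri));
    }
}
```

Note that the split has to work on getRawPath(): getPath() decodes %2F to a slash, after which the encoded slash inside the first segment can no longer be told apart from a segment separator.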

> >> Exactly. :) To me, path and path segment are different targets for
> >> the reasons explained above.
> >
> > I understand how you interpret it, but the WORDS OF THE RFC do NOT
> say that segment is a standalone entity but it is PART OF PATH (what
> actually do you not understand with that?). So it is NOT a different
> target, so we have to change the JavaDocs. Our docs are not bound to
> your interpretation but have to match the clear words of the RFC.
>
> I disagree. Words of spec do not say that path segment is a separate
> URI component. But the spec clearly recognizes path segment as a
> separate entity (even as part of the URI grammar):
>
> <quote>
> A path consists of a sequence of path segments separated by a slash
> ("/") character.
> </quote>

Are you joking? A segment is an entity but not a component. RFC 2396 clearly names the set of components as:

<first>/<second>;<third>?<fourth>

alternatively

<scheme>://<authority><path>?<query>

and its table of contents lists exactly these *components* (nothing named "segment" is among them):

3.1. Scheme Component

3.2. Authority Component

3.3. Path Component -- This section uses the term "segment" only to explain how a path can be split into logical pieces, but I doubt that everybody will see at first glance that this justifies different encoding behaviour for *parameter values* (in contrast to *parameter definitions*).

3.4. Query Component
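The component split can be sketched with the regular expression that RFC 2396 gives in its Appendix B for parsing a URI reference. The sketch below only shows that the generic syntax yields scheme, authority, path, query and fragment, and that any segment exists solely inside the path string. The class name Rfc2396SplitSketch, the method split() and the sample URI are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Rfc2396SplitSketch {

    // Regular expression from RFC 2396, Appendix B.
    // Group 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    private static final Pattern URI_REFERENCE = Pattern.compile(
            "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    // Returns { scheme, authority, path, query, fragment }; absent parts are null.
    static String[] split(String uriReference) {
        Matcher m = URI_REFERENCE.matcher(uriReference);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a URI reference: " + uriReference);
        }
        return new String[] { m.group(2), m.group(4), m.group(5), m.group(7), m.group(9) };
    }

    public static void main(String[] args) {
        String[] parts = split("http://example.com/a/b;v=1?x=1#top");
        System.out.println(Arrays.toString(parts));
        // There is no capture group for a segment: segments only appear
        // when the path string itself is split at "/".
        System.out.println(Arrays.toString(parts[2].substring(1).split("/", -1)));
    }
}
```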

> I do understand your reasoning now, but to me, we need to primarily
> consider what we support in our API. In our API we clearly recognize
> segment as a separate entity. UriBuilder.path("{template}") and
> UriBuilder.segment("{template}") simply mean different things IMO, just
> as UriBuilder.path("a/b") and UriBuilder.segment("a/b") do.

Right. I am only saying that the JavaDocs must explain, explicitly and unambiguously and in one single place, how a *parameter value* will be encoded. I ask for nothing else. Replace the rather abstract phrase and write down the exact resolution for each of those *parameter definition methods* (as a segment is *not* a URI component, remember?).

> >> And the target of both, path and segment, is the URI's path -- so
> >> there is no difference (while there is one for the query part,
> obviously)!
> >>
> >> I disagree. Look at the URI spec and check the definition of path
> and
> >> path segment.
> >
> > Sorry but, do YOU please look at the definition of which components a
> URI breaks up. It is clearly told that there is NOTHING like a
> "segment" component, but ONLY a path, which splits up into segments.
> But a segment IS A PATH, not a standalone normative target.
>
> Hmm, segment is most certainly NOT a path. Otherwise using this twisted
> logic I would have to conclude that if segment is a PATH then segment
> is a standalone URI component, which is an obvious fallacy.

There is nothing twisted in this logic; it is the exact definition in RFC 2396. A standalone segment IS a path. See how this undermines your argument that encoding is bound to components? It is bound solely to the *parameter definition method*.

> Similarly, segment is clearly defined in URI grammar. What makes you
> think that it is not a valid target of a URI-related API? Is query
> parameter not a valid part of the API too? It is certainly not part of
> the SPEC, worse there is no grammar in the spec that would define a
> query parameter...

Almost anybody would expect "target" to mean "component", and the component a segment belongs to is PATH. It is a valid target in the technical sense, but the term is ambiguous to the reader. Yes, there is no such thing as a query parameter, and for good reason, so a query parameter cannot be a valid target. The query syntax may be completely arbitrary. So a valid target can be *any* part of the query, such as a replacement right in the middle of the string. The assumption that a query has parameters is invalid.

> I'd say that the argumentation based on supporting purely standalone
> URI components as part of a URI-based API does not hold. Both - query
> parameter as well as path segment - are useful and valid URI targets
> for the purpose of UriBuilder API and as such we need to treat them as
> first-class citizens, including support for proper independent template
> value encoding.

I never said it is not useful. I said you must clearly explain the encoding by giving examples for all of the parameter definition methods. Nothing else. You are mixing things up. I never asked to reduce the functionality. I only demand unambiguous JavaDocs. What is so hard to understand about that?

>
> Also, please consider the following examples:
>
> UriBuilder.from("http://example.com/").userInfo("a/b").path("a/b").segment("a/b").build();
> UriBuilder.from("http://example.com/").userInfo("{t1}").path("{t2}").segment("{t3}").build("a/b", "a/b", "a/b");
> UriBuilder.from("http://example.com/").userInfo("{t1}").path("{t1}").segment("{t1}").build("a/b");
>
> IMHO they all mean the same thing and should produce the same URI:
>
> http://a%2Fb@example.com/a/b/a%2Fb

See, this is where our views differ: our team would expect DIFFERENT results, and for good reason. There is a difference between parameter definition and parameter provision, as you are not building a pure string API. path("a/b").build("") and path("{t1}").build("a/b") shall behave differently. The first shall render as "a/b" (the slash is in the parameter definition), the second as "a%2Fb" (the slash is in the parameter value). This is analogous to the JDBC API, which distinguishes quotes in the parameter definition from quotes in the parameter value: quotes in the parameter definition are literals, whereas quotes in the provided value get escaped. This makes a lot of sense, is totally logical, and should be applied to JAX-RS instead of your self-invented idea of additional URI parts. BTW, we do not actually need segment(): we can just use path().path().path() to build the same thing.
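The definition-versus-value rule described here can be sketched in plain Java, without JAX-RS. This is not the UriBuilder implementation and not the behaviour of any existing JAX-RS release; it only illustrates the proposal. The names TemplateEncodingSketch, resolve() and encodeValue() are hypothetical, and as a simplifying assumption every character outside the RFC "unreserved" set is percent-encoded in a value.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemplateEncodingSketch {

    private static final Pattern PARAM = Pattern.compile("\\{(\\w+)\\}");
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Assumption: a value keeps only unreserved characters; everything else,
    // including "/", is percent-encoded byte by byte (UTF-8).
    static String encodeValue(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (UNRESERVED.indexOf(c) >= 0) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Literal text of the definition is copied untouched; only the supplied
    // values are encoded. A repeated name reuses the value bound to it first.
    static String resolve(String definition, String... values) {
        Matcher m = PARAM.matcher(definition);
        Map<String, String> bound = new HashMap<>();
        StringBuilder out = new StringBuilder();
        int nextValue = 0;
        int last = 0;
        while (m.find()) {
            String name = m.group(1);
            String value = bound.get(name);
            if (value == null) {
                if (nextValue >= values.length) {
                    throw new IllegalArgumentException("No value for {" + name + "}");
                }
                value = values[nextValue++];
                bound.put(name, value);
            }
            out.append(definition, last, m.start());
            out.append(encodeValue(value));
            last = m.end();
        }
        out.append(definition, last, definition.length());
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(resolve("a/b"));         // slash in the definition
        System.out.println(resolve("{t1}", "a/b")); // slash in the value
    }
}
```

Under these assumptions the slash written in the definition stays a literal, while the slash supplied as a value is escaped, which mirrors how a JDBC PreparedStatement treats quotes in the SQL text differently from quotes in a bound parameter.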

Got my idea?

Regards
Markus