Re: JAX-RS: UriBuilder handling @Path values and placeholder regexes

From: Marc Hadley <Marc.Hadley_at_Sun.COM>
Date: Fri, 01 Aug 2008 08:28:50 -0400

On Aug 1, 2008, at 3:09 AM, Manger, James H wrote:
> > UriBuilder.fromPath("a").segment("b/{foo:.+}").build("c/d")
> >
> > I think the builder should do context-sensitive encoding of the
> > value before doing anything with the regex so you'd get:
> >
> > a/b%2fc%2fd
>
> > UriBuilder.fromPath("a").path("b/{foo:.+}").build("c/d")
> >
> > would yield
> >
> > a/b/c/d
>
> I agree with the above behaviour.
>
> > I don't see why path(@Path("{foo}")) should be treated differently
> > to path("{foo}").
>
> I assume ub.path(“{foo}”).build(“c/d”) succeeds, yielding “c/d”. [I
> hope I don’t need to write ub.path(“{foo:.+}”).]
>
It currently yields c/d, since we don't validate values against the
regex (the default regex, used when none is specified, disallows '/').
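
To make that concrete, here is a minimal sketch in plain Java (java.util.regex only, not the JAX-RS API) of what validating a value against the default regex [^/]+ would amount to. The class name and the matchesDefault helper are made up for illustration.

```java
import java.util.regex.Pattern;

public class DefaultRegexCheck {
    // Default regex for a placeholder with no explicit regex, as quoted in this thread.
    static final Pattern DEFAULT = Pattern.compile("[^/]+");

    // Hypothetical helper: true only if the whole value matches the default regex.
    static boolean matchesDefault(String value) {
        return DEFAULT.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesDefault("c/d"));   // false: contains '/'
        System.out.println(matchesDefault("c%2Fd")); // true: the '/' is percent-encoded
    }
}
```

Since no such check is made today, "c/d" passes straight through.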

>
> If ub.path(@Path(“{foo}”)).build(“c/d”) also yields “c/d” it would
> then have to throw an exception as “c/d” does not match the default
> regex of [^/]+ for {foo} in a @Path.

The default regex is the same regardless of where the template comes
from. It would be inconsistent if the latter yielded an exception when
the former worked.

> It would be better if it yields “c%2Fd” which does match the default
> regex so the call succeeds.
>
That would be the case if you used segment rather than path.
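
A rough sketch of the segment vs path distinction, in plain Java rather than the real UriBuilder. The helpers appendSegment and appendPath are hypothetical, and only '/' is handled (a real builder would also deal with other reserved characters). The output uses uppercase %2F where the examples in this thread write %2f.

```java
public class SegmentVsPathSketch {
    // Percent-encode '/' only; everything else is left untouched in this sketch.
    static String encodeSlash(String s) {
        return s.replace("/", "%2F");
    }

    // Mimics segment(): the whole argument is a single segment, so every '/' is encoded.
    static String appendSegment(String base, String literal, String value) {
        return base + "/" + encodeSlash(literal + value);
    }

    // Mimics path(): '/' is kept as a segment separator.
    static String appendPath(String base, String literal, String value) {
        return base + "/" + literal + value;
    }

    public static void main(String[] args) {
        System.out.println(appendSegment("a", "b/", "c/d")); // a/b%2Fc%2Fd
        System.out.println(appendPath("a", "b/", "c/d"));    // a/b/c/d
    }
}
```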

>
> Consequently path(@Path) should be treated differently to
> path(String) to avoid unnecessary exceptions. That is, so callers do
> NOT have to do manual %-escaping of ‘/’s to avoid exceptions when
> UriBuilder uses path(@Path).
>
But as we've discussed, that would require the runtime either to parse
the regex and formulate escaping rules to match, or to understand only
a fixed set of regexes, which I think would only lead to frustration
down the line.
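
As an illustration of the "fixed set of regexes" option, the sketch below follows the three rules proposed earlier in the thread (quoted further down in this message). It is plain Java, not JAX-RS. The names Mode, modeFor and encode are invented here, and because the proposal does not say what else a segment or path placeholder should escape, this sketch escapes nothing beyond '/' in those two modes.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class PlaceholderRuleSketch {
    enum Mode { SEGMENT, PATH, STRICT }

    // regex == null stands for a placeholder written without a regex.
    static Mode modeFor(String regex) {
        if (regex == null) return Mode.SEGMENT;                         // rule 1
        if (regex.equals(".*") || regex.equals(".+")) return Mode.PATH; // rule 2
        return Mode.STRICT;                                             // rule 3
    }

    // RFC 3986 unreserved characters: letters, digits, '-', '.', '_', '~'.
    static boolean isUnreserved(int c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9') || c == '-' || c == '.' || c == '_' || c == '~';
    }

    static String encode(String value, Mode mode) {
        switch (mode) {
            case PATH:
                return value;
            case SEGMENT:
                return value.replace("/", "%2F");
            default:
                // STRICT: percent-encode every byte that is not unreserved.
                StringBuilder out = new StringBuilder();
                for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
                    int v = b & 0xFF;
                    if (isUnreserved(v)) out.append((char) v);
                    else out.append(String.format("%%%02X", v));
                }
                return out.toString();
        }
    }

    public static void main(String[] args) {
        String regex = "[a-z/]+"; // a regex outside the fixed set that allows '/'
        String encoded = encode("c/d", modeFor(regex));
        System.out.println(encoded);                         // c%2Fd
        System.out.println(Pattern.matches(regex, "c/d"));   // true
        System.out.println(Pattern.matches(regex, encoded)); // false
    }
}
```

The last two lines show the kind of surprise this invites: the raw value matches the regex, but the value escaped under rule 3 does not, so validating after escaping would reject it.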

> One issue is that the default regex [^/]+ is appropriate for @Path
> where it is defined, but it is not appropriate for a template passed
> to:
> * path(String) – which should allow ‘/’s;
> * queryParam(name,…) – where ‘=’ and ‘&’ are special, but ‘/’ is no
> different than other reserved chars;
> * queryParam(…,value) – where ‘&’ is special, but ‘/’ is no
> different than other reserved chars;
> * …
>
True, but the alternative of a context-specific default is getting
rather complicated. I think the status quo of no validation of
values is better.
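
For a sense of what a context-specific default would involve, here is a hypothetical sketch in plain Java. The Context enum, its constant names and the escape helper are assumptions made for illustration. The special characters per context are the ones listed in the quoted bullets above, and nothing else is escaped.

```java
public class ContextDefaultsSketch {
    // Each context lists the characters that would be percent-encoded by default.
    enum Context {
        PATH_TEMPLATE("/"), // template from @Path: '/' is special
        PATH_STRING(""),    // path(String): '/' is allowed
        QUERY_NAME("=&"),   // queryParam name: '=' and '&' are special
        QUERY_VALUE("&");   // queryParam value: '&' is special

        final String special;

        Context(String special) {
            this.special = special;
        }
    }

    // Percent-encode only the characters that are special in the given context.
    static String escape(String value, Context ctx) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (ctx.special.indexOf(c) >= 0) out.append(String.format("%%%02X", (int) c));
            else out.append(c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String v = "a=b&c/d";
        System.out.println(escape(v, Context.PATH_TEMPLATE)); // a=b&c%2Fd
        System.out.println(escape(v, Context.PATH_STRING));   // a=b&c/d
        System.out.println(escape(v, Context.QUERY_NAME));    // a%3Db%26c/d
        System.out.println(escape(v, Context.QUERY_VALUE));   // a=b%26c/d
    }
}
```

One value, four different results: every builder method that accepts a template would need its own entry in such a table.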

> > I'd also prefer to validate against the regex rather than
> > requiring the implementation to parse the regex.
>
> It is not 1) validate or 2) parse.
> It is validate, and either 1) parse or 2) throw unnecessary
> exceptions.
> Or rephrasing: it is validate, and either 1) parse or 2) require
> callers to manually %-escape values based on a tight coupling to the
> template (to know what to %-escape).
>
> > …I'm actually OK with the status quo where the onus falls on the
> > developer to supply values consistent with the path vs path segment
> > distinction.
>
> I feel that almost defeats the purpose of a URI template. If the
> party supplying the variable values has to know how they are going
> to be put into the URI so it can %-escape them accordingly, then it
> may as well construct the URI by passing the values directly to
> host/path/queryParam/fragment methods, instead of passing them to
> build(…).

I disagree. The only issue here is whether to escape '/' in path
components. It doesn't seem too onerous to me to expect a developer
to know whether a particular parameter is a full path or just a segment.

> The benefit of a URI Template is that a template author can design
> the URIs how they want (and change that choice), independently of
> the template user who supplies variable values. The author and user
> have to agree on the names and semantics of the variables – but that
> should be all.
>
> > it's too onerous to require implementations to be able to parse an
> > arbitrary regex and formulate an encoding scheme to match.
>
> I agree.
>
> Fundamentally, UriBuilder needs to know which of the 18 reserved
> chars to %-escape for each placeholder. That is easy to determine if
> the only allowed regexes are [^xyz…]* or [^xyz…]+, where x, y, z…
> are any of the 18 reserved chars.
>
> Perhaps, instead of {name[:regex]}, the placeholder syntax should be
> {name[|reserved]} where the reserved chars that are allowed are
> explicitly listed.
>
That would defeat the whole purpose of allowing a regex, which was to
allow more sophisticated URI matching. Escaping/validation is a side
issue.

Marc.

>
> _____________________________________________
> From: Manger, James H
> Sent: Monday, 28 July 2008 9:21 AM
> To: 'users_at_jsr311.dev.java.net'
> Subject: JAX-RS: UriBuilder handling @Path values and placeholder
> regexes
>
> …
> “For each placeholder in a @Path value:
> 1. If there is no regex, treat it as a segment placeholder
> (%-escape ‘/’s);
> 2. If the regex is “.*” or “.+”, treat it as a path
> placeholder (don’t escape ‘/’s);
> 3. For any other regex, %-escape all non-unreserved characters.
> A JAX-RS implementation MAY relax the 3rd rule above by not
> %-escaping characters that it knows are allowed by the regex. An
> application should not rely on a JAX-RS implementation to recognize
> such situations.
>
> A UriBuilderException shall be thrown if a placeholder value (after
> %-escaping) does not match the regex given in a @Path placeholder.”
>
> _____________________________________________
> From: Marc.Hadley_at_Sun.COM [mailto:Marc.Hadley_at_Sun.COM]
> Sent: Friday, 1 August 2008 1:24 AM
> To: users_at_jsr311.dev.java.net
>
> >> UriBuilder.fromResource( @Path(“widget/{id}/info”) ).build(“X/123”)
> >> 1. “widget/X/123/info”; or
> >> 2. “widget/X%2F123/info”
>
> Currently 1.
>
>
> I don't see why path(@Path("{foo}")) should be treated differently
> to path("{foo}"). In the ideal case either method should escape the
> value of foo according to its regex (either explicit or implicit)
> not based on where the template comes from.
>
>
> I'd rather rename fromResource to fromPath so that the method name
> reflects the URI component being affected, but I think the current
> naming is also fine.
>
>
> I'd also prefer to validate against the regex rather than requiring
> the implementation to parse the regex. Encoding based on specific
> regex values is bound to cause issues when a developer starts to use
> more specific regex values and I think it's too onerous to require
> implementations to be able to parse an arbitrary regex and formulate
> an encoding scheme to match.
>
> That said, I'm actually OK with the status quo where the onus falls
> on the developer to supply values consistent with the path vs path
> segment distinction.
> _____________________________________________
> From: Marc.Hadley_at_Sun.COM [mailto:Marc.Hadley_at_Sun.COM]
> Sent: Friday, 1 August 2008 6:27 AM
> To: users_at_jsr311.dev.java.net
>
> A related issue is what to do with:
>
> UriBuilder.fromPath("a").segment("b/{foo:.+}").build("c/d")
>
> I think the builder should do context-sensitive encoding of the
> value before doing anything with the regex so you'd get:
>
> a/b%2fc%2fd
>
> rather than
>
> a/b%2fc/d
>
> whereas
>
> UriBuilder.fromPath("a").path("b/{foo:.+}").build("c/d")
>
> would yield
>
> a/b/c/d
>
> Marc.
>

---
Marc Hadley <marc.hadley at sun.com>
CTO Office, Sun Microsystems.