users@jsr311.java.net

Re: JAX-RS: @Path limited=false templates: (?!/).+(?<!/)

From: Marc Hadley <Marc.Hadley_at_Sun.COM>
Date: Thu, 03 Jul 2008 08:52:57 -0400

On Jul 3, 2008, at 1:44 AM, Manger, James H wrote:

> I like Bill Burke’s solution.
>
> Embedding a regular expression within a {…} placeholder requires a
> little care. Four options spring to mind (from my most preferred to
> least):
>
> 1) Only allow '{' and '}' chars in the regex as part of '{…}' pairs
> that are not nested. The restriction applies even if a char is
> escaped as \{ in the regex.
> {foo:X{2,3}} is allowed (matching XX or XXX)
> {foo:X[^}]Y} is not allowed
>
> 2) Delimit the regex with ` or space (instead of : and }). These
> chars have no special meaning in a regex, and are not allowed in URI
> paths (and don't need escaping in a Java String). Consequently they
> shouldn't be needed in the regex so they can be forbidden.
> {foo`.*`} or {foo".*"} or {foo .* }
>
> 3) Disallow any '}' chars in the regex.
>
I like 3. '}' is not allowed in a URI path unless it's escaped, so it
doesn't seem like a big restriction to me.
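
A minimal sketch of what option 3 would mean for placeholder parsing, assuming the regex is simply everything between the first ':' and the next '}'. The class and method names (Option3Placeholder, parse) are hypothetical and only for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Option3Placeholder {
    // Option 3: no '}' may appear anywhere inside the embedded regex.
    static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([^:}]*)(?::([^}]*))?\\}");

    // Returns {name, regex} (regex is null when absent),
    // or null if the input is not a single valid placeholder.
    static String[] parse(String placeholder) {
        Matcher m = PLACEHOLDER.matcher(placeholder);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

Under this sketch, {foo:[0-9]+} parses to the name foo and the regex [0-9]+, while {foo:X{2,3}} is rejected, so bounded quantifiers would not be expressible inside a placeholder.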

Marc.

> 4) Invent an escaping mechanism.
>
>
>
> Splitting a @Path value into literals and placeholders is a bit more
> complex with option 1, but still practical. The following pattern
> should match a placeholder, capturing the name and regex in groups 1
> & 2 respectively:
>
> \{                     # start of placeholder
>   ( [^:}]* )           # name (1st capturing group)
>   (?:                  # start of optional :regex
>     :                  # colon separating name and regex
>     (                  # start capturing regex
>       (?:
>         [^{}]* |       # any chars other than braces; or
>         \{[^{}]*\}     # pair of braces
>       )*
>     )                  # end of regex
>   )?                   # end of optional :regex
> \}                     # end of placeholder
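
The pattern above can be tried out in Java roughly as follows. This is a sketch, not a reference implementation: the class and method names (Option1Placeholder, parse) are hypothetical, and the inner alternative uses [^{}]+ rather than [^{}]* so that the repeated group never matches the empty string; that is a small deviation from the pattern as posted.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Option1Placeholder {
    // Option 1: braces inside the regex must come in non-nested pairs.
    static final Pattern PLACEHOLDER = Pattern.compile(
        "\\{"                                   // start of placeholder
        + "([^:}]*)"                            // name (group 1)
        + "(?::("                               // optional ":regex" (group 2)
        + "(?:[^{}]+|\\{[^{}]*\\})*"            // non-brace chars or one brace pair
        + "))?"
        + "\\}");                               // end of placeholder

    // Returns {name, regex} (regex is null when absent),
    // or null if the input is not a single valid placeholder.
    static String[] parse(String placeholder) {
        Matcher m = PLACEHOLDER.matcher(placeholder);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }
}
```

With this sketch, {foo:X{2,3}} yields the name foo and the regex X{2,3}, and {foo:X[^}]Y} is rejected, which agrees with the two examples given for option 1.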
>
>
>
>
> 3.7.3 becomes:
> 1. Split the @Path value into literal strings and placeholders;
> 2. Build a regular expression by processing each item in turn:
>
> 2.1. For a literal string:
> 2.1a. If A.encode is true (or not defined),
> %-encode chars not in <reserved> or <unreserved>;
> 2.1b. Escape any regex special characters, then append the result;
>
> 2.2. For a placeholder with just a name:
> Append “([^/]+)” [I don't think the reluctant qualifier is
> necessary]
>
> 2.3. For a placeholder with a regex:
> Append “(”, the regex, and “)”;
>
> 3. At the end:
> 3.1. If the final char is a “/”,
> Append “(.*)”
> 3.2. Otherwise (the final char is not a “/”),
> Append “(?:/(.*))?”
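
The steps above might be sketched in Java as follows. Several things here are assumptions: the names PathTemplateRegex and toRegex are hypothetical, step 2.1a (%-encoding) is left out entirely, literals are escaped with Pattern.quote, and "final char" in step 3 is read as the final character of the @Path value itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PathTemplateRegex {
    // Placeholder pattern adapted from the one given earlier (option 1).
    private static final Pattern PLACEHOLDER = Pattern.compile(
        "\\{([^:}]*)(?::((?:[^{}]+|\\{[^{}]*\\})*))?\\}");

    static String toRegex(String path) {
        StringBuilder out = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(path);
        int last = 0;
        while (m.find()) {
            // Step 2.1: literal text before this placeholder.
            appendLiteral(out, path.substring(last, m.start()));
            String regex = m.group(2);
            if (regex == null) {
                out.append("([^/]+)");                     // step 2.2: name only
            } else {
                out.append('(').append(regex).append(')'); // step 2.3: explicit regex
            }
            last = m.end();
        }
        appendLiteral(out, path.substring(last));
        // Step 3: capture whatever follows the matched part of the path.
        out.append(path.endsWith("/") ? "(.*)" : "(?:/(.*))?");
        return out.toString();
    }

    // Step 2.1b only: escape regex metacharacters (no %-encoding here).
    private static void appendLiteral(StringBuilder out, String literal) {
        if (!literal.isEmpty()) {
            out.append(Pattern.quote(literal));
        }
    }
}
```

For example, toRegex("widgets/{id}") gives a pattern that matches "widgets/42/parts" with "42" in group 1 and "parts" in group 2.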
>
>
> @Path("{foo:.*}") produces “(.*)(?:/(.*))?”. This acts a little like
> “.*.*”, which does not look like good regex practice. In practice,
> however, I don't think it is ambiguous, dangerous, or potentially
> horrible for matching performance (the first .* just gets everything).
>
> James Manger
>
>
> P.S. I cannot think of any sensible @Path value with encode=false
> that couldn't be rewritten fairly easily as an encode=true value. At
> worst you have to change %xx%yy to \uzzzz. Consequently, I suggest
> dropping the @Path encode parameter.
>
> _____________________________________________
> From: Bill Burke [mailto:bburke_at_redhat.com]
> Sent: Thursday, 3 July 2008 1:58 AM
> To: users_at_jsr311.dev.java.net
>
> I understand James's use case, but I don't like his solution. I
> never liked the 'limited' annotation attribute and thought we should
> expand on @Path expressions instead. I propose supporting regular
> expressions. Here's my idea:
>
> "{}" denotes a PathParam, expression, or both:
>
> "{" [ path_param ] ":" expression "}" |
> "{" path_param "}"
>

---
Marc Hadley <marc.hadley at sun.com>
CTO Office, Sun Microsystems.