users@jsr311.java.net

RE: JAX-RS: @Path limited=false templates: (?!/).+(?<!/)

From: Manger, James H <James.H.Manger_at_team.telstra.com>
Date: Thu, 3 Jul 2008 15:44:27 +1000

I like Bill Burke’s solution.

Embedding a regular expression within a {…} placeholder requires a little care. Four options spring to mind (from my most preferred to least):

1) Only allow '{' and '}' chars in the regex as part of '{…}' pairs that are not nested. The restriction applies even if a char is escaped as \{ in the regex.
  {foo:X{2,3}} is allowed (matching XX or XXX)
  {foo:X[^}]Y} is not allowed

2) Delimit the regex with ` or space (instead of : and }). These chars have no special meaning in a regex, and are not allowed in URI paths (and don't need escaping in a Java String). Consequently they shouldn't be needed in the regex so they can be forbidden.
  {foo`.*`} or {foo".*"} or {foo .* }

3) Disallow any '}' chars in the regex.

4) Invent an escaping mechanism.



Splitting a @Path value into literals and placeholders is a bit more complex with option 1, but still practical. The following pattern (written in comments mode, so whitespace and # comments are ignored) should match a placeholder, capturing the name and regex in groups 1 and 2 respectively:

\{ # start of placeholder
  ( [^:}]* ) # name (1st capturing group)
  (?: # start of optional :regex
    : # colon separating name and regex
    ( # start capturing regex
      (?:
        [^{}]* | # any chars other than braces; or
        \{[^{}]*\} # pair of braces
      )*
    ) # end of regex
  )? # end of optional :regex
\} # end of placeholder
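
A minimal Java sketch of how the pattern above could be used to split a @Path value, assuming it is compiled with Pattern.COMMENTS. The class PathTemplateSplitter, its Item type, and the rule that a leftover brace in a literal means an invalid template are all illustrative assumptions, not part of any proposal:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathTemplateSplitter {

    // The pattern from the message; COMMENTS mode ignores whitespace and # comments.
    static final Pattern PLACEHOLDER = Pattern.compile(String.join("\n",
        "\\{                  # start of placeholder",
        "  ( [^:}]* )         # name (1st capturing group)",
        "  (?:                # start of optional :regex",
        "    :                # colon separating name and regex",
        "    (                # start capturing regex",
        "      (?:",
        "        [^{}]* |     # any chars other than braces; or",
        "        \\{[^{}]*\\} # pair of braces",
        "      )*",
        "    )                # end of regex",
        "  )?                 # end of optional :regex",
        "\\}                  # end of placeholder"),
        Pattern.COMMENTS);

    /** A literal (name and regex are null) or a placeholder (literal is null). */
    static final class Item {
        final String literal, name, regex;
        Item(String literal, String name, String regex) {
            this.literal = literal;
            this.name = name;
            this.regex = regex;
        }
    }

    /** Splits a template into literals and placeholders, in order of appearance. */
    static List<Item> split(String template) {
        List<Item> items = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(template);
        int pos = 0;
        while (m.find()) {
            if (m.start() > pos) {
                items.add(literal(template.substring(pos, m.start())));
            }
            // group(2) is null when the placeholder has just a name
            items.add(new Item(null, m.group(1), m.group(2)));
            pos = m.end();
        }
        if (pos < template.length()) {
            items.add(literal(template.substring(pos)));
        }
        return items;
    }

    // Assumption: a brace left over in a literal means the template broke the option 1 rule.
    static Item literal(String text) {
        if (text.indexOf('{') >= 0 || text.indexOf('}') >= 0) {
            throw new IllegalArgumentException("unbalanced brace near: " + text);
        }
        return new Item(text, null, null);
    }

    public static void main(String[] args) {
        for (Item item : split("widgets/{id}/{rest:X{2,3}}")) {
            System.out.println(item.literal != null
                ? "literal " + item.literal
                : "placeholder " + item.name + " regex " + item.regex);
        }
    }
}
```

With this sketch, {foo:X{2,3}} splits into a single placeholder with regex X{2,3}, while {foo:X[^}]Y} is rejected because the pattern stops at the first '}' and leaves "]Y}" behind as a literal.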




3.7.3 becomes:
1. Split the @Path value into literal strings and placeholders;
2. Build a regular expression by processing each item in turn:

2.1. For a literal string:
2.1a. If A.encode is true (or not defined),
       %-encode chars not in <reserved> or <unreserved>;
2.1b. Escape any regex special characters, then append the result;

2.2. For a placeholder with just a name:
     Append “([^/]+)” [I don't think the reluctant quantifier is necessary]

2.3. For a placeholder with a regex:
     Append “(”, the regex, and “)”;

3. At the end:
3.1. If the final char is a “/”,
     Append “(.*)”
3.2. Otherwise (the final char is not a “/”),
     Append “(?:/(.*))?”
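
The steps above can be sketched in Java roughly as follows. This is only an illustration under several assumptions: PathRegexBuilder is a hypothetical name, <reserved> and <unreserved> are taken to be the RFC 3986 sets, Pattern.quote stands in for "escape any regex special characters", and "the final char" in step 3 is read as the final char of the @Path value:

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PathRegexBuilder {

    // Single-line form of the placeholder pattern: group 1 = name, group 2 = regex.
    static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([^:}]*)(?::((?:[^{}]*|\\{[^{}]*\\})*))?\\}");

    // Assumption: RFC 3986 unreserved and reserved chars, which step 2.1a leaves alone.
    static final String KEEP =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
        + ":/?#[]@!$&'()*+,;=";

    // Step 2.1a: %-encode every UTF-8 byte that is not a kept ASCII char.
    static String percentEncode(String literal) {
        StringBuilder out = new StringBuilder();
        for (byte b : literal.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (b >= 0 && KEEP.indexOf(c) >= 0) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return out.toString();
    }

    // Steps 2.1a and 2.1b for one literal string.
    static void appendLiteral(StringBuilder regex, String literal, boolean encode) {
        if (literal.isEmpty()) {
            return;
        }
        String text = encode ? percentEncode(literal) : literal;
        regex.append(Pattern.quote(text));
    }

    static String build(String template, boolean encode) {
        StringBuilder regex = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(template);
        int pos = 0;
        while (m.find()) {
            appendLiteral(regex, template.substring(pos, m.start()), encode);
            if (m.group(2) == null) {
                regex.append("([^/]+)");                           // step 2.2
            } else {
                regex.append('(').append(m.group(2)).append(')');  // step 2.3
            }
            pos = m.end();
        }
        appendLiteral(regex, template.substring(pos), encode);
        // Step 3: capture whatever follows the template.
        regex.append(template.endsWith("/") ? "(.*)" : "(?:/(.*))?");
        return regex.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("{foo:.*}", true));
        System.out.println(build("widgets/{id}", true));
    }
}
```

Taken literally, step 2.1a also encodes an existing '%' in a literal, since '%' is in neither set; the sketch follows that literal reading.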


@Path("{foo:.*}") produces “(.*)(?:/(.*))?”. This acts a little like “.*.*”, which does not look like good regex practice. In practice, however, I don't think it is ambiguous, dangerous, or potentially horrible for matching performance (the first .* just gets everything).
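
A quick way to check the claim that the first .* gets everything, sketched with java.util.regex (the class name GreedyTailDemo and the sample path are made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyTailDemo {

    // Returns groups 1 and 2 of "(.*)(?:/(.*))?" matched against the whole path.
    static String[] groups(String path) {
        Matcher m = Pattern.compile("(.*)(?:/(.*))?").matcher(path);
        if (!m.matches()) {
            throw new IllegalStateException("no match: " + path);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("a/b/c");
        // The greedy first group takes the whole path, so the optional
        // trailing group does not take part in the match.
        System.out.println("group 1 = " + g[0] + ", group 2 = " + g[1]);
    }
}
```

For "a/b/c" the first group holds the whole path and the second group is null, so the engine never has to divide the input between the two .* parts.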

James Manger


P.S. I cannot think of any sensible @Path value with encode=false that couldn't be rewritten fairly easily as an encode=true value. At worst you have to change %xx%yy to \uzzzz. Consequently, I suggest dropping the @Path encode parameter.

_____________________________________________
From: Bill Burke [mailto:bburke_at_redhat.com]
Sent: Thursday, 3 July 2008 1:58 AM
To: users_at_jsr311.dev.java.net

I understand James's use case, but I don't like his solution. I never liked the 'limited' annotation attribute and thought we should expand on @Path expressions instead. I propose supporting regular expressions. Here's my idea:

"{}" denotes a PathParam, expression, or both:

"{" [ path_param ] ":" expression "}" |
"{" path_param "}"