I like Bill Burke’s solution.
Embedding a regular expression within a {…} placeholder requires a little care. Four options spring to mind (from my most preferred to least):
1) Only allow '{' and '}' chars in the regex as part of '{…}' pairs that are not nested. The restriction applies even if a char is escaped as \{ in the regex.
{foo:X{2,3}} is allowed (matching XX or XXX)
{foo:X[^}]Y} is not allowed
2) Delimit the regex with ` or space (instead of : and }). These chars have no special meaning in a regex, and are not allowed in URI paths (and don't need escaping in a Java String). Consequently they shouldn't be needed in the regex so they can be forbidden.
{foo`.*`} or {foo".*"} or {foo .* }
3) Disallow any '}' chars in the regex.
4) Invent an escaping mechanism.
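Splitting a @Path value into literals and placeholders is a bit more complex with option 1, but still practical. The following pattern should match a placeholder, capturing the name and regex in groups 1 & 2 respectively. It is laid out with whitespace and # comments, so it has to be compiled in comments mode (Pattern.COMMENTS in Java):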
\{                    # start of placeholder
( [^:}]* )            # name (1st capturing group)
(?:                   # start of optional :regex
  :                   # colon separating name and regex
  (                   # start capturing regex
    (?:
      [^{}]* |        # any chars other than braces; or
      \{[^{}]*\}      # pair of braces
    )*
  )                   # end of regex
)?                    # end of optional :regex
\}                    # end of placeholder
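As a quick check of the pattern above, here is a minimal Java sketch. The class and method names (PlaceholderPattern, parse) are made up for illustration only. It assumes the pattern is compiled with Pattern.COMMENTS and applied to a single placeholder with matches(), not scanned across a whole @Path value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderPattern {
    // The pattern from this message. COMMENTS mode makes the regex engine
    // skip the layout whitespace and the trailing # comments.
    static final Pattern PLACEHOLDER = Pattern.compile(
          "\\{                  # start of placeholder\n"
        + "( [^:}]* )           # name (1st capturing group)\n"
        + "(?:                  # start of optional :regex\n"
        + "  :                  # colon separating name and regex\n"
        + "  (                  # start capturing regex\n"
        + "    (?:\n"
        + "      [^{}]* |       # any chars other than braces; or\n"
        + "      \\{[^{}]*\\}   # pair of braces\n"
        + "    )*\n"
        + "  )                  # end of regex\n"
        + ")?                   # end of optional :regex\n"
        + "\\}                  # end of placeholder\n",
        Pattern.COMMENTS);

    // Hypothetical helper: returns {name, regex} for one placeholder.
    // The regex entry is null when the placeholder has only a name;
    // the result is null when the text is not a valid placeholder.
    static String[] parse(String placeholder) {
        Matcher m = PLACEHOLDER.matcher(placeholder);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        for (String p : new String[] { "{foo}", "{foo:X{2,3}}", "{foo:X[^}]Y}" }) {
            String[] parts = parse(p);
            System.out.println(p + " -> "
                + (parts == null ? "rejected" : parts[0] + " / " + parts[1]));
        }
    }
}
```

With the two examples from option 1, {foo:X{2,3}} yields the name foo and the regex X{2,3}, while {foo:X[^}]Y} is rejected. One caveat: the inner [^{}]* sits inside a repeated group, so long input that fails to match could backtrack heavily. Writing [^{}]+ there would be one way to avoid that, though I have not changed the pattern here.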
3.7.3 becomes:
1. Split the @Path value into literal strings and placeholders;
2. Build a regular expression by processing each item in turn:
   2.1. For a literal string:
        2.1a. If A.encode is true (or not defined),
              %-encode chars not in <reserved> or <unreserved>;
        2.1b. Escape any regex special characters, then append the result;
   2.2. For a placeholder with just a name:
        Append “([^/]+)” [I don't think the reluctant quantifier is necessary]
   2.3. For a placeholder with a regex:
        Append “(”, the regex, and “)”;
3. At the end:
   3.1. If the final char is a “/”,
        Append “(.*)”
   3.2. Otherwise (the final char is not a “/”),
        Append “(?:/(.*))?”
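The steps above could be sketched in Java roughly as follows. This is only an illustration under several assumptions of mine, not proposed spec text:
- the class and method names (PathRegexBuilder, build) are hypothetical;
- <reserved> and <unreserved> are taken to be the RFC 3986 sets;
- %-encoding uses UTF-8;
- Pattern.quote stands in for "escape any regex special characters";
- "the final char" is read as the final char of the @Path value.

There is no error handling for malformed placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PathRegexBuilder {
    // Placeholder pattern from earlier in this message, without the layout.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{([^:}]*)(?::((?:[^{}]*|\\{[^{}]*\\})*))?\\}");

    // Assumption: punctuation from the RFC 3986 <unreserved> and <reserved> sets.
    private static final String KEEP = "-._~:/?#[]@!$&'()*+,;=";

    // Steps 1 to 3: split the @Path value and build the matching regex.
    static String build(String path, boolean encode) {
        StringBuilder regex = new StringBuilder();
        Matcher m = PLACEHOLDER.matcher(path);
        int pos = 0;
        while (m.find()) {
            appendLiteral(regex, path.substring(pos, m.start()), encode); // step 2.1
            if (m.group(2) == null) {
                regex.append("([^/]+)");                                  // step 2.2
            } else {
                regex.append('(').append(m.group(2)).append(')');         // step 2.3
            }
            pos = m.end();
        }
        appendLiteral(regex, path.substring(pos), encode);
        regex.append(path.endsWith("/") ? "(.*)" : "(?:/(.*))?");         // step 3
        return regex.toString();
    }

    private static void appendLiteral(StringBuilder regex, String literal, boolean encode) {
        if (literal.isEmpty()) {
            return;
        }
        String text = encode ? percentEncode(literal) : literal;          // step 2.1a
        regex.append(Pattern.quote(text));                                // step 2.1b
    }

    // %-encodes every UTF-8 byte that is not an ASCII letter, digit, or KEEP char.
    private static String percentEncode(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            char c = (char) v;
            boolean keep = v < 128 && (Character.isLetterOrDigit(c) || KEEP.indexOf(c) >= 0);
            if (keep) {
                out.append(c);
            } else {
                out.append(String.format("%%%02X", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("{foo:.*}", true));
        System.out.println(build("widgets/{id}", true));
    }
}
```

Pattern.quote wraps the literal in \Q...\E instead of escaping char by char, so the printed regex looks a little different from a hand-escaped one but should match the same strings.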
@Path("{foo:.*}") produces “(.*)(?:/(.*))?”. This acts a little like “.*.*”, which does not look like good regex practice. In practice, however, I don't think it is ambiguous, dangerous, or potentially horrible for matching performance (the first .* just gets everything).
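To illustrate the point about the first .* taking everything, here is a small Java check. The class name GreedyTailDemo and the sample path "a/b/c" are just for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyTailDemo {
    // The regex produced for @Path("{foo:.*}") by the steps above.
    static final Pattern PATTERN = Pattern.compile("(.*)(?:/(.*))?");

    // Returns {group 1, group 2} for a full match, or null if there is no match.
    static String[] groups(String uriPath) {
        Matcher m = PATTERN.matcher(uriPath);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] g = groups("a/b/c");
        System.out.println("group 1 = " + g[0] + ", group 2 = " + g[1]);
    }
}
```

The greedy first group consumes the whole path, so the optional trailing group does not take part in the match and group 2 comes back as null.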
James Manger
P.S. I cannot think of any sensible @Path value with encode=false that couldn't be rewritten fairly easily as an encode=true value. At worst you have to change %xx%yy to \uzzzz. Consequently, I suggest dropping the @Path encode parameter.
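As an illustration of the %xx%yy to \uzzzz rewrite, assuming the escapes are UTF-8 (my assumption, nothing above fixes the charset): %C3%A9 with encode=false corresponds to \u00e9 with encode=true. The class and method names below are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class EncodeEquivalence {
    // Decodes %xx escapes as UTF-8. URLDecoder also turns '+' into a space
    // (form decoding), so this is only suitable for the illustration here.
    static String decode(String escaped) {
        try {
            return URLDecoder.decode(escaped, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // "caf%C3%A9" (encode=false) and "caf\u00e9" (encode=true) name the same path.
        System.out.println(decode("caf%C3%A9").equals("caf\u00e9"));
    }
}
```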
_____________________________________________
From: Bill Burke [mailto:bburke_at_redhat.com]
Sent: Thursday, 3 July 2008 1:58 AM
To: users_at_jsr311.dev.java.net
I understand James's use case, but I don't like his solution. I never liked the 'limited' annotation attribute and thought we should expand on @Path expressions instead. I propose supporting regular expressions instead. Here's my idea:
"{}" denotes a PathParam, expression, or both:
"{" [ path_param ] ":" expression "}" |
"{" path_param "}"