users@jsr311.java.net

Re: JAX-RS: @Path limited=false templates: (?!/).+(?<!/)

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Fri, 04 Jul 2008 07:57:42 +0200

On Jul 4, 2008, at 3:36 AM, Manger, James H wrote:

> Paul,
> > - URI matchers could be used as URI templates, although an
> unnamed variable causes issues
>
> A variable is never unnamed, but its name can be the empty string
> “”. An empty string does not need to be treated any differently
> from any other string when building a URI.

The rule of template construction means that two or more declarations
of the same template variable name must have the same value. Thus
"undeclared" names need to be assigned unique names, e.g. _1, _2,
etc. I think having such "undeclared" names makes things a little
more complicated.


>
> > - What does it mean if the regex for a name contains one or more
> capturing groups?
>
> Accept them.
> At a minimum a JAX-RS implementation needs to count the groups so
> it can build its name-to-group mapping.
> An additional feature where a @*Param value can have the form
> “name.index” would be nice.
> It should be easy to implement:
> If (name)
> v = matcher.group(map.get(name));
> else if (name.index)
> v = matcher.group(map.get(name) + index);
>

Yes, I came to the same conclusion. It could have the same naming
scheme as undeclared names, which means that an undeclared name with
sub-groups could be named _1._1 and _1._2 :-) But I prefer the
simplicity of not specifying this behaviour even if sub-groups are
allowed.
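
For what it's worth, the name-to-group mapping James describes could look
roughly like the sketch below. It is only an illustration under some
assumptions: the class GroupMapSketch, its constructor arguments and the
value() method are made-up names, not anything in JAX-RS, and path segments
are simply joined with '/'. Each variable's regex is wrapped in its own
group, and the groups inside it are counted by compiling it separately.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a JAX-RS API): maps template variable names to
// capturing-group indexes when a variable's regex has groups of its own.
class GroupMapSketch {
    final Map<String, Integer> groupOf = new LinkedHashMap<>();
    final Pattern pattern;

    // Arguments alternate: name, regex, name, regex, ...
    GroupMapSketch(String... namesAndRegexes) {
        StringBuilder sb = new StringBuilder();
        int next = 1; // group index the next variable's outer group will get
        for (int i = 0; i < namesAndRegexes.length; i += 2) {
            String name = namesAndRegexes[i];
            String regex = namesAndRegexes[i + 1];
            if (i > 0) sb.append('/');
            sb.append('(').append(regex).append(')');
            groupOf.put(name, next);
            // Skip the outer group plus every group nested in this regex.
            next += 1 + Pattern.compile(regex).matcher("").groupCount();
        }
        pattern = Pattern.compile(sb.toString());
    }

    // Accepts "name" or "name.index"; index selects a sub-group of name.
    String value(Matcher m, String param) {
        Integer g = groupOf.get(param);
        if (g != null) return m.group(g);
        int dot = param.lastIndexOf('.');
        Integer base = dot < 0 ? null : groupOf.get(param.substring(0, dot));
        if (base == null) throw new IllegalArgumentException("unknown: " + param);
        return m.group(base + Integer.parseInt(param.substring(dot + 1)));
    }
}
```

With the variables date = (\d{4})-(\d{2}) and id = \d+, "date" maps to
group 1 and "id" to group 4, so "date.1" and "date.2" select groups 2 and 3.
Note that backreferences inside a variable's regex would be shifted by the
wrapping, which the sketch does not handle.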


> Paul,
> > name = 1*(< any TEXT except <:>>)
>
> Need to exclude } from names as well, since the :regex bit is
> optional.
>

Yes. I was too lazy to restrict what "TEXT" should be :-)


> > regex = token | quoted-string
> > token = 1*(<any TEXT except <}> that is a regex>)
> > quoted-string = ( <"> 1*(<any TEXT except <"> that is a regex>)
> <"> )
>
> That is ok, but slightly worse than my 1st option I think.
> My 1st option forbids regexes with an unpaired { or }, or nested
> {{…}}s.
> Your syntax forbids regexes with both “ and }.
> I am happy with either restriction as I don’t believe any practical
> regex would be affected. You would have to deliberately design a
> diabolical regex to be affected.
> With my 1st option it is always {name:regex}, with your syntax an
> author has to choose {name:regex} or {name:”regex”}.
> Both syntaxes complicate splitting a @Path value into literals and
> placeholders. However, the splitting can still be done with a
> pattern. I provided a 42-character pattern for my 1st option. The
> following pattern for your syntax is 45 characters long. => No
> significant difference in complexity.
> For yours (ignore spaces): \{ ([^:}]*) (?: : ( (?: (?!”) [^}]* ) |
> (?: “[^”]*” ) ) )? \}
>
>
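
A rough sketch of splitting a @Path value with the pattern above, in Java.
Assumptions: the pattern is written with plain ASCII double quotes and with
the lookahead spelled (?!"), which is a reading of what the message intended;
the class PathSplitSketch and the "literal"/"var" output strings are invented
for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical splitter (not a JAX-RS API) for {name}, {name:regex} and
// {name:"regex"} placeholders.
class PathSplitSketch {
    // Group 1 is the name, group 2 the optional regex (quotes included).
    static final Pattern PLACEHOLDER = Pattern.compile(
        "\\{([^:}]*)(?::((?:(?!\")[^}]*)|(?:\"[^\"]*\")))?\\}");

    static List<String> split(String template) {
        List<String> parts = new ArrayList<>();
        Matcher m = PLACEHOLDER.matcher(template);
        int last = 0; // end of the previous placeholder
        while (m.find()) {
            if (m.start() > last) {
                parts.add("literal " + template.substring(last, m.start()));
            }
            String regex = m.group(2);
            // A leading quote means the quoted-string form: strip both quotes.
            if (regex != null && regex.startsWith("\"")) {
                regex = regex.substring(1, regex.length() - 1);
            }
            parts.add("var " + m.group(1) + (regex == null ? "" : " ~ " + regex));
            last = m.end();
        }
        if (last < template.length()) {
            parts.add("literal " + template.substring(last));
        }
        return parts;
    }
}
```

For instance, split("/items/{id:\\d+}/{name}") yields a literal "/items/",
a variable id with regex \d+, a literal "/", and a variable name with no
regex. The quoted form lets a regex contain '}', as in {x:"a}b"}.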
>
> > what if there are literal characters present in the regex? Do
> those characters need to be percent encoded or not?
>
> NOT.
> encode=true/false currently does not apply to placeholders. That
> shouldn’t change even if a placeholder can contain a regex. As you
> say, parsing a regex is too hard.
>

What about literal characters like space that would be percent
encoded in the URI? Matching on the URI path needs to occur in
encoded space, namely on the URI path in raw form as obtained from
the HTTP request.

If we state that literal characters that would require percent
encoding to be part of the URI SHOULD be percent encoded in the regex
then we might simplify the algorithm to obtain the regex.

For example if we have this URI path:

   %20a%20b

then a matching regex needs to be:

   (%20?%20?)

rather than:

   ( ? ?)
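
To illustrate matching in encoded space, here is a small Java sketch. The
host example.com, the class EncodedMatchSketch and the regexes used are
assumptions chosen for the illustration, not the expressions above. It
relies on java.net.URI, whose getRawPath() returns the path as sent and
whose getPath() returns it with percent-escapes decoded.

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical illustration (not a JAX-RS API) of raw versus decoded matching.
class EncodedMatchSketch {
    // Match the regex against the raw (still percent-encoded) path.
    static boolean matchesRaw(String uri, String regex) {
        return Pattern.matches(regex, URI.create(uri).getRawPath());
    }

    // Match the regex against the decoded path, for comparison only.
    static boolean matchesDecoded(String uri, String regex) {
        return Pattern.matches(regex, URI.create(uri).getPath());
    }
}
```

For the URI http://example.com/%20a%20b, the regex /%20[a-z]%20[a-z]
matches the raw path, while a regex written with literal spaces only
matches the decoded path.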

This means that a literal '{' or '}' should be percent encoded. Thus
there should only be paired open and close braces (if any, nested
or otherwise) after the ':', and we don't require quoting. It also
means there can be white space between the start and end of the
regex. So we could have:

   {name: %20?%20 }
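
A minimal sketch of how such a placeholder might be extracted by counting
braces, assuming every '{' and '}' inside the regex is part of a balanced
pair (literal braces being percent encoded). The class BraceParseSketch and
its parse() method are hypothetical names, and escaped braces such as \{
are deliberately not handled.

```java
// Hypothetical parser (not a JAX-RS API) for {name} and {name: regex }.
class BraceParseSketch {
    // Returns {name, regex} for the placeholder that starts at index start;
    // regex is null when the placeholder has no ':' part.
    static String[] parse(String template, int start) {
        if (template.charAt(start) != '{') {
            throw new IllegalArgumentException("no '{' at index " + start);
        }
        int depth = 0;
        for (int i = start; i < template.length(); i++) {
            char c = template.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                // Found the brace that closes the placeholder.
                String body = template.substring(start + 1, i);
                int colon = body.indexOf(':');
                if (colon < 0) {
                    return new String[] { body.trim(), null };
                }
                // White space around the name and the regex is dropped.
                return new String[] {
                    body.substring(0, colon).trim(),
                    body.substring(colon + 1).trim()
                };
            }
        }
        throw new IllegalArgumentException("unbalanced braces");
    }
}
```

With this, a nested quantifier such as {id: \d{2,4} } parses to the name id
and the regex \d{2,4} without any quoting.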

Paul.

> James Manger
>
> _____________________________________________
> From: Manger, James H
> Sent: Thursday, 3 July 2008 3:44 PM
> To: 'users_at_jsr311.dev.java.net'
> …
> 1) Only allow '{' and '}' chars in the regex as part of '{…}' pairs
> that are not nested.
> 2) Delimit the regex with ` or space (instead of : and }).
> 3) Disallow any '}' chars in the regex.
> 4) Invent an escaping mechanism.
> …
>