Re: URI Escaping and Unescaping

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Thu, 21 Jun 2007 16:12:11 +0200

Julian Reschke wrote:
> Hi,
> I think this kind of proves that the URI template internet draft needs
> to be finished, and then, the relation between JSR-311's templates and
> the ones described by the IETF spec needs to be clarified.


> Right now,
> <>
> seems to speak about URIs, nothing else. So you can't have an unescaped
> blank space, nor non-ASCII characters. It seems to me that this is what
> the current RI should implement.

That seems reasonable.

> Which also means that unescaping of templated values needs to be done in
> a separate step. That may be a bit ugly, but I really prefer that in
> comparison to messing around with the template format.

Just because the template requires escaping (for 'conformance' reasons)
does not mean the template values accessed by the developer need to be
escaped. A similar case can be made for annotations that require a
conformant URI: the path can still be accessed as a decoded value
(URI.getPath).
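
A minimal sketch of that distinction using java.net.URI. The class name
and the example URI are hypothetical; only getRawPath and getPath come
from the JDK:

```java
import java.net.URI;

class DecodedPathDemo {
    // Path exactly as written in the (conformant, escaped) URI.
    static String rawPath(String uri) {
        return URI.create(uri).getRawPath();
    }

    // Same path with percent-escapes decoded for the developer.
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    public static void main(String[] args) {
        String uri = "http://example.com/widgets/blue%20widget";
        System.out.println(rawPath(uri));     // /widgets/blue%20widget
        System.out.println(decodedPath(uri)); // /widgets/blue widget
    }
}
```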

Note that it is possible to support types other than String [1] with
@UriParam, thereby requiring decoding:

   Binds a method parameter, class field or property to a URI template
   parameter value. The class of the annotated parameter, field or
   property must have a constructor that accepts a single String
   argument, or a static method named valueOf that accepts a single
   String argument (see, for example, Integer.valueOf(String)).
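
The conversion rule quoted above could be sketched roughly as follows.
This is an illustration only, not the RI's code: ParamConverter and
convert are hypothetical names, and the sketch assumes the value has
already been decoded before conversion.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class ParamConverter {
    // Converts a decoded template value to the requested type, trying a
    // single-String constructor first and a static valueOf(String) second.
    static <T> T convert(Class<T> type, String value) {
        try {
            try {
                return type.getConstructor(String.class).newInstance(value);
            } catch (NoSuchMethodException noConstructor) {
                Method valueOf = type.getMethod("valueOf", String.class);
                if (!Modifier.isStatic(valueOf.getModifiers())) {
                    throw new IllegalArgumentException(
                        "valueOf(String) is not static on " + type.getName());
                }
                return type.cast(valueOf.invoke(null, value));
            }
        } catch (ReflectiveOperationException e) {
            // Covers a missing method as well as a failure inside it,
            // e.g. a NumberFormatException for a still-escaped value.
            throw new IllegalArgumentException(
                "Cannot convert '" + value + "' to " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert(Integer.class, "42"));
        System.out.println(
            convert(java.util.concurrent.TimeUnit.class, "SECONDS"));
    }
}
```

Passing a value that is still escaped, such as "%34%32", to
convert(Integer.class, ...) fails in this sketch, which is the reason
decoding has to happen before conversion.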

If we don't decode, a developer will spend time debugging a problem only
to find that funny characters are present in the template values. On the
other hand, if we do decode, a developer will spend time debugging URI
creation exceptions. Both are a source of nasty sleeper bugs :-(
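
A small sketch of the second failure mode, using only java.net.URI. The
class, method names and example host are hypothetical:

```java
import java.net.URI;
import java.net.URISyntaxException;

class UriCreationDemo {
    // Naive concatenation: URI.create rejects characters such as a space,
    // so a decoded value can trigger an IllegalArgumentException.
    static boolean naiveCreationFails(String decodedValue) {
        try {
            URI.create("http://example.com/widgets/" + decodedValue);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // The multi-argument constructor quotes illegal characters in the
    // path. It also always quotes '%', so it must be given decoded values.
    static String createFromDecoded(String decodedValue) {
        try {
            return new URI("http", "example.com",
                           "/widgets/" + decodedValue, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(naiveCreationFails("blue widget")); // true
        System.out.println(createFromDecoded("blue widget"));
        // http://example.com/widgets/blue%20widget
    }
}
```

Note that this constructor does not escape "/" inside a value, so it is
not a complete answer for template values on its own.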

However, I think it highly likely that a developer who uses UriParam or
QueryParam wants to do something useful with the value (e.g. use it as a
DB key or in an SQL query), so decoding will have to be done at some
point. Telling the developer to do this themselves seems contrary to a
developer-friendly API. In either case the decoded value might be used
as part of URI creation...

IMHO we need to investigate techniques for URI creation and
manipulation, given that the developer is likely to prefer working with
decoded values (while not pulling the rug from under the developers who
need to work with escaped values). For example, we could expose a URI
template class that supports the parsing and creation of URIs. It may
also be possible to get a URI template from the current resource class
(its own, or the ones for the next matches, e.g. to use for URI
creation) etc.
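
To make the idea concrete, here is a rough sketch of what such a
template class might look like. Everything in it (SimpleUriTemplate,
match, expand, the {name} syntax limited to whole path segments) is an
assumption for illustration, not a proposed API. Parsing hands back
decoded values; creation accepts decoded values and escapes them.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SimpleUriTemplate {
    private static final Pattern VARIABLE = Pattern.compile("\\{(\\w+)\\}");
    private final String template;
    private final List<String> names = new ArrayList<>();
    private final Pattern pathPattern;

    SimpleUriTemplate(String template) {
        this.template = template;
        StringBuilder regex = new StringBuilder();
        Matcher m = VARIABLE.matcher(template);
        int last = 0;
        while (m.find()) {
            // Literal text is quoted; each variable matches one segment.
            regex.append(Pattern.quote(template.substring(last, m.start())));
            regex.append("([^/]+)");
            names.add(m.group(1));
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        pathPattern = Pattern.compile(regex.toString());
    }

    // Parsing: escaped path in, decoded values out (null if no match).
    Map<String, String> match(String rawPath) {
        Matcher m = pathPattern.matcher(rawPath);
        if (!m.matches()) return null;
        Map<String, String> values = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            values.put(names.get(i), decode(m.group(i + 1)));
        }
        return values;
    }

    // Creation: decoded values in, escaped path out.
    String expand(Map<String, String> decodedValues) {
        StringBuilder out = new StringBuilder();
        Matcher m = VARIABLE.matcher(template);
        int last = 0;
        while (m.find()) {
            String value = decodedValues.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for " + m.group(1));
            }
            out.append(template, last, m.start()).append(encode(value));
            last = m.end();
        }
        return out.append(template.substring(last)).toString();
    }

    // Percent-encodes every UTF-8 byte outside the unreserved set.
    static String encode(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            char c = (char) (b & 0xFF);
            if (c < 128 && (Character.isLetterOrDigit(c) || "-._~".indexOf(c) >= 0)) {
                sb.append(c);
            } else {
                sb.append(String.format("%%%02X", b & 0xFF));
            }
        }
        return sb.toString();
    }

    // Decodes %XX escapes as UTF-8; assumes well-formed, ASCII-only input.
    static String decode(String raw) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            if (c == '%') {
                bytes.write(Integer.parseInt(raw.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                bytes.write(c);
            }
        }
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

With the hypothetical template "/widgets/{name}/parts/{id}", matching
"/widgets/blue%20widget/parts/42" would give name "blue widget" and id
"42", while expanding with name "blue widget" and id "a/b" would give
"/widgets/blue%20widget/parts/a%2Fb". The developer only ever sees
decoded values, and escaping stays inside the template class.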



> Best regards, Julian
> (*) we could use IRI templates, instead of URI templates, but then we'll
> still have to take care of characters not allowed in IRIs, such as SP,
> "{", "}", "/" and so on...

| ? + ? = To question
    Paul Sandoz