Re: [Jersey] Shouldn't Jersey decode @Path before matching the regex?

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Mon, 05 Jan 2009 17:48:17 +0100

On Jan 5, 2009, at 5:05 PM, Gili wrote:

>
> "The reason is that a regex parser would be required to support such
> contextual encoding of regex expressions"
>
> What does that mean?

It means that I need something beyond the regex support supplied by
the JDK, so that I can get access to the tokenized regular expression
and determine the sequences of literal characters. In my last email I
meant to say "not an insignificant thing", so we are in agreement :-)
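
To illustrate why the tokenized form matters, here is a minimal sketch
(not Jersey code; the class and the naiveEncode helper are hypothetical)
that percent-encodes spaces in a regex by plain string replacement. It
works for a literal sequence but silently changes the meaning of a
character class, which is the kind of context only a regex parser can
distinguish:

```java
import java.util.regex.Pattern;

class NaiveRegexEncoding {
    // Hypothetical helper: rewrites spaces to %20 without parsing the regex.
    static String naiveEncode(String regex) {
        return regex.replace(" ", "%20");
    }

    // Convenience wrapper: does the whole input match the regex?
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Literal sequence: "a b" becomes "a%20b" and matches the encoded path.
        System.out.println(matches(naiveEncode("a b"), "a%20b"));

        // Character class: "[a b]" becomes "[a%20b]", which now means
        // "one of a, %, 2, 0, b" instead of "a, space or b".
        System.out.println(matches(naiveEncode("[a b]"), "%20")); // three chars, no match
        System.out.println(matches(naiveEncode("[a b]"), "2"));   // unintended match
    }
}
```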

It is not possible to decode before matching because you lose
information. Matching must happen in encoded space. For example, what
if a path segment of the request URI contains percent-encoded '/'
characters, e.g. because another URI is embedded in that path segment?
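
A small sketch of that loss, using only java.net.URI and
java.util.regex from the JDK. The host names, the /proxy/ prefix and
the class name are made up for illustration; this is not how Jersey
does its matching, it only shows the effect of decoding first:

```java
import java.net.URI;
import java.util.regex.Pattern;

class DecodeBeforeMatch {
    // Counts the non-empty segments of a path split on '/'.
    static int segmentCount(String path) {
        int n = 0;
        for (String s : path.split("/")) {
            if (!s.isEmpty()) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // Assumed request URI: a second URI embedded in the last path segment.
        URI uri = URI.create(
            "http://example.com/proxy/http%3A%2F%2Fother.example%2Fx");

        String raw = uri.getRawPath();  // percent-escapes kept
        String decoded = uri.getPath(); // percent-escapes decoded

        // Encoded space: two segments, as the client intended.
        System.out.println(segmentCount(raw));
        // Decoded space: the embedded '/' characters create extra segments.
        System.out.println(segmentCount(decoded));

        // A template such as "/proxy/{x}" defaults to one segment, i.e. [^/]+
        Pattern oneSegment = Pattern.compile("/proxy/[^/]+");
        System.out.println(oneSegment.matcher(raw).matches());
        System.out.println(oneSegment.matcher(decoded).matches());
    }
}
```

Once decoded, the embedded slashes cannot be told apart from real
segment separators, so the single-segment pattern no longer matches.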

Paul.

> You expect Sun's regex engine to somehow figure out that %20 is a
> whitespace character? That seems highly unlikely :) For one, there is
> no way to provide a regex engine with a context. As for being easy to
> implement I don't see how. It would be far easier to decode the text
> before running it through a regex engine than trying to modify the
> regex expression to take encoding into consideration. Things like \s
> are an obvious example of that. Instead of using \s in the regex I'd
> have to manually specify 4-5 different characters every time. That
> doesn't help readability either...
>
> Gili
>
>
> Paul Sandoz wrote:
>>
>> URI template variable specifications are ignored:
>>
>> https://jsr311.dev.java.net/nonav/releases/1.0/spec/spec3.html#x3-370003.7.3
>>
>> The reason is that a regex parser would be required to support such
>> contextual encoding of regex expressions and that is not an
>> significant thing to implement or reuse.
>>
>> Paul.
>>
>> On Dec 29, 2008, at 7:53 PM, Gili wrote:
>>
>>>
>>> Say I want to match the following path: "http://example.com/a b"
>>>
>>> @Path("{tag:a b}") fails but
>>> @Path("{tag:a%20b}") works
>>>
>>> This is a problem because it means I can't use \\s in the regular
>>> expression, allowing me to match arbitrary whitespace. Shouldn't
>>> Jersey be decoding the path before running the regex against it?
>>>
>>> According to the specification page 12: "The value of the annotation
>>> is automatically encoded" page 34 "Encoded: Disables automatic URI
>>> decoding for path, query, form and matrix parameters" which implies
>>> that URI decoding should be taking place.
>>>
>>> Gili
>>> --
>>> View this message in context:
>>> http://n2.nabble.com/Shouldn%27t-Jersey-decode-%40Path-before-matching-the-regex--tp2089825p2089825.html
>>> Sent from the Jersey mailing list archive at Nabble.com.
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: users-unsubscribe_at_jersey.dev.java.net
>>> For additional commands, e-mail: users-help_at_jersey.dev.java.net
>>>
>
> --
> View this message in context: http://n2.nabble.com/Shouldn%27t-Jersey-decode-%40Path-before-matching-the-regex--tp2089825p2113472.html
> Sent from the Jersey mailing list archive at Nabble.com.