Re: URI resolving going forward

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Tue, 13 Nov 2007 23:29:45 +0100

Hi Frank,

I have completed converting over to the new rules/rule match/accept
interfaces and impls in the branch. All unit tests pass, so it has fixed
that rather tricky bug of matching sub-methods with the same regexes but
with different template names. The end result is, I think, much cleaner
and more adaptable.

It should be fairly easy to convert the TrieUriPathResolver to, say, an
'AutomataMatchingUriTemplateRules' in the com.sun.ws.rest.impl.uri.rules
package; see LinearMatchingUriTemplateRules for reference. A test can be
added by copying the class LinearMatchingTest.

For now we can keep it to just zero or one match, like the linear
algorithm. Notice that there is no need to support a map of template
name to values or the right hand path (since this is just the last
optional capturing group, either '(/.*)?' or '(/)?'), so just a list
for the capturing group values is required. I am not sure if this
simplifies things. I notice you are currently depending on URI template
match functionality to refine the matching and get the template values;
perhaps once modified this is no longer required?

In any case we can start working on separating UriTemplateType from the
automata implementation by creating a simple model of a regular
expression. Instances of this model would be generated from the URI
template and additional runtime information, like the @UriTemplate
'limited' value, or for supporting distinct URLs, for example ending in
".xml" or ".json". The end result should be that UriRules
implementations are independent of the context in which the patterns
have been created.
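
As a rough illustration of generating such a regex from a template plus
runtime information, here is a sketch that only handles an optional
literal suffix. The class SimpleRegexSketch and the method toRegex are
hypothetical names, and the choice of '([^/]+?)' per template variable
is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SimpleRegexSketch {
    private static final Pattern VARIABLE = Pattern.compile("\\{([^}]+)\\}");

    // Quote a literal chunk of the template, skipping empty chunks.
    private static void appendLiteral(StringBuilder sb, String literal) {
        if (literal.length() > 0) {
            sb.append(Pattern.quote(literal));
        }
    }

    /**
     * Builds a regex string from a URI template. Literal text is
     * quoted, each {name} becomes one capturing group, an optional
     * literal suffix (for example ".xml") is appended, and the regex
     * ends with the optional right hand path group.
     */
    public static String toRegex(String template, String suffix) {
        StringBuilder sb = new StringBuilder();
        Matcher m = VARIABLE.matcher(template);
        int last = 0;
        while (m.find()) {
            appendLiteral(sb, template.substring(last, m.start()));
            sb.append("([^/]+?)");
            last = m.end();
        }
        appendLiteral(sb, template.substring(last));
        if (suffix != null) {
            appendLiteral(sb, suffix);
        }
        sb.append("(/.*)?");
        return sb.toString();
    }

    public static void main(String[] args) {
        String xml = toRegex("/widgets/{id}", ".xml");
        System.out.println(Pattern.matches(xml, "/widgets/42.xml"));
        System.out.println(Pattern.matches(xml, "/widgets/42.json"));
    }
}
```

A rules implementation that only ever sees the resulting regex does not
need to know whether it came from a template, a suffix, or both.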

Actually it just occurred to me that we can simplify UriRules by
removing the 'add' and 'getRules' methods; these can be part of an
abstract class. That way it should be possible to initialize things
correctly without having to rely on synchronization logic as you
currently require. I will wait until things have been converted over
before making this change.
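
Roughly what I have in mind, as a sketch only: the interface keeps just
the matching method, and an abstract class owns 'add' and 'getRules' so
that an instance is populated once during construction. All names and
signatures below (RulesInitSketch, the match-only UriRules, Entry,
AbstractUriRules, LinearRules) are hypothetical and will likely differ
from what ends up in the branch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Pattern;

public class RulesInitSketch {
    /** Hypothetical match-only interface; the real one may differ. */
    public interface UriRules<R> {
        Iterator<R> match(String path);
    }

    /** A compiled pattern paired with its rule. */
    public static final class Entry<R> {
        final Pattern pattern;
        final R rule;

        Entry(Pattern pattern, R rule) {
            this.pattern = pattern;
            this.rule = rule;
        }
    }

    /** Hypothetical abstract base owning 'add' and 'getRules'. */
    public static abstract class AbstractUriRules<R> implements UriRules<R> {
        private final List<Entry<R>> entries = new ArrayList<Entry<R>>();

        protected final void add(String regex, R rule) {
            entries.add(new Entry<R>(Pattern.compile(regex), rule));
        }

        protected final List<Entry<R>> getRules() {
            return Collections.unmodifiableList(entries);
        }
    }

    /** Linear matching: filled in the constructor, zero or one match. */
    public static final class LinearRules<R> extends AbstractUriRules<R> {
        public LinearRules(List<String> regexes, List<R> rules) {
            for (int i = 0; i < regexes.size(); i++) {
                add(regexes.get(i), rules.get(i));
            }
        }

        public Iterator<R> match(String path) {
            for (Entry<R> e : getRules()) {
                if (e.pattern.matcher(path).matches()) {
                    return Collections.singletonList(e.rule).iterator();
                }
            }
            return Collections.<R>emptyList().iterator();
        }
    }

    public static void main(String[] args) {
        UriRules<String> rules = new LinearRules<String>(
                Arrays.asList("/a(/.*)?", "/b(/.*)?"),
                Arrays.asList("ruleA", "ruleB"));
        System.out.println(rules.match("/b/x").next());
        System.out.println(rules.match("/c").hasNext());
    }
}
```

Since nothing is added after construction in this sketch, match() only
ever reads the list, which is the property I am hoping removes the need
for the synchronization logic.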

Paul.

Paul Sandoz wrote:
> Hi,
>
> Below are proposed interfaces for abstracting out URI path processing
> along lines similar to Phobos.
>
> The basic looping is as follows:
>
> RuleContext c = ...
> Rules rules = ...
> Iterator<Rule> i = rules.match(path, c);
> while (i.hasNext()) {
>     if (i.next().accept(path, c))
>         break;
> }
>
> or:
>
> RuleContext c = ...
> Rules rules = ...
> for (Rule rule : rules.match(path, c))
>     if (rule.accept(path, c))
>         break;
>
> This should enable optimized matching implementations, and separate
> the list of matching groups of a regex from the URI template the
> regex was generated from (if there is any such association).
>
> The efficient automaton algorithm works from limited regular
> expressions produced from the UriTemplateType. I think we can separate
> out the two
> by having a Regex class that provides a model (or complex regexes for
> other cases) produced from a URI template and the runtime. This should
> also enable us to better generate and support simple regexes for
> distinct URI matching that are generated from a canonical URI template.
>
> Paul.
>
> /**
>  * A rule, which operates on a URI path.
>  *
>  * @author Paul.Sandoz_at_Sun.Com
>  */
> public interface Rule {
>     /**
>      * Accept the rule.
>      *
>      * @param path the URI path
>      * @param context the rule context
>      * @return if true then the rule was accepted,
>      *         otherwise if false then the rule was
>      *         not accepted.
>      */
>     boolean accept(String path, RuleContext context);
> }
>
> /**
>  * An ordered collection of rules.
>  * <p>
>  * Each rule is associated with a regular expression.
>  * <p>
>  * The collection of rules is matched against a URI path.
>  *
>  * @author Paul.Sandoz_at_Sun.Com
>  */
> public interface Rules {
>     /**
>      * Add a rule to the end of the collection of existing rules,
>      * and associate the rule with a regular expression.
>      *
>      * @param r the rule to associate with the regular expression.
>      * @param regex the regular expression to be used for matching.
>      */
>     void add(Rule r, Regex regex);
>
>     /**
>      * Iterate over the available matching rules.
>      *
>      * @param path the URI path to be matched
>      * @param c the rule context
>      * @return an iterator of matching rules
>      */
>     Iterator<Rule> match(String path, RuleContext c);
> }
>
> /**
>  *
>  * @author Paul.Sandoz_at_Sun.Com
>  */
> public interface RuleContext {
>
>     HttpContextAccess getHttpContext();
>
>     HttpRequestContext getHttpRequestContext();
>
>     HttpResponseContext getHttpResponseContext();
>
>     /**
>      * Match and accept the sub rules associated with a class.
>      *
>      */
>     boolean subRules(Class nodeClass, StringBuilder path);
>
>     /**
>      * Match and accept the sub rules associated with an object.
>      *
>      */
>     boolean subRules(Object node, StringBuilder path);
>
>     List<String> getMatchingGroups();
> }
>
> Paul Sandoz wrote:
>> Hi,
>>
>> Below are the proposed steps going forward for URI path resolving:
>>
>> 1) Change URI path resolving to work from a simple regular expression
>> model (generated from a URI template or otherwise). This is so we can
>> fix issue 1 [1] and can easily support distinct URIs with suffixes
>> associated with media type, language etc that we have been discussing
>> in the EG.
>>
>> 2) Integrate the automaton (trie) resolver into the trunk (keeping the
>> linear resolver in the trunk too for backup, perhaps we should have a
>> runtime option on which to use?) and ensure it works with changes
>> introduced for 1).
>>
>> 3) Change URI path resolving to the same model as Phobos. I have been
>> talking with Roberto (the Phobos guy) and he convinced me that the
>> model Phobos uses is more flexible than what Jersey currently has and
>> should result in more simplicity when we need to add more features.
>> Basically Phobos has a list of rules, each rule contains a regular
>> expression and an accepting function. The basic algorithm is as
>> follows:
>>
>> for rule in rules:
>>     if rule.regexp.match(url) and rule.accept(url):
>>         break
>>
>> So a match is just the first stage from which the rule can decide
>> whether it accepts the match or whether matching should continue.
>>
>> This allows us to have the following rules associated with a resource
>> class:
>>
>> - a head rule to dynamically detect view templates (like JSPs or
>> Velocity templates) for a resource class that are not explicitly
>> declared on the resource class. Same goes for static content;
>>
>> - two resource classes with the same URIs but supporting different
>> representations e.g. implement XML then implement Atom without
>> changing existing code. The first could return a not acceptable
>> response and therefore it is a non-accepting rule and it passes it
>> over to the next one; and
>>
>> - return a customized 404 response. A tail rule could return a
>> particular 404 response (or any other error-based response) for a
>> resource class.
>>
>>
>> To ensure existing code works we can integrate the UriPathResolver as
>> one rule. That way the trie algorithm will still work and then we
>> can think how to transition it over to a more general rule based
>> implementation.
>>
>>
>> Sound like a plan? Any opinions either way on this approach?
>>
>> Paul.
>>
>> [1] https://jersey.dev.java.net/issues/show_bug.cgi?id=1
>>
>

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109