Paul Sandoz wrote:
>>>> - Do you parse the qname string to obtain the prefix? If so i can pass
>>>> the prefix as a separate string as this will be more efficient for
>>>> JAXB and FI.
>
> Are you talking about qnames in content?
>
> I was referring to the methods:
>
> startElement(String nsUri, String localName, String qname,
> Attributes atts)
>
> endElement(String nsUri, String localName, String qname)
>
> and the 'qname' parameter.
Ah. No, we never look at the prefix. The only use for this QName is for
building DOM. Passing this parameter is usually cheap for SAX, but it
might be costly for FI.
It would be nice if we could pass this information only when it's
necessary. Maybe that's the kind of situation where the pull
unmarshaller performs better.
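
To make the "only when necessary" point concrete, here is a rough sketch of a
SAX-style handler that touches the qname only while a DOM subtree is being
built. The class name, the buildingDom flag and the counters are hypothetical
and exist only for illustration; this is not how the JAXB RI is actually
written.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: reads qname only while a DOM subtree is being built.
class QNameOnDemandHandler extends DefaultHandler {
    boolean buildingDom;   // assumption: set by the unmarshaller around DOM-bound regions
    int qnameReads;        // counts how often the qname was actually needed
    String lastPrefix;     // prefix recovered from the most recent qname

    @Override
    public void startElement(String nsUri, String localName, String qname, Attributes atts) {
        if (buildingDom) {
            // Only the DOM path cares about the prefix.
            qnameReads++;
            int colon = qname.indexOf(':');
            lastPrefix = colon < 0 ? "" : qname.substring(0, colon);
        }
        // Normal binding path: dispatch on (nsUri, localName) only.
    }
}
```

With a push parser the qname still has to be produced for every call; a pull
style would let the consumer ask for it only inside the buildingDom branch.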
> i presume in areas where there could be Base64Data because of an annotation.
>
> I thought it would be more efficient to not have to go through the
> specific data type classes and fields could be set directly if the
> algorithm data corresponds to the Java type.
You can reuse the instances of those typed CharSequences, so when the
expectation and the actual data match up, the cost is just setting the
value on the wrapper and getting it back from the wrapper, which I hope
shouldn't be too bad.
When the expectation and the actual data don't match up, being able to
treat them all as CharSequence always helps.
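
A minimal sketch of what I mean by a reusable typed CharSequence. IntData and
TextConsumer here are made-up names for illustration, not the actual classes:

```java
// Hypothetical reusable wrapper: carries a typed int, but still reads as text.
final class IntData implements CharSequence {
    private int value;
    private String text;   // lexical form, computed only if somebody asks for it

    void set(int v) { value = v; text = null; }
    int intValue() { return value; }

    private String text() {
        if (text == null) text = Integer.toString(value);
        return text;
    }

    @Override public int length() { return text().length(); }
    @Override public char charAt(int i) { return text().charAt(i); }
    @Override public CharSequence subSequence(int s, int e) { return text().subSequence(s, e); }
    @Override public String toString() { return text(); }
}

final class TextConsumer {
    // Expectation matches the data: read the typed value, no parsing.
    // Expectation does not match: fall back to the CharSequence view.
    static int toInt(CharSequence cs) {
        if (cs instanceof IntData) return ((IntData) cs).intValue();
        return Integer.parseInt(cs.toString().trim());
    }

    static String toText(CharSequence cs) { return cs.toString(); }
}
```

The parser would keep one IntData instance and call set() for each value, so
the matching case costs one field write and one field read.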
> So when there is a text event the parser can call expectText and if it
> is false check if the characters are white space and if so skip.
You don't even need to check the characters==whitespace if you don't
want to. We can silently ignore any misplaced text.
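
Roughly, the two options look like this on the parser side. UnmarshalTarget,
TextDispatcher and the checkWhitespace switch are hypothetical names; only
expectText comes from the discussion above. Forwarding unexpected
non-whitespace text in the checking mode is an assumption of this sketch.

```java
// Hypothetical callback surface; only expectText is taken from this thread.
interface UnmarshalTarget {
    boolean expectText();
    void text(CharSequence cs);
}

final class TextDispatcher {
    // Returns true if the text was handed to the target, false if it was skipped.
    static boolean deliver(UnmarshalTarget target, CharSequence cs, boolean checkWhitespace) {
        if (target.expectText()) {
            target.text(cs);
            return true;
        }
        if (!checkWhitespace) {
            return false;          // skip without inspecting the characters at all
        }
        if (isWhitespace(cs)) {
            return false;          // skip, but only after paying for the scan
        }
        target.text(cs);           // assumption: let the target ignore it itself
        return true;
    }

    // XML whitespace: space, tab, carriage return, line feed.
    static boolean isWhitespace(CharSequence cs) {
        for (int i = 0; i < cs.length(); i++) {
            char c = cs.charAt(i);
            if (c != ' ' && c != '\t' && c != '\r' && c != '\n') return false;
        }
        return true;
    }
}
```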
> I am wondering if it would be efficient to have a method that combines
> an element event with a text event. Since for a lot of binding cases
> this type of pattern will occur:
>
> <e>foo</e>
> <e>bar</e>
> <e>baz</e>
This is a possibility that we should consider. At first glance,
however, if we ask the parser to recognize this pattern, that might be
costly enough to cancel any benefit.
For example, in SAX, to do this you need to hold back at least two
events, plus a buffer copy, and you also need to copy the attributes of
<e> in case what you eventually see is <e>foo<e>...
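
A sketch of the bookkeeping, assuming a plain SAX handler sitting in front of
the unmarshaller. LeafCombiner and its string-based "out" list are
hypothetical stand-ins for the real downstream callbacks:

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical filter that tries to turn <e>text</e> into one combined event.
class LeafCombiner extends DefaultHandler {
    final List<String> out = new ArrayList<>();   // stand-in for downstream callbacks

    private String heldName;                      // held-back startElement, if any
    private AttributesImpl heldAtts;              // copy: the parser may reuse its Attributes
    private final StringBuilder heldText = new StringBuilder();   // buffer copy of the text

    @Override
    public void startElement(String nsUri, String localName, String qname, Attributes atts) {
        flushHeld();                              // the previous element was not a simple leaf
        heldName = localName;
        heldAtts = new AttributesImpl(atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (heldName != null) heldText.append(ch, start, length);
        else out.add("text:" + new String(ch, start, length));
    }

    @Override
    public void endElement(String nsUri, String localName, String qname) {
        if (heldName != null) {
            out.add("leaf:" + heldName + "=" + heldText);   // the combined event
            heldName = null;
            heldText.setLength(0);
        } else {
            out.add("end:" + localName);
        }
    }

    // Replays the held-back events separately when the pattern did not match.
    private void flushHeld() {
        if (heldName == null) return;
        out.add("start:" + heldName + "[" + heldAtts.getLength() + " atts]");
        if (heldText.length() > 0) out.add("text:" + heldText);
        heldName = null;
        heldText.setLength(0);
    }
}
```

Every element pays for the attribute copy and the text copy, even when the
combined event is never produced, which is the cost I am worried about.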
>> I don't know if this applies to FI --- maybe it normally doesn't have
>> whitespaces for indentation.
>>
>
> It will if it is in the infoset. Usually such strings will be small and
> indexed so we could have a special flag to indicate white space so that
> we do not need to keep rechecking. For SOAP messages there is unlikely
> to be any indentation for efficiency reasons so i am not sure it is
> worth the additional effort to cache whitespace information.
>
> Paul.
>
--
Kohsuke Kawaguchi
Sun Microsystems kohsuke.kawaguchi_at_sun.com