dev@fi.java.net

Re: JAXB Hook for FI

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Wed, 11 May 2005 10:57:37 +0200

Kohsuke Kawaguchi wrote:
>> - Do you parse the qname string to obtain the prefix? If so i can pass
>> the prefix as a separate string as this will be more efficient for
>> JAXB and FI.
>
>
> We can always invent custom CharSequence implementations like
> Base64Data. For example we can have QNameData, and we can have our QName
> unmarshalling code recognize it as such.
>
> I still need QNameData to implement CharSequence so that it can be
> treated as literal PCDATA when necessary.
>

Are you talking about qnames in content?

I was referring to the methods:

     startElement(String nsUri, String localName, String qname,
        Attributes atts)

     endElement(String nsUri, String localName, String qname)

and the 'qname' parameter.


>> - For the text method would it be possible to pass in a character array
>> just like for SAX. The FI parser could wrap its character buffer
>> around a CharSequence impl but access to a direct array would be
>> faster. Although perhaps the CharSequence abstraction is used
>> throughout JAXB for parsing of content making such support difficult?
>
>
> Making it (char[],int,int) will make it impossible to overload typed
> CharSequence impls like Base64Data. So this was done deliberately.
>
> I think in a typical schema, a lot of datatypes are eventually bound to
> java.lang.String, so as long as CharSequence.toString() is efficient, it
> should be OK.
>
>> - We can have other methods for int[], float[] etc. All optimized data
>> types will be arrays even if only one value is present, will this be
>> an issue?
>
>
> I think it's fine. The unmarshaller for int can report an error if
> there's more than one entry in the array. Why don't we create
> IntArrayData for int[] and see if that works OK?
>
> I'll write one and send it for your review.
>

OK, I see the model you are using now. You do:

     if (pcdata instanceof Base64Data)

I presume, in areas where there could be Base64Data because of an
annotation.

I thought it would be more efficient to bypass the specific data type
classes, so that fields could be set directly whenever the algorithm
data corresponds to the Java type.
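
For illustration, here is a minimal sketch of the instanceof model
applied to the IntArrayData idea mentioned above. Only the name
IntArrayData comes from this thread; its shape, the values() accessor
and the IntUnmarshaller class are assumptions made up for this example,
not JAXB or FI code.

```java
/** Hypothetical typed CharSequence that carries an int[] alongside its lexical form. */
final class IntArrayData implements CharSequence {
    private final int[] values;
    private String lexical; // space-separated form, built only when needed

    IntArrayData(int[] values) { this.values = values; }

    /** Direct access for consumers that understand the typed form. */
    int[] values() { return values; }

    private String lexical() {
        if (lexical == null) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.length; i++) {
                if (i > 0) sb.append(' ');
                sb.append(values[i]);
            }
            lexical = sb.toString();
        }
        return lexical;
    }

    @Override public int length() { return lexical().length(); }
    @Override public char charAt(int index) { return lexical().charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return lexical().subSequence(start, end);
    }
    @Override public String toString() { return lexical(); }
}

/** Hypothetical unmarshaller for a single int value. */
final class IntUnmarshaller {
    static int parseInt(CharSequence pcdata) {
        if (pcdata instanceof IntArrayData) {
            // Typed path: no string conversion, but the array must hold one entry.
            int[] v = ((IntArrayData) pcdata).values();
            if (v.length != 1) {
                throw new IllegalArgumentException(
                        "expected exactly one int, got " + v.length);
            }
            return v[0];
        }
        // Fallback path: treat the data as literal PCDATA.
        return Integer.parseInt(pcdata.toString().trim());
    }
}
```

The typed path never builds the string, which is the saving in question;
the CharSequence methods are only there so the same object can still be
treated as literal PCDATA.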


>> When i first looked at the 'expectText' i thought it meant to indicate
>> that text was expected instead of other forms of data like an array of
>> integer. But reading the JavaDoc this is not the case. Is the
>> expectText method most likely to return true for mixed content?
>> Perhaps it would be clearer to rename it 'expectWhiteSpaceText'?
>
>
> It will definitely return true for the mixed content, but usually false
> if the unmarshaller is expecting the element-only content model.
>
> It's not a method that returns true when it's expecting whitespace; it's
> a method that returns true if it's expecting text other than whitespace.
> In that sense I hope the current name is fine.
>
> The idea here is that in SAX, JAXB will have to spend a good amount of
> time copying and routing whitespace texts inside only to be thrown away
> in the end. Knowing in advance that we can ignore whitespace helps the
> performance.
>

So when there is a text event the parser can call expectText and, if it
returns false, check whether the characters are all white space and, if
so, skip them.
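
A rough sketch of that check on the parser side follows. TextSink is a
hypothetical stand-in for whatever interface JAXB exposes (only the
expectText name is from this thread), and the whitespace test assumes
the four XML whitespace characters.

```java
/** Hypothetical stand-in for the JAXB-side receiver of text events. */
interface TextSink {
    /** True if the unmarshaller expects text other than whitespace here. */
    boolean expectText();

    void text(CharSequence pcdata);
}

final class WhitespaceFilter {
    /** True if every character in the range is XML whitespace. */
    static boolean isWhitespace(char[] ch, int start, int length) {
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }

    /**
     * Forwards the text unless the sink does not expect text and the
     * characters are all whitespace. Returns true if the text was forwarded.
     */
    static boolean deliver(TextSink sink, char[] ch, int start, int length) {
        if (!sink.expectText() && isWhitespace(ch, start, length)) {
            return false; // skipped: no copy, no routing
        }
        sink.text(new String(ch, start, length));
        return true;
    }
}
```

Non-whitespace text is still forwarded when expectText is false, so the
unmarshaller can report the error itself.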


I am wondering if it would be efficient to have a method that combines
an element event with a text event, since in a lot of binding cases
this type of pattern will occur:

<e>foo</e>
<e>bar</e>
<e>baz</e>
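
One possible shape for such a combined event, purely as a sketch: the
ElementVisitor interface and the leafElement name are invented here,
attributes are left out for brevity, and the default implementation just
falls back to the three separate events.

```java
/** Hypothetical event interface; attributes omitted to keep the sketch short. */
interface ElementVisitor {
    void startElement(String nsUri, String localName, String prefix);

    void text(CharSequence pcdata);

    void endElement(String nsUri, String localName, String prefix);

    /**
     * Combined event for an element whose only content is text.
     * Receivers that gain nothing from it inherit this decomposition.
     */
    default void leafElement(String nsUri, String localName, String prefix,
                             CharSequence pcdata) {
        startElement(nsUri, localName, prefix);
        text(pcdata);
        endElement(nsUri, localName, prefix);
    }
}

/** Records the events it receives, to show what the fallback produces. */
final class TraceVisitor implements ElementVisitor {
    final StringBuilder trace = new StringBuilder();

    @Override public void startElement(String nsUri, String localName, String prefix) {
        trace.append('<').append(localName).append('>');
    }

    @Override public void text(CharSequence pcdata) {
        trace.append(pcdata);
    }

    @Override public void endElement(String nsUri, String localName, String prefix) {
        trace.append("</").append(localName).append('>');
    }
}
```

A receiver that can bind the value directly would override leafElement
and avoid the separate start, text and end dispatches.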


> I don't know if this applies to FI --- maybe it normally doesn't have
> whitespaces for indentation.
>

It will if it is in the infoset. Usually such strings will be small and
indexed, so we could have a special flag to indicate white space so
that we do not need to keep rechecking. For SOAP messages there is
unlikely to be any indentation, for efficiency reasons, so I am not sure
it is worth the additional effort to cache whitespace information.

Paul.

-- 
| ? + ? = To question
----------------\
    Paul Sandoz
         x38109
+33-4-76188109