Re: JAXB Hook for FI

From: Kohsuke Kawaguchi <Kohsuke.Kawaguchi_at_Sun.COM>
Date: Tue, 10 May 2005 10:53:40 -0700

Paul Sandoz wrote:
> Hi,
>
> Sorry, it has taken me far too long to respond to this mail.

No problem.

> One of the general problems we will have to tackle is the push me pull
> me creature that is JAX-RPC + JAXB. JAX-RPC processing of SOAP messages
> is done using a pull model. I could write a rather odd parser that
> allows for a pull model via StAX for JAX-RPC to process the SOAP message
> infoset and a push model for JAXB to process the content of the SOAP
> message (header blocks, body and fault detail children). In some
> respects it all depends on how separate we can be for XML and FI
> processing. It may be good to reuse all SOAP message infoset level
> processing but if the code is not that much then specific access to SOAP
> infoset produced from FI using a very simple pull API may only be
> required + such an API will never be exposed to developers.
>
>
> On the specifics about the proposed interface:
>
> - Do you parse the qname string to obtain the prefix? If so i can pass
> the prefix as a separate string as this will be more efficient for
> JAXB and FI.

We can always invent custom CharSequence implementations like
Base64Data. For example we can have QNameData, and we can have our QName
unmarshalling code recognize it as such.

I still need QNameData to implement CharSequence so that it can be
treated as literal PCDATA when necessary.
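
A minimal sketch of that idea, under stated assumptions: QNameData is only a name proposed in this thread, so the class shape, its accessors, and the QNameTextDemo helper below are hypothetical and not existing JAXB or FI API. The point it illustrates is that the parser can hand over the prefix pre-split, while any code that only wants literal PCDATA still sees an ordinary CharSequence.

```java
// Hypothetical sketch: QNameData is only proposed in this thread, not an existing class.
final class QNameData implements CharSequence {
    private final String prefix;     // empty string when there is no prefix
    private final String localName;
    private String literal;          // lazily built "prefix:localName" form

    QNameData(String prefix, String localName) {
        this.prefix = prefix == null ? "" : prefix;
        this.localName = localName;
    }

    String getPrefix() { return prefix; }
    String getLocalName() { return localName; }

    // CharSequence view, so the value can still be treated as literal PCDATA.
    @Override public int length() { return toString().length(); }
    @Override public char charAt(int index) { return toString().charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return toString().subSequence(start, end);
    }
    @Override public String toString() {
        if (literal == null) {
            literal = prefix.isEmpty() ? localName : prefix + ':' + localName;
        }
        return literal;
    }
}

// Hypothetical consumer: recognizes QNameData, otherwise parses the text.
final class QNameTextDemo {
    static String prefixOf(CharSequence text) {
        if (text instanceof QNameData) {
            return ((QNameData) text).getPrefix();   // fast path, no string parsing
        }
        String s = text.toString();
        int colon = s.indexOf(':');
        return colon < 0 ? "" : s.substring(0, colon);
    }
}
```

With this shape, a plain XML parser can keep passing "xs:int" as a String, and an FI parser can pass new QNameData("xs", "int") through the same method.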

> - For the text method would it be possible to pass in a character array
> just like for SAX. The FI parser could wrap its character buffer
> around a CharSequence impl but access to a direct array would be
> faster. Although perhaps the CharSequence abstraction is used
> throughout JAXB for parsing of content making such support difficult?

Making it (char[],int,int) will make it impossible to overload typed
CharSequence impls like Base64Data. So this was done deliberately.

I think in a typical schema, a lot of datatypes are eventually bound to
java.lang.String, so as long as CharSequence.toString() is efficient, it
should be OK.
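
As a rough illustration of the String case, assuming a single text(CharSequence) callback: the TextHandler interface, StringBinder, and fire method below are hypothetical stand-ins, not the proposed interface itself. java.nio.CharBuffer is standard, and CharBuffer.wrap(char[], int, int) gives a CharSequence view over a parser's buffer whose toString() copies just that slice.

```java
import java.nio.CharBuffer;

final class CharSequenceTextDemo {
    /** Hypothetical stand-in for the single text callback discussed above. */
    interface TextHandler {
        void text(CharSequence pcdata);
    }

    /** Binds PCDATA to java.lang.String, which is the common case. */
    static final class StringBinder implements TextHandler {
        String value;
        @Override public void text(CharSequence pcdata) {
            value = pcdata.toString();   // one copy, made only when a String is needed
        }
    }

    /** What a parser might do: expose a slice of its own buffer as a CharSequence. */
    static void fire(TextHandler handler, char[] buf, int start, int len) {
        handler.text(CharBuffer.wrap(buf, start, len));
    }
}
```

The same text(CharSequence) method can then also receive typed implementations such as Base64Data, which a (char[],int,int) signature could not carry.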

> - We can have other methods for int[], float[] etc. All optimized data
> types will be arrays even if only one value is present, will this be
> an issue?

I think it's fine. The unmarshaller for int can report an error if
there's more than one entry in the array. Why don't we create
IntArrayData for int[] and see if that works OK?

I'll write one and send it for your review.
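
For discussion only, here is one possible shape, not the version promised above: IntArrayData and IntUnmarshaller are hypothetical names, and rendering the literal form as space-separated values is an assumption. It shows the single-int unmarshaller rejecting an array with more than one entry.

```java
// Hypothetical sketch of the IntArrayData idea; not an existing JAXB or FI class.
final class IntArrayData implements CharSequence {
    private final int[] values;
    private String literal;   // lazily built space-separated form (an assumption)

    IntArrayData(int[] values) { this.values = values.clone(); }

    int size() { return values.length; }
    int get(int index) { return values[index]; }

    // CharSequence view so the data can still be handled as literal PCDATA.
    @Override public int length() { return toString().length(); }
    @Override public char charAt(int index) { return toString().charAt(index); }
    @Override public CharSequence subSequence(int start, int end) {
        return toString().subSequence(start, end);
    }
    @Override public String toString() {
        if (literal == null) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < values.length; i++) {
                if (i > 0) sb.append(' ');
                sb.append(values[i]);
            }
            literal = sb.toString();
        }
        return literal;
    }
}

// Hypothetical unmarshaller for a single int value.
final class IntUnmarshaller {
    static int parseInt(CharSequence text) {
        if (text instanceof IntArrayData) {
            IntArrayData data = (IntArrayData) text;
            if (data.size() != 1) {
                throw new IllegalArgumentException(
                        "expected exactly one int but found " + data.size());
            }
            return data.get(0);   // typed fast path, no string parsing
        }
        return Integer.parseInt(text.toString().trim());
    }
}
```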

> When i first looked at the 'expectText' i thought it meant to indicate
> that text was expected instead of other forms of data like an array of
> integer. But reading the JavaDoc this is not the case. Is the expectText
> method most likely to return true for mixed content? Perhaps it would be
> clearer to rename it 'expectWhiteSpaceText'?

It will definitely return true for mixed content, but usually false
if the unmarshaller is expecting an element-only content model.

It's not a method that returns true when it's expecting whitespace; it's
a method that returns true if it's expecting text other than whitespace.
In that sense I hope the current name is fine.

The idea here is that with SAX, JAXB has to spend a good amount of
time copying and routing whitespace text internally, only to throw it
away in the end. Knowing in advance that we can ignore whitespace helps
performance.
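
A small sketch of how a parser could use this, assuming a visitor with expectText() and text(CharSequence): the Visitor interface and the deliver helper are hypothetical and only mirror the behaviour described above. Whitespace-only text is dropped before any copying when the unmarshaller is not expecting text; anything else is still forwarded so that the unmarshaller can report it.

```java
final class ExpectTextDemo {
    /** Hypothetical stand-in for the unmarshaller-side interface. */
    interface Visitor {
        /** True if text other than whitespace is expected at this point. */
        boolean expectText();
        void text(CharSequence pcdata);
    }

    /** True if the sequence is empty or consists only of XML whitespace. */
    static boolean isWhitespace(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }

    /** Parser side: skip indentation early when it would be thrown away anyway. */
    static void deliver(Visitor visitor, CharSequence pcdata) {
        if (!visitor.expectText() && isWhitespace(pcdata)) {
            return;   // element-only content: ignorable whitespace
        }
        visitor.text(pcdata);
    }
}
```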

I don't know if this applies to FI; maybe it normally doesn't contain
whitespace for indentation.

-- 
Kohsuke Kawaguchi
Sun Microsystems                   kohsuke.kawaguchi_at_sun.com
