users@jaxb.java.net

Re: marshal using a pull model

From: Kohsuke Kawaguchi <kohsuke.kawaguchi_at_sun.com>
Date: Tue, 06 Dec 2005 09:55:51 -0800

Nico Snosloski wrote:
> Hello Experts,
>
> I'd like to be able to marshal a JAXB2 bound object incrementally into a
> proprietary format using a pull model, similar to the way in which an
> XMLStreamReader gives you control over the parsing of an XML document. I also have
> the added requirements of a) accessing the corresponding JAXB bound object and
> b) skipping a sub tree while processing. I'm currently using a ContentHandler
> to accomplish this, but this has the following downsides for my use case:

This is one of the things that I wish I had incorporated into the initial
design. Just as you can't wrap a SAX parser into a StAX parser (at least
not in a performant manner), we can't wrap a push marshaller into a pull
marshaller.

If you don't mind some performance loss, you can always use two threads
with Piped(Reader|Writer) (or Piped(Input|Output)Stream), but it sounds
like you do care about performance.
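
For reference, the two-thread approach can be sketched roughly as below.
This is only an illustration: the class and method names are made up, and
pushDocument() uses a plain XMLStreamWriter as a stand-in for a push-style
call such as Marshaller.marshal(obj, outputStream), so that the sketch runs
with nothing but the JDK.

```java
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

public class PipedPullDemo {

    // Stand-in for a push-style producer (assumption: in real code this
    // would be something like marshaller.marshal(obj, out)).
    static void pushDocument(PipedOutputStream out) {
        try (out) { // closing the pipe signals end-of-stream to the reader
            XMLStreamWriter w =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out, "UTF-8");
            w.writeStartDocument("UTF-8", "1.0");
            w.writeStartElement("order");
            for (int i = 0; i < 2; i++) {
                w.writeStartElement("item");
                w.writeCharacters("item-" + i);
                w.writeEndElement();
            }
            w.writeEndElement();
            w.writeEndDocument();
            w.close();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // The consumer pulls events at its own pace from the other end of the pipe.
    static List<String> pullElementNames() throws Exception {
        PipedOutputStream out = new PipedOutputStream();
        PipedInputStream in = new PipedInputStream(out);

        Thread producer = new Thread(() -> pushDocument(out));
        producer.start();

        List<String> names = new ArrayList<>();
        XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(in);
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                names.add(r.getLocalName());
            }
        }
        r.close();
        in.close();
        producer.join();
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(pullElementNames()); // prints [order, item, item]
    }
}
```

Note that the producer still serializes the whole document; the pipe only
lets the consumer read it in pull style, at the cost of a thread hand-off
and a round trip through XML text.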

It's a difficult design choice, as the model that retains execution
control seems to be faster in practice (that is, pull is faster for
unmarshalling and push is faster for marshalling), and a pull model is
necessary only in limited circumstances in the marshaller.

> 1) the entire binding hierarchy (from the root starting point) is traversed,
> which can be inefficient in my case because it forces me to generate my proprietary
> format for the entire tree while only parts of it will be used. I would like to
> traverse a sub tree of the binding hierarchy only when necessary, with the
> ability to skip a sub tree if the information within it is not needed.

I see. The other possibility for you might be to use multiple
Marshaller.marshal() method invocations and compose a bigger tree from
small pieces.
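
A rough sketch of that idea follows. The names FragmentComposer, Part and
FragmentMarshaller are hypothetical, and the per-part callback is a
stand-in: with JAXB it would presumably be a call to
marshaller.marshal(part, writer) with the Marshaller.JAXB_FRAGMENT property
set to true, so that each call emits only its own subtree. The stand-in
keeps the sketch runnable with the JDK alone.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.function.Predicate;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

public class FragmentComposer {

    // Hypothetical piece of the larger tree.
    static class Part {
        final String name;
        final String text;
        Part(String name, String text) { this.name = name; this.text = text; }
    }

    // Hypothetical stand-in for one marshal call on one subtree.
    interface FragmentMarshaller {
        void marshal(Part part, XMLStreamWriter out) throws XMLStreamException;
    }

    // The caller owns the outer element and decides, per part, whether to
    // marshal it at all. Skipped parts are never serialized.
    static String compose(List<Part> parts, Predicate<Part> wanted,
                          FragmentMarshaller m) throws XMLStreamException {
        StringWriter buf = new StringWriter();
        XMLStreamWriter out = XMLOutputFactory.newInstance().createXMLStreamWriter(buf);
        out.writeStartElement("root");
        for (Part p : parts) {
            if (wanted.test(p)) {
                m.marshal(p, out); // one "small piece" per invocation
            }
        }
        out.writeEndElement();
        out.close();
        return buf.toString();
    }

    static String demo() throws XMLStreamException {
        List<Part> parts = List.of(
            new Part("header", "h"), new Part("body", "b"), new Part("audit", "a"));
        return compose(parts, p -> !p.name.equals("audit"), (p, out) -> {
            out.writeStartElement(p.name);
            out.writeCharacters(p.text);
            out.writeEndElement();
        });
    }

    public static void main(String[] args) throws XMLStreamException {
        System.out.println(demo());
    }
}
```

The point of the design is that the traversal decision stays in your code,
so a subtree you don't need costs nothing.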

> 2) the corresponding JAXB2 object is difficult to get to from my ContentHandler
> callbacks. I'm using a beforeMarshaller callback which solves most of this
> problem, but it is not called for built-in types, so some hacks were necessary
> to overcome this.

Indeed. Matching up objects with XML infoset events is often difficult.

> Ideally, I'd like to have a tree iterator with which I could walk the JAXB2
> bound objects while skipping subtrees when necessary. I'd need to be able to
> retrieve the runtime type information at each point as well (i.e. qname,
> attribute/element). Is there something like this already available? If not, is
> my best bet to use the reflection library API? Any more example usages of this
> that could help guide me? How will the performance of this API compare to my
> existing ContentHandler usage (assuming a full traversal in both cases)?

I'm sympathetic to all the use cases you described, but they require
substantial changes to the way the marshaller works. We would also have to
investigate the performance impact that all those hooks would introduce.

It's probably possible to create an alternative marshaller
implementation by using the reflection library, with all the hooks you
need. It's not quite trivial, but hopefully it won't be too difficult,
either.
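
To give a feel for the shape of such a walker, here is a minimal sketch.
Please note the assumptions: it uses plain java.lang.reflect over fields
rather than the JAXB reflection library, it knows nothing about QNames or
the attribute/element distinction, and SkippableWalker, Visitor, Order and
Customer are made-up names. It only shows the "visit, and optionally skip
the subtree" control flow.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

public class SkippableWalker {

    // Hypothetical hook: return false to skip everything under this node.
    interface Visitor {
        boolean enter(String name, Object value);
    }

    static void walk(String name, Object value, Visitor v) throws IllegalAccessException {
        if (value == null) return;
        // Collections are flattened: each member is visited under the field name.
        if (value instanceof Iterable<?>) {
            for (Object child : (Iterable<?>) value) walk(name, child, v);
            return;
        }
        // The visitor sees the live object and decides whether to descend.
        if (!v.enter(name, value)) return;
        // Treat JDK types (String, Integer, ...) as leaves.
        if (value.getClass().getName().startsWith("java.")) return;
        for (Field f : value.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) continue;
            f.setAccessible(true);
            walk(f.getName(), f.get(value), v);
        }
    }

    // Sample object graph, for illustration only.
    static class Customer { String name = "Nico"; }
    static class Order {
        String id = "o-1";
        Customer customer = new Customer();
        List<String> items = List.of("a", "b");
    }

    // Records every visited name but refuses to descend into "customer".
    static List<String> demo() throws IllegalAccessException {
        List<String> seen = new ArrayList<>();
        walk("order", new Order(), (name, value) -> {
            seen.add(name);
            return !name.equals("customer");
        });
        return seen;
    }

    public static void main(String[] args) throws IllegalAccessException {
        System.out.println(demo());
    }
}
```

A real implementation would consult the binding metadata instead of raw
fields, but the hook signature could stay much the same.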

>
> Thanks in advance for any advice. Regards,
>
> -Nico
>


-- 
Kohsuke Kawaguchi
Sun Microsystems                   kohsuke.kawaguchi_at_sun.com