users@fi.java.net

Re: VTD speed

From: Tatu Saloranta <cowtowncoder_at_yahoo.com>
Date: Tue, 24 Jan 2006 22:53:45 -0800 (PST)

--- Paul Sandoz <Paul.Sandoz_at_Sun.COM> wrote:

...
> > I guess VTD-XML and SAX each has its pros and
> cons,
>
> Yes, that is it.

Yes, and the same goes for the various DOM-style
approaches (especially something like XOM, which "does
the right thing" without sacrificing much performance
or any convenience). Right tool for the job.

> > direct apple-to-apple comparison is hard ...
>
> Perhaps. I think it would be useful to compare for
> two cases when
> processing XML documents (with one or more
> namespaces):
>
> 1) Where the whole infoset needs to be processed,
> e.g. for binding; and
>
> 2) Where some part of the infoset needs to be
> processed e.g. executing
> some XPath expression.

Another very useful case would be simple content
replacement. If the VTD approach lends itself well to
efficient sub-tree replacement (which is definitely
possible), it would be nice to see an example that
replaces, say, all instances of a specific element and
its contents with something else, and to measure the
performance of the various approaches.
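
For reference, a minimal sketch of what the StAX side of
such a replacement test might look like, using only the
standard javax.xml.stream event API. The class name
SubtreeReplace, the method replace() and the <replaced>
element are made up for illustration, and matching on
local name only (ignoring namespaces) is a simplifying
assumption. This is not a VTD-XML example and not a tuned
benchmark.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: copies a document while swapping every element
// with the given local name (and its whole subtree) for <replaced/>.
public class SubtreeReplace {

    static String replace(String xml, String localName) throws XMLStreamException {
        XMLEventReader in =
            XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer =
            XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        int skipDepth = 0; // > 0 while inside a subtree that is being dropped
        while (in.hasNext()) {
            XMLEvent e = in.nextEvent();
            if (skipDepth > 0) {
                // Track nesting so the skip ends at the matching end tag.
                if (e.isStartElement()) skipDepth++;
                else if (e.isEndElement()) skipDepth--;
                continue;
            }
            if (e.isStartElement()
                    && e.asStartElement().getName().getLocalPart().equals(localName)) {
                skipDepth = 1;
                // Emit the replacement in place of the dropped subtree.
                writer.add(events.createStartElement("", "", "replaced"));
                writer.add(events.createEndElement("", "", "replaced"));
                continue;
            }
            writer.add(e); // everything else is copied through unchanged
        }
        writer.close();
        in.close();
        return out.toString();
    }
}
```

Note that this version re-serializes every event it does
not replace, which is exactly the cost a sub-tree
replacement on the original buffer would hope to avoid, so
it is a fair baseline for that comparison.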

> IMHO comparing just VTD parsing SAX and StAX parsing
> is not enough to
> show that VTD is faster. Because of the way VTD
> works the benchmarks
> need to do something with what is parsed. I think
> that this would better
> show VTDs (and other models) strengths and
> weaknesses.

Yes. The same goes for Xerces deferred node construction
and StAX lazy parsing (for example, if text is not
accessed during parsing, a StAX parser can avoid
allocating any non-buffer memory for the text segment).
These are good optimizations, but they may skew the
results of certain types of tests.
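
To make the point concrete, here is a rough sketch of two
StAX loops over the same input: one only counts start
tags, the other asks for the text of every text event.
The class and method names (TouchText, countElements,
countTextChars) are made up for illustration. Whether a
given parser really defers text decoding in the first loop
depends on the implementation, so treat the difference as
something to measure rather than assume.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical benchmark bodies: same parse, different amount of work requested.
public class TouchText {

    // Never asks for text, so a lazy parser may skip building it.
    static int countElements(String xml) throws XMLStreamException {
        XMLStreamReader r =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        int elements = 0;
        while (r.hasNext()) {
            if (r.next() == XMLStreamConstants.START_ELEMENT) {
                elements++;
            }
        }
        r.close();
        return elements;
    }

    // Forces the parser to hand over every text segment as a String.
    static int countTextChars(String xml) throws XMLStreamException {
        XMLStreamReader r =
            XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        int chars = 0;
        while (r.hasNext()) {
            int event = r.next();
            if (event == XMLStreamConstants.CHARACTERS
                    || event == XMLStreamConstants.CDATA
                    || event == XMLStreamConstants.SPACE) {
                chars += r.getText().length();
            }
        }
        r.close();
        return chars;
    }
}
```

A benchmark that only runs the first loop can flatter a
lazy parser; one that runs the second is closer to what a
binding layer would actually pay for.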

-+ Tatu +-

