users@jaxb.java.net

Re: best way to ignore DTD?

From: Kohsuke Kawaguchi <Kohsuke.Kawaguchi_at_Sun.COM>
Date: Tue, 27 May 2003 08:49:55 -0700

Andrew Ferguson <Andrew.Ferguson_at_arm.com> wrote:
> hi,
>
> I have an application that needs to process many xml files quickly. At the
> moment they all point to one of two DTDs on the web, one of which doesn't
> exist any more. This causes two problems
>
> 1) JAXB seems to fetch the DTD for each file processed, which slows
> everything down quite a lot
> 2) XML files that point at the non-existent DTD file fail to be parsed even
> if validation is disabled
>
> to get round this I've added the following temporary hack

...

> This works (and is quite a lot faster than letting the DTD be fetched) but
> I was wondering if there is a more straightforward way to do this?
> thanks,
> Andrew

Both behaviors are required by the XML 1.0 REC, which is why JAXB 1.0
does this.

Two options:

1) supply an entity resolver (org.xml.sax.EntityResolver) and redirect
references to remote resources to local ones.

2) if performance is critical for you and you are willing to bend the
rules of XML 1.0, then you can configure an XML parser to ignore the
DTD completely. Xerces lets you do this through a property.

Your code will look like:

    XMLReader reader = ...; // create an XML reader
    reader.setProperty( ... ); // configure it

    unmarshaller.unmarshal( new SAXSource(reader,yourInputSource) );
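
As a minimal sketch of the "configure it" step, assuming the parser is
Xerces or the Xerces-derived parser bundled with the JDK: Xerces
documents this switch as a SAX feature named
http://apache.org/xml/features/nonvalidating/load-external-dtd, which
is set with setFeature() rather than setProperty(). The class name
DtdSkippingReader is hypothetical, and other parsers may reject the
feature.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper: builds a non-validating XMLReader that does not
// load the external DTD named in a document's DOCTYPE declaration.
class DtdSkippingReader {

    // Xerces-specific feature identifier (assumption: the parser is Xerces-based).
    public static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    public static XMLReader create() throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // The feature only has an effect when validation is off.
        factory.setValidating(false);

        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Throws SAXNotRecognizedException if the parser does not know the feature.
        reader.setFeature(LOAD_EXTERNAL_DTD, false);
        return reader;
    }
}
```

The returned reader is then passed to new SAXSource(reader,
yourInputSource) as in the snippet above. With the external DTD
skipped, any entities or attribute defaults declared there are not
available to the parser.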

JAXB RI will simply use the parser you specify, so it will work as you
expect.
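
Option 1 can be wired in through the same SAXSource route. A minimal
sketch, assuming the DTD text has been loaded once into memory and is
keyed by the exact system ID used in the DOCTYPE declaration;
LocalDtdResolver, sourceFor and the map are hypothetical names, not
part of JAXB or SAX.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical resolver: answers known system IDs from local DTD text
// so the parser never has to fetch them from the network.
class LocalDtdResolver implements EntityResolver {

    private final Map<String, String> dtdBySystemId;

    public LocalDtdResolver(Map<String, String> dtdBySystemId) {
        this.dtdBySystemId = dtdBySystemId;
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = dtdBySystemId.get(systemId);
        if (dtd == null) {
            // Unknown reference: returning null lets the parser resolve it itself.
            return null;
        }
        InputSource local = new InputSource(new StringReader(dtd));
        local.setSystemId(systemId);
        return local;
    }

    // Pairs a document with a reader that uses the given resolver.
    public static SAXSource sourceFor(InputSource document, EntityResolver resolver)
            throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        return new SAXSource(reader, document);
    }
}
```

The SAXSource returned by sourceFor is what would be handed to
unmarshaller.unmarshal(...). Mapping the system ID of the DTD that no
longer exists to a local copy also lets those files parse again.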

regards,
--
Kohsuke KAWAGUCHI                  408-276-7063 (x17063)
Sun Microsystems                   kohsuke.kawaguchi_at_sun.com