users@jaxb.java.net

Re: Slow performance of Unmarshaller when reading xsd:any element

From: Iaroslav Savytskyi <iaroslav.savytskyi_at_oracle.com>
Date: Mon, 23 Dec 2013 16:39:38 +0100

Hi Pieter,

You can report the issue in our bug tracking system: https://java.net/jira/browse/JAXB

Thanks.

Best regards,
Iaroslav

On 20 Dec 2013, at 17:46, <buzing_at_riscure.com> wrote:

> Hello!
>
> Some time ago I reported an issue at the Oracle website about a
> significant performance degradation of the
> Unmarshaller.unmarshal(InputStream) method using jdk 1.7.0_04 (or
> newer). This problem only occurs when there's an "xsd:any" element.
> However, I don't think my bug report got any attention and I also think
> the site is unusable. So that's why I turn to you guys.
>
> My questions:
> - Is this a known issue and is it being worked on? (I couldn't find
> anything about this via google or in the mailing list archive)
> - Is there a work-around?
> - What is the proper way to report such issues?
>
> Below you can find a description of the bug, including an example.
> Thanks in advance for your time!
>
> Greetings,
> Pieter
> --
>
> Date Created: Thu Mar 28 08:28:09 MST 2013
> Type: bug
> Customer Name: Pieter Buzing
> Customer Email: Buzing_at_riscure.com
> SDN ID:
> status: Waiting
> Category: jaxb-xsd
> Subcategory: runtime
> Company: Riscure BV
> release: 1.0.4
> hardware: x64
> OSversion: windows_7
> priority: 3
> Synopsis: Performance degradation between java 7u3 and 7u4 for xml
> parsing of "xsd:any"
> Description:
> FULL PRODUCT VERSION :
> java version "1.7.0_04"
> Java(TM) SE Runtime Environment (build 1.7.0_04-b22)
> Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)
>
> ADDITIONAL OS VERSION INFORMATION :
> Microsoft Windows [Version 6.1.7601]
>
> A DESCRIPTION OF THE PROBLEM :
> We observe a performance degradation between java 7u3 and 7u4 with
> regard to the xml parsing of an "xsd:any" element with JAXB2. Also
> later java updates have the same poor performance.
>
> Given:
> - an xsd schema which contains an "xsd:any" element and which is
> compiled with xjc (version 2.2.4)
> - an xml file that conforms to the above xsd schema and contains (a lot
> of) elements matched by the "xsd:any" wildcard
> - a java class that unmarshals the above xml file
> - a Windows 7 computer with both jdk 7u3 and a later update installed
>
> When we run the java unmarshal code with jdk1.7.0_04\bin\java.exe (7u4)
> we observe that the required time is at least twice the runtime needed
> by jdk1.7.0_03\bin\java.exe (7u3).
>
> REGRESSION. Last worked in version 1.0.4
>
> STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
> 1. Prepare an xsd file with an xs:any element (like in the comments in
> the java code), compiling it with xjc.
> 2. Prepare a matching xml file that contains (a lot of) xs:any elements
> (see also the comments in the java file).
> 3. Compile and run java code that unmarshals the xml data (like the
> supplied example code).
> 4. Observe that running the code with jdk7u3 is at least twice as fast
> as running it with later updates.
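
For step 2, a larger input can be generated instead of written by hand. The following is only a sketch: the class name AnyPropsGenerator and its command-line arguments are hypothetical and not part of the original report. It writes a document shaped like the anyprops.xml example in the source comments, with a configurable number of <anyproperties> blocks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper (not part of the original report): builds a test
// document with the same shape as the anyprops.xml example.
public class AnyPropsGenerator {

    /** Returns an XML document containing the given number of anyproperties blocks. */
    public static String generate(int blocks) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n");
        sb.append("<Root>\n");
        for (int i = 0; i < blocks; i++) {
            sb.append("<anyproperties>\n");
            sb.append("<Prop1>").append(i).append("</Prop1>\n");
            sb.append("<Prop2>bla</Prop2>\n");
            sb.append("<Prop3>foo").append(i).append("</Prop3>\n");
            sb.append("<Prop4>a</Prop4>\n");
            sb.append("<Prop5>b</Prop5>\n");
            sb.append("<Prop6>c</Prop6>\n");
            sb.append("</anyproperties>\n");
        }
        sb.append("</Root>\n");
        return sb.toString();
    }

    // Usage: java AnyPropsGenerator <blocks> <output file>
    // Without arguments, a two-block sample is printed to standard output.
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.print(generate(2));
            return;
        }
        int blocks = Integer.parseInt(args[0]);
        byte[] bytes = generate(blocks).getBytes(StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), bytes);
        System.out.printf("wrote %d bytes to %s%n", bytes.length, args[1]);
    }
}
```

Running it with, for example, "1000 anyprops.xml" produces a file that the test class can read unchanged, which makes it easier to check how the difference scales with input size.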
>
> EXPECTED VERSUS ACTUAL BEHAVIOR :
> EXPECTED -
> The expected behavior would be an equal runtime for both java 7u3 and
> 7u4.
> ACTUAL -
> We observed a performance difference between different java runtime
> environments:
>
> java version = 1.7.0_03
> .file size = 1613 bytes, read took 36 ms
>
> java version = 1.7.0_04
> .file size = 1613 bytes, read took 75 ms
>
> When we increase the size of the xml file (or read multiple xml files)
> the relative performance difference grows. Also for later updates like
> 7u17 we see the same performance degradation compared to 7u3.
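
The figures above come from a single timed call each. A first call in a fresh JVM usually also pays one-time costs such as class loading and JIT compilation, so it may help to repeat the measurement. The following is a minimal sketch only: the class RepeatTimer and its methods are hypothetical names, and the workload in main is a placeholder rather than the unmarshal call from the report. It runs a task a few times untimed, then times repeated runs and reports the median.

```java
import java.util.Arrays;

// Hypothetical timing helper (not part of the original report).
public class RepeatTimer {

    /** Runs the task 'warmup' times untimed, then returns the duration in ns of each of 'measured' timed runs. */
    public static long[] timeRuns(Runnable task, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] durations = new long[measured];
        for (int i = 0; i < measured; i++) {
            long start = System.nanoTime();
            task.run();
            durations[i] = System.nanoTime() - start;
        }
        return durations;
    }

    /** Returns the median of the values (the upper median for an even count). */
    public static long median(long[] values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; in the real test this would wrap the
        // unmarshal call on a fresh InputStream for each run.
        Runnable task = new Runnable() {
            @Override
            public void run() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 10_000; i++) {
                    sb.append(i);
                }
            }
        };
        long[] durations = timeRuns(task, 5, 20);
        System.out.printf("first timed run = %d us, median = %d us%n",
                durations[0] / 1_000, median(durations) / 1_000);
    }
}
```

If the median over many runs still differs by a factor of two between 7u3 and 7u4, that rules out start-up cost as the explanation.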
>
> REPRODUCIBILITY :
> This bug can always be reproduced.
>
> ---------- BEGIN SOURCE ----------
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.FileNotFoundException;
> import java.io.InputStream;
> import java.util.ArrayList;
> import java.util.List;
> import javax.xml.bind.JAXBContext;
> import javax.xml.bind.JAXBException;
> import javax.xml.bind.Unmarshaller;
> import anyexample.Root;
> import anyexample.Root.Anyproperties;
>
> /**
> This code demonstrates a performance bug in java 7u4. Steps to
> replicate this bug:
>
> 1. Save and compile this xsd schema:
> <?xml version="1.0" encoding="utf-8" ?>
> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
>   <xs:element name='Root'>
>     <xs:complexType>
>       <xs:sequence>
>         <xs:element minOccurs="1" maxOccurs="unbounded" name="anyproperties">
>           <xs:complexType>
>             <xs:sequence>
>               <xs:any minOccurs="0" maxOccurs="unbounded" />
>             </xs:sequence>
>           </xs:complexType>
>         </xs:element>
>       </xs:sequence>
>     </xs:complexType>
>   </xs:element>
> </xs:schema>
>
> The code assumes that the resulting class files are stored in a
> directory called anyexample.
>
> 2. Store the following xml file as 'anyprops.xml':
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
> <Root>
> <anyproperties>
> <Prop1>0</Prop1>
> <Prop2>Inverse</Prop2>
> <Prop3>bits</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>1</Prop1>
> <Prop2>out</Prop2>
> <Prop3>bits</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>3</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>4</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>bar</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>5</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>6</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo1</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>7</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo2</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>8</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo3</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>9</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo4</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> <anyproperties>
> <Prop1>10</Prop1>
> <Prop2>bla</Prop2>
> <Prop3>foo5</Prop3>
> <Prop4>a</Prop4>
> <Prop5>b</Prop5>
> <Prop6>c</Prop6>
> </anyproperties>
> </Root>
>
> 3. Run the code below with both jdk7u3 and jdk7u4 (or later) and
> observe a significant performance difference.
> */
> public class AnyPropertiesTest {
>
>     private JAXBContext anyContext;
>     private Unmarshaller anyUnmarshaller;
>
>     public static final String PATH = "anyprops.xml";
>
>     public AnyPropertiesTest() {
>         System.out.println("java version = " +
>                 System.getProperty("java.version"));
>         try {
>             if (anyContext == null) {
>                 anyContext = JAXBContext.newInstance(Root.class);
>                 anyUnmarshaller = anyContext.createUnmarshaller();
>             }
>         } catch (JAXBException e) {
>             e.printStackTrace();
>         }
>     }
>
>     public void testAny() {
>         long startTime = System.nanoTime();
>         File inputFile = new File(PATH);
>         List<Anyproperties> propertiesList = getProperties(inputFile);
>         long duration = (System.nanoTime() - startTime) / 1_000_000;
>         String s = String.format("read %d properties in %d ms",
>                 propertiesList.size(), duration);
>         System.out.println(s);
>     }
>
>     private List<Anyproperties> getProperties(File path) {
>         List<Anyproperties> properties;
>
>         try {
>             System.out.printf("file size = %d bytes, ", path.length());
>             properties = read(new FileInputStream(path));
>         } catch (JAXBException e) {
>             System.out.println(String.format(
>                     "File '%s' could not be read by JAXB: might not be a valid XML file",
>                     path.getAbsolutePath()));
>
>             // Return an empty list so that testAny() does not hit a
>             // NullPointerException when it calls size().
>             properties = new ArrayList<Anyproperties>();
>         } catch (FileNotFoundException e) {
>             // Impossible as path was obtained from listFiles
>             String errorMessage = String.format(
>                     "File '%s' obtained by File.listFiles() does not exist",
>                     path.getAbsolutePath());
>             throw new RuntimeException(errorMessage, e);
>         }
>         return properties;
>     }
>
>     private synchronized List<Anyproperties> read(InputStream stream)
>             throws JAXBException {
>         List<Anyproperties> templates = null;
>         long startTime = System.nanoTime();
>         // It is the unmarshal method that takes a considerable
>         // amount of time on java 7u4, compared to 7u3
>         Root root = (Root) anyUnmarshaller.unmarshal(stream);
>         long duration = (System.nanoTime() - startTime) / 1_000_000;
>         System.out.printf("read took %d ms\n", duration);
>
>         templates = (List<Anyproperties>) root.getAnyproperties();
>         return templates;
>     }
>
>     public static void main(String[] argv) {
>         AnyPropertiesTest test = new AnyPropertiesTest();
>         test.testAny();
>     }
> }
>
> ---------- END SOURCE ----------
>
> CUSTOMER SUBMITTED WORKAROUND :
> In order to deal with this issue we ship our product with java 7u3.
> Upgrading to update 4 or later has a considerable performance impact
> (we use large xml files).
> The obvious solution is to avoid <xsd:any> elements, but this is not
> always feasible.