users@jaxb.java.net

Slow performance of Unmarshaller when reading xsd:any element

From: <buzing_at_riscure.com>
Date: Fri, 20 Dec 2013 16:46:28 +0000 (UTC)

Hello!

Some time ago I reported an issue on the Oracle website about a
significant performance degradation of the
Unmarshaller.unmarshal(InputStream) method with JDK 1.7.0_04 (or
newer). The problem only occurs when the schema contains an "xsd:any"
element. However, I don't think my bug report got any attention, and I
find the site unusable. That's why I turn to you guys.

My questions:
- Is this a known issue and is it being worked on? (I couldn't find
anything about this via Google or in the mailing list archive.)
- Is there a work-around?
- What is the proper way to report such issues?

Below you can find a description of the bug, including an example.
Thanks in advance for your time!

Greetings,
Pieter
--
Date Created: Thu Mar 28 08:28:09 MST 2013
Type:	     bug
Customer Name:	 Pieter Buzing
Customer Email:  Buzing_at_riscure.com
SDN ID:       
status:      Waiting
Category:    jaxb-xsd
Subcategory: runtime
Company:     Riscure BV
release:     1.0.4
hardware:    x64
OSversion:   windows_7
priority:    3
Synopsis:    Performance degradation between java 7u3 and 7u4 for xml
parsing of "xsd:any"
Description:
 FULL PRODUCT VERSION :
java version "1.7.0_04"
Java(TM) SE Runtime Environment (build 1.7.0_04-b22)
Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows [Version 6.1.7601]
A DESCRIPTION OF THE PROBLEM :
We observe a performance degradation between java 7u3 and 7u4 with
regard to the xml parsing of an "xsd:any" element with JAXB2. Later
java updates show the same poor performance.
Given:
- an xsd schema which contains an "xsd:any" element and which is
compiled with xjc (version 2.2.4)
- an xml file that conforms to the above xsd schema and contains (a lot
of) elements matched by the xsd:any wildcard
- a java class that unmarshals the above xml file
- a Windows 7 computer with both jdk 7u3 and a later update installed
When we run the java unmarshal code with jdk1.7.0_04\bin\java.exe (7u4)
we observe that the required time is at least twice the runtime needed
by jdk1.7.0_03\bin\java.exe (7u3).
REGRESSION.  Last worked in version 1.0.4
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Prepare an xsd file with an xs:any element (like in the comments in
the java code) and compile it with xjc.
2. Prepare a matching xml file that contains (a lot of) elements
matched by xs:any (see also the comments in the java file).
3. Compile and run java code that unmarshals the xml data (like the
supplied example code).
4. Observe that running the code with jdk7u3 is at least twice as fast
as running it with later updates.
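A single cold run, as in step 4, also measures one-time costs such as
class loading. To compare steady-state behavior between the two JDKs it
may help to repeat the measurement after a few warm-up runs. The helper
below is only a sketch: the class name WarmTimer and the method
medianMillis are hypothetical and not part of the original example, and
the workload in main is a stand-in so the sketch runs without JAXB on
the classpath.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of the original report.
class WarmTimer {

    // Runs the task `warmups` times untimed, then `runs` times timed, and
    // returns the middle sample in milliseconds (upper median for even `runs`).
    static long medianMillis(Callable<?> task, int warmups, int runs) throws Exception {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be at least 1");
        }
        for (int i = 0; i < warmups; i++) {
            task.call();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            samples[i] = (System.nanoTime() - start) / 1_000_000;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload (an assumption for illustration only).
        Callable<Integer> task = new Callable<Integer>() {
            @Override
            public Integer call() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 100_000; i++) {
                    sb.append(i);
                }
                return sb.length();
            }
        };
        System.out.println("median = " + medianMillis(task, 5, 11) + " ms");
    }
}
```

In the reproduction, the Callable would open a fresh FileInputStream on
anyprops.xml for every call and pass it to the unmarshaller, because a
consumed stream cannot be read again.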
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
The expected behavior would be an equal runtime for both java 7u3 and
7u4.
ACTUAL -
We observed a performance difference between different java runtime
environments:
java version = 1.7.0_03
.file size = 1613 bytes, read took 36 ms
java version = 1.7.0_04
.file size = 1613 bytes, read took 75 ms
When we increase the size of the xml file (or read multiple xml files)
the relative performance difference grows. Also for later updates like
7u17 we see the same performance degradation compared to 7u3.
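To try larger inputs without editing the xml file by hand, a small
generator can write a file with any number of anyproperties blocks. This
is a sketch only: the class AnyPropsGenerator is hypothetical and was not
used for the measurements above. It mirrors the element names of the
sample file in the source comments and assumes the output name
anyprops.xml.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original report.
class AnyPropsGenerator {

    // Builds a document with `blocks` anyproperties elements, each holding
    // six child elements that the xs:any wildcard in the schema will match.
    static String generate(int blocks) {
        if (blocks < 1) {
            // The schema declares minOccurs="1" for anyproperties.
            throw new IllegalArgumentException("blocks must be at least 1");
        }
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n");
        sb.append("<Root>\n");
        for (int i = 0; i < blocks; i++) {
            sb.append("\t<anyproperties>\n");
            sb.append("\t\t<Prop1>").append(i).append("</Prop1>\n");
            sb.append("\t\t<Prop2>bla</Prop2>\n");
            sb.append("\t\t<Prop3>foo").append(i).append("</Prop3>\n");
            sb.append("\t\t<Prop4>a</Prop4>\n");
            sb.append("\t\t<Prop5>b</Prop5>\n");
            sb.append("\t\t<Prop6>c</Prop6>\n");
            sb.append("\t</anyproperties>\n");
        }
        sb.append("</Root>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Block count from the command line; the default is arbitrary.
        int blocks = args.length > 0 ? Integer.parseInt(args[0]) : 10000;
        Path out = Paths.get("anyprops.xml");
        Files.write(out, generate(blocks).getBytes(StandardCharsets.UTF_8));
        System.out.println("wrote " + blocks + " blocks to " + out.toAbsolutePath());
    }
}
```

Running it with different block counts and then running the unmarshal
code on each JDK should show how the gap scales with file size.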
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Unmarshaller;
import anyexample.Root;
import anyexample.Root.Anyproperties;
/**
This code demonstrates a performance bug in java 7u4. Steps to
replicate this bug:
1. Save and compile this xsd schema:
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name='Root'>
	<xs:complexType>
	    <xs:sequence>
		<xs:element minOccurs="1" maxOccurs="unbounded" name="anyproperties">
		    <xs:complexType>
			<xs:sequence>
			    <xs:any minOccurs="0" maxOccurs="unbounded" />
			</xs:sequence>
		    </xs:complexType>
		</xs:element>
	    </xs:sequence>
	</xs:complexType>
    </xs:element>
</xs:schema>
The code assumes that the classes generated by xjc are in a package
called anyexample (for example, by passing -p anyexample to xjc).
2. Store the following xml file as 'anyprops.xml':
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
	<anyproperties>
		<Prop1>0</Prop1>
		<Prop2>Inverse</Prop2>
		<Prop3>bits</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>1</Prop1>
		<Prop2>out</Prop2>
		<Prop3>bits</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>3</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>4</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>bar</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>5</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>6</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo1</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>7</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo2</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>8</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo3</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>9</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo4</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
	<anyproperties>
		<Prop1>10</Prop1>
		<Prop2>bla</Prop2>
		<Prop3>foo5</Prop3>
		<Prop4>a</Prop4>
		<Prop5>b</Prop5>
		<Prop6>c</Prop6>
	</anyproperties>
</Root>
3. Run the code below with both jdk7u3 and jdk7u4 (or later) and
observe a significant performance difference.
*/
public class AnyPropertiesTest {
    private JAXBContext anyContext;
    private Unmarshaller anyUnmarshaller;
    public static final String PATH = "anyprops.xml";
    public AnyPropertiesTest() {
	System.out.println("java version = " +
System.getProperty("java.version"));
	try {
	    if (anyContext == null) {
		anyContext = JAXBContext.newInstance(Root.class);
		anyUnmarshaller = anyContext.createUnmarshaller();
	    }
	} catch (JAXBException e) {
	    e.printStackTrace();
	}
    }
    public void testAny() {
	long startTime = System.nanoTime();
	File inputFile = new File(PATH);
	List<Anyproperties> propertiesList = getProperties(inputFile);
	long duration = (System.nanoTime() - startTime) / 1_000_000;
	String s = String.format("read %d properties in %d ms",
propertiesList.size(), duration);
	System.out.println(s);
    }
    private List<Anyproperties> getProperties(File path) {
	List<Anyproperties> properties;
	try {
	    System.out.printf("file size = %d bytes, ", path.length());
	    properties = read(new FileInputStream(path));
	} catch (JAXBException e) {
	    System.out.println(String.format(
		    "File '%s' could not be read by JAXB: might not be a valid XML file",
		    path.getAbsolutePath()));
	    properties = null;
	} catch (FileNotFoundException e) {
	    // PATH is resolved against the working directory, so this
	    // happens when anyprops.xml is missing or cannot be opened
	    String errorMessage = String.format("File '%s' could not be opened",
		    path.getAbsolutePath());
	    throw new RuntimeException(errorMessage, e);
	}
	return properties;
    }
    private synchronized List<Anyproperties> read(InputStream stream)
throws JAXBException {
	List<Anyproperties> templates = null;
	long startTime = System.nanoTime();
	// It is the unmarshal method that takes a considerable amount of
	// time on java 7u4, compared to 7u3
	Root root = (Root) anyUnmarshaller.unmarshal(stream);
	long duration = (System.nanoTime() - startTime)/1_000_000;
	System.out.printf("read took %d ms\n", duration);
	
	templates = (List<Anyproperties>) root.getAnyproperties();
	return templates;
    }
	
    public static void main(String[] argv) {
	AnyPropertiesTest test = new AnyPropertiesTest();
	test.testAny();
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
In order to deal with this issue we ship our product with java 7u3.
Upgrading to update 4 or later has a considerable performance impact
(we use large xml files).
The obvious solution is to avoid <xsd:any> elements, but this is not
always feasible.