The Berkeley DB XML Package: BDB XML 2.5.16 Change Log

2.5 Release Overview

Release 2.5 is primarily a feature release with a small number of useful features including:

Automatic indexing of leaf elements and attributes
Whole Document container compression with optional user-defined compression in C++ and Java
Improvements in node storage containers that reduce total size of containers
User-defined external XQuery extension functions in C++, Java and Python
XQuery debug API in C++, Java and Python
Improvements in the XmlResults class enabling better offline results handling

Berkeley DB XML 2.5.16 Change Log

BDB XML 2.5.16 is a bug-fix release that addresses a number of issues found since the release of 2.5.13. It is source and binary compatible with earlier 2.5.x releases. This section describes changes in BDB XML relative to release 2.5.13.

Upgrade Requirements

None relative to 2.5.13. See 2.5.13 for upgrade details and recommendations.

General Functionality Changes:

Upgraded the packaged version of Berkeley DB to 4.8.26. Please see the Berkeley DB specific change log for relevant changes.
Fixed container creation so that it honors page size in XmlContainerConfig [#17803]
Fixed the base-uri of an attribute node when using WholedocContainer storage [#17872]
Fixed an assertion failure during query preparation with a recursive user defined function [#17866]
Fixed an assertion failure when an as-yet unseen URI is used in a query [#17867]
Fixed a problem where attribute indexes would not properly be updated if there were no element indexes present. This might have a symptom of DB_NOTFOUND errors or duplicate index entries for attributes [#17671]
Changed the algorithm used to create node IDs during partial update to be more efficient and create shorter node IDs in general [#17844]
Fixed a problem where deleting the XmlResults object returned by XmlValue.getAttributes() might cause an exception when the original XmlResults for the XmlValue object was next accessed [#17796]
XQuery Update queries will no longer crash when statistics are disabled [#17898]
Fixed a bug in document level indexing that could result in index entries being deleted inappropriately when a node was deleted [#17758]
Fixed fn:doc() to raise an error in all cases if the document does not exist [#17870]
Fixed a bug occurring when fn:subsequence() and "order by" were used in certain configurations [#17932]
Changed XmlResults.asEventWriter() so that only one active XmlEventWriter is allowed for an XmlResults object [#18049]

Utility Changes:

Java-specific Functionality Changes:

Deleting the XmlResults returned by XmlValue.getAttributes() will no longer cause exceptions to be thrown when accessing other XmlResults. [#17792]
Fixed a few memory leaks in the JNI code. The leaks could occur in the XmlEventWriter.writeXXX() methods, XmlDocument.getContent(), XmlEventReaderToWriter.start(), XmlResults.copyResults() and XmlResults.concatResults(). [#18049]

Python-specific Functionality Changes

The latest Python code from the pybsddb project is bundled.

Perl-specific Functionality Changes:

PHP-specific Functionality Changes:

Example Code Changes

Configuration, Documentation, Portability and Build Changes:

The zlib library will now be copied to both the release and debug directories on Windows. [#17894]

Berkeley DB XML 2.5.13 Change Log

Upgrade Requirements

Containers do not require upgrade; however, because 2.5 bundles Berkeley DB 4.8, environment directories and log files will not be compatible with previous releases. This means that checkpoint, backup and recovery procedures are necessary in order to start with a clean environment, especially for transactional environments.

New Features:

Automatic indexing of leaf elements and attributes. When a container is in the auto-indexing state, it automatically detects new leaf elements and attributes and adds string and double indexes for them (node-*-equality-string and node-*-equality-double). This feature enhances out-of-the-box query performance and can replace default indexes in most cases; default value indexes have a tendency to over-index mixed content. Newly-created containers are in this state, which is controlled by the new API outlined below; some applications may wish to disable automatic indexing immediately after creating a container. Adding new indexes can be a significant operation, so this behavior is best suited to containers that store similar documents; otherwise it must be carefully controlled by the application. When enabled, the addition of any new content can trigger the equivalent of XmlContainer::setIndexSpecification(). [#15722]
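
The detection step described above can be pictured with a small standalone sketch. It is not the library's implementation and does not use the DB XML API: it relies only on Python's standard xml.etree module, the function name suggest_auto_indexes is hypothetical, and it assumes that a "leaf element" means an element with no child elements.

```python
import xml.etree.ElementTree as ET

# Index descriptions named in the release note (node-*-equality-string and
# node-*-equality-double), expanded for elements and attributes.
ELEMENT_INDEXES = ("node-element-equality-string", "node-element-equality-double")
ATTRIBUTE_INDEXES = ("node-attribute-equality-string", "node-attribute-equality-double")


def suggest_auto_indexes(xml_text):
    """Return a sorted list of (node name, index description) pairs.

    Assumption: a leaf element is an element without child elements.
    This only illustrates the idea; DB XML's own detection logic may differ.
    """
    suggestions = set()
    for elem in ET.fromstring(xml_text).iter():
        if len(elem) == 0:  # no child elements, so treat it as a leaf
            for index in ELEMENT_INDEXES:
                suggestions.add((elem.tag, index))
        for attr_name in elem.attrib:  # every attribute is a candidate
            for index in ATTRIBUTE_INDEXES:
                suggestions.add((attr_name, index))
    return sorted(suggestions)


if __name__ == "__main__":
    sample = '<book id="7"><title>XML</title><price>9.5</price></book>'
    for name, index in suggest_auto_indexes(sample):
        print(name, index)
```

In this sketch the non-leaf element "book" gets no index, while "title", "price" and the attribute "id" each get a string and a double index. This mirrors why the note recommends the feature for containers that store similar documents: every previously unseen leaf name adds indexes.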

Data compression for whole document storage. XML documents can now be compressed using a built-in implementation based on the zlib compression library or user defined compression created by implementing the class XmlCompression. [#15471]
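
As a rough illustration of what zlib compression does to a whole XML document, the following sketch round-trips a document through Python's standard zlib module. It does not use DB XML or the XmlCompression class; the helper names compress_document and decompress_document are hypothetical, and the way DB XML actually stores compressed content may differ.

```python
import zlib


def compress_document(xml_text):
    """Compress a whole XML document (as UTF-8 bytes) with zlib."""
    return zlib.compress(xml_text.encode("utf-8"))


def decompress_document(blob):
    """Restore the original XML text from a zlib-compressed blob."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    # Repetitive markup, typical of XML, tends to compress well.
    document = "<items>" + "<item>value</item>" * 200 + "</items>"
    blob = compress_document(document)
    print(len(document.encode("utf-8")), "bytes before,", len(blob), "bytes after")
    assert decompress_document(blob) == document
```

A user-defined compression class would need to provide the same two directions, compressing on write and decompressing on read, with an algorithm of the application's choosing.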

Debugging API and command line debugger. The command line debugger can be started using the "debug" command in the DB XML shell. The debugging API is used by deriving a class from XmlDebugListener and registering it with the XmlQueryContext::setDebugListener() method. Access to the stack trace and dynamic context in each stack frame is available through the XmlStackFrame class during debugging. [#15999]

XQuery external functions. The XmlExternalFunction class allows the implementation of external XQuery functions in C++, Java and Python. Users should derive a class from XmlExternalFunction, implementing the execute() method to perform the function's action. Arguments are provided to the execute() method via the XmlArguments class. [#15610]

Enhanced results handling
- Added XmlResults::asEventWriter(), which can be called on an empty (newly-created) XmlResults object to construct a sequence of elements and/or atomic values that can be used in queries. The constructed object must be assigned to a variable to be used. [#16355]
- Added XmlResults::copyResults() and XmlResults::concatResults(), which create "transient" copies of XmlResults that can be used outside the context of a transaction or container.

Better node storage efficiency. The storage algorithm for node storage containers has been modified to provide a better btree fill factor, resulting in smaller container files.

API Changes:

Unless otherwise noted, the API additions apply to all language bindings, and all bindings use the same method name.

There are new classes for the external XQuery function implementation noted above under features. These are XmlExternalFunction and XmlArguments and are available only in C++, Java and Python.

New interfaces control automatic indexing behavior:
- XmlIndexSpecification::setAutoIndexing(bool) and XmlContainer::setAutoIndexing()
  Use these to control automatic indexing. The method on XmlIndexSpecification will only take effect once XmlContainer::setIndexSpecification() is called with the modified XmlIndexSpecification object. The method on XmlContainer is a convenience method that uses its underlying XmlIndexSpecification.
- bool XmlIndexSpecification::getAutoIndexing() and XmlContainer::getAutoIndexing()
  Returns the current state of automatic indexing for the container. The method on XmlContainer is a convenience method that returns the state from the XmlIndexSpecification.
The modified XmlIndexSpecification instance must be set on the container using XmlContainer::setIndexSpecification() in order for the state change to occur. The state is persistent and is set to true for newly-created containers.
Added XmlContainerConfig to the API. XmlContainerConfig already existed in the Java API but has been expanded and is now available in all other language bindings. XmlContainerConfig simplifies opening and creating of containers and is intended to replace the flags arguments to operations that create and open containers. The flags versions still exist for now but will eventually disappear.
Added XmlCompression as a class that can be used to create user-defined compression algorithms for wholedoc containers. This class is available only in C++ and Java
Added constructor XmlValue(typeURI, typeName, value) for creating atomic values with derived types.
Added new functions to XmlData and changed it so that it functions as an actual buffer for binary data rather than a wrapper for an existing buffer.
Added the functions XmlValue.getResults() and XmlDocument.getResults() to the Java API to return the XmlResults object (if any) associated with the current object [#16352]
Removed all finalize() functions from the Java API. They served no useful purpose and could only cause problems by running at inconvenient times [#16352]
Added XmlResults::asEventWriter() -- see description above under "New Features" [#16355]
Configuring a database as an XA-compliant resource manager using the flag DB_XA_CREATE is no longer supported because XA support has been removed from Berkeley DB [#16912]
XmlModify has been removed, XQuery Update should be used instead [#16915]
The Berkeley DB C++ objects in the C++ API have been replaced with their C equivalents. This simplifies infrastructure and improves build/linking with the non-C++ interfaces. It requires changes for C++ applications where they may have used DbEnv or DbTxn [#16951]
Added the function XmlIndexSpecification.getValueType(index) that returns the XmlValue::Type described in the given index description. [#17362]

Changes That May Require Application Modification:

C++ applications are required to change the use of the Berkeley DB C++ objects in the public interface to their C equivalents. Such changes are mechanical, replacing DbEnv * with DB_ENV * and DbTxn * with DB_TXN *, and only a few interfaces are affected. For example, XmlManager::XmlManager(DbEnv *, u_int32_t) becomes XmlManager::XmlManager(DB_ENV *, u_int32_t). Applications that still hold a DbEnv object can obtain the C handle with the method DB_ENV *DbEnv::get_DB_ENV(). Similarly, DbTxn has a method, DB_TXN *DbTxn::get_DB_TXN(), that can be used.

Automatic indexing may require changes. It is enabled by default on newly-created containers, and an application that does not want this feature must explicitly disable it after creation. Most applications will eventually want to disable this state once they are confident that all useful indexes have been added. An application that wants very explicit control over its indexes should disable it. If it is not desired at all, execute this sequence of operations immediately after creating a container (some pieces are missing and this example does not use transactions, but these are the calls):

    XmlIndexSpecification is = container.getIndexSpecification();
    is.setAutoIndexing(false);
    container.setIndexSpecification(is, updateContext);

If compression is compiled in, which is the default, wholedoc containers are compressed using zlib by default. This can be disabled using interfaces on XmlContainerConfig when creating the container. Existing containers are not affected.

While the addition of XmlContainerConfig to the non-Java APIs does not require change, it is recommended that applications move to the methods that use XmlContainerConfig, as the old interfaces will eventually be phased out.

The functions in the XmlData class that use the Dbt object have been removed, including one constructor, set_data and getDbt

Java objects that require cleanup must be cleaned up manually by calling the delete() function. Failure to clean up objects can result in memory leaks and the need to run database recovery. This has always been the case, but the removal of finalizers makes manual cleanup even more important for avoiding memory leaks.

The functions XmlContainer.addIndex and XmlIndexSpecification.addIndex will now throw an exception if passed the index types XmlValue.DAY_TIME_DURATION, XmlValue.YEAR_MONTH_DURATION, or XmlValue.UNTYPED_ATOMIC. If indexing the types XmlValue.DAY_TIME_DURATION or XmlValue.YEAR_MONTH_DURATION use XmlValue.DURATION. If indexing the type XmlValue.UNTYPED_ATOMIC use XmlValue.STRING. [#17365]
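
An application that builds index requests dynamically could normalize the type before calling addIndex. The sketch below shows only the substitution rule from the paragraph above; it uses plain strings as stand-ins for the XmlValue constants, does not call the DB XML API, and the helper name supported_index_type is hypothetical.

```python
# Substitutions stated in the release note: the two duration subtypes are
# indexed as DURATION, and UNTYPED_ATOMIC is indexed as STRING.
REPLACEMENT_INDEX_TYPES = {
    "DAY_TIME_DURATION": "DURATION",
    "YEAR_MONTH_DURATION": "DURATION",
    "UNTYPED_ATOMIC": "STRING",
}


def supported_index_type(type_name):
    """Return the index type to request in place of a rejected one.

    Type names not listed in the table are returned unchanged.
    """
    return REPLACEMENT_INDEX_TYPES.get(type_name, type_name)


if __name__ == "__main__":
    for name in ("DAY_TIME_DURATION", "UNTYPED_ATOMIC", "DOUBLE"):
        print(name, "->", supported_index_type(name))
```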

General Functionality Changes:

The release bundles Berkeley DB 4.8

Added a "--disable-rpath" option to the configure script, to facilitate building without embedding rpath information in libraries [#16607]

Fixed an uninitialized variable in NsEventWriter that could affect use of XmlEventWriter [#16459]

Fixed a bug where putting a document from one container into another would result in an empty document in the second container [#16456]

Fixed a problem where a deadlock exception in XmlEventWriter would mistakenly be reported as EINVAL and lost [#16343]

Fixed a bug where inserting a new root element into a document would not properly index the new content [#16500]

The behavior of XQuery Update (and XmlModify) was changed so that multiple document elements are no longer allowed. XQuery Update can also no longer be used to remove the document element to create an empty document. Such documents can still be created but only via XmlContainer::putDocument() and XmlContainer::updateDocument() [#16500]

Fixed a problem where text (comment, PI, text) updates that affect elements that own multiple text nodes could trigger an assertion failure or bad memory reference [#16543]

Fixed a problem where the behavior of eager and lazy results iteration was not consistent [#16484]

Fixed a bug where constructed documents could not be created from an XmlInputStream. [#16593]

XmlValues created from an empty document will no longer crash on calls to certain functions. [#16608]

XmlInputStream will no longer lose its source if the XmlDocument it came from is deleted. Also, XmlDocument.getContentAsXmlInputStream() will now always consume the content of constructed documents. [#16617]

Fixed a static initialization problem that appears on some Windows platforms related to NsNid and results in an exception during XmlManager construction. [#16565]

Fixed an assertion triggered when using a predicate against a variable containing constructed nodes. [#16556]

Fixed a problem where variable references to deleted nodes could lead to problems or incorrect behavior [#16583]

Fixed a segmentation fault that could occur if the last step of a path was a comparison or contains() function. [#16772]

The flags DBXML_ENCRYPT and DBXML_CHKSUM will no longer result in an exception when used correctly. [#16677]

Fixed an assertion failure that could happen when using numeric predicates. [#16775]

Fixed a problem where a transactional XQuery Update expression using fn:put() would fail during the transaction commit, indicating that the transaction was already committed [#16808]

Changed the close() methods in XmlEventReader and XmlEventWriter to be pure virtual. Implementors of XmlEvent* classes must implement the close() method, which may need to delete "this" in order to free the memory. [#16771]

Fixed XmlContainer::putDocument() and updateDocument() on wholedoc containers to ensure that new namespace uri prefixes are added to the dictionary. This could result in an exception during queries of read-only content or stray updates during read operations [#17212]

Fixed a problem in partial updates where a delete of a node when it has an ancestor with a presence index could result in removal of the ancestor's index, resulting in incorrect query results [#17199]

Fixed some issues in partial updates that might result in problems with indexes or missing records in the case where mixed content was being indexed and a descendent node is deleted or modified [#17226]

Improved performance of partial reindexing when inserting a new element into a node that already has a large number of child elements [#17393]

The functions XmlContainer.addIndex and XmlIndexSpecification.addIndex will now throw an exception if passed the index types XmlValue.DAY_TIME_DURATION, XmlValue.YEAR_MONTH_DURATION, or XmlValue.UNTYPED_ATOMIC. [#17365]

Fixed problems with partial updates and statistics that affected both partial update performance and query plans, resulting in lowered performance after a number of updates [#17393]

Fixed XmlManager::compactContainer() so that space made available is released to the file system on platforms that support this behavior [#17658]

Fixed an optimizer issue where certain range queries might not use an index even when one was appropriate [#17649]

Fixed a partial update scenario where indexes could get corrupted, resulting in DB_NOTFOUND errors during queries or index lookups. This could only occur when inserting multiple elements into the same parent node, and even then only intermittently [#17649]

Utility Changes:

dbxml shell subcommands that reflected the XmlModify interface have been removed

Java-specific Functionality Changes:

Fixed a bug where accessing XmlValue objects created from XQuery constructed nodes would cause a crash. [#16403]

Added the functions XmlValue.getResults() and XmlDocument.getResults() to the Java API. [#16352]

Fixed a bug where updating queries that use nodes as variables would cause the JVM to crash. [#16583]

Fixed a bug where setVariableValue in the Java API used XmlResults::size(), which lazily evaluated results do not support.

Eliminated the possibility of XmlResolver objects being garbage collected while the object is still needed. [#16595]

Python-specific Functionality Changes

The latest Python code from the pybsddb project is bundled.

Modified interfaces that can legitimately return a NULL value (in C++ or Java) to return None in Python. [#16678]

Fixed exception class constructors for XmlDatabaseError and XmlException. Arguments were out of order. [#16628]

Fixed a bug in the Python bindings for XmlEventWriter::writeText() [#16626]

Fixed a typo that made the XmlInvalidValue exception unavailable [#16711]

Perl-specific Functionality Changes:

The Perl Db module included with BDB XML still uses the Berkeley DB C++ API. This is unchanged from earlier releases, but it is now the only place in the product bundle that still uses the Berkeley DB C++ API. This does not affect other languages.

Added --perl-installdir to the buildall.sh script to allow users to change the installation directory for Perl packages

PHP-specific Functionality Changes:

Example Code Changes

C++ code uses XmlContainerConfig rather than flags

C++ examples have been rewritten to use the Berkeley DB C API where appropriate

Examples have been added to illustrate external XQuery functions in C++, Java and Python

Examples have been added to illustrate use of Wholedoc container compression in C++ and Java

Examples have been added to illustrate use of the XQuery debug API in C++, Java and Python

Examples have been added to illustrate use of Berkeley DB XML with threads and in a server in Java

Configuration, Documentation, Portability and Build Changes:

The build system for the BDB XML library on *nix now uses automake for better maintainability and portability

Fixed XmlEventReader documentation to properly indicate that empty elements will not result in an EndElement event [#17213]