users@jaxb.java.net

Re: Custom mapping for Maps in JAXB

From: Wolfgang Laun <wolfgang.laun_at_gmail.com>
Date: Fri, 18 Jun 2010 20:03:27 +0200

Folks,

I have duly noted Aleksei's opinion and Leo's contribution hinting at
Maven and POM.

It's obvious that XML defines a syntax that can be used freely; but if
you want to have a well-defined structure (e.g., according to XML
Schema) or one that can be conveniently processed with related
technologies, you'll have to make concessions. I'd like you to note the
upshot of an interesting exchange on this topic that I triggered on
another list, where highly experienced XML-ers contribute. Some
replied that <k>v</k> is "possible, but not so good"; many preferred
<entry key="k">v</entry> or similar. Below is one reply, which I quote
in full.

On 18/06/2010 07:28, Wolfgang Laun wrote:

    Every now and then, people (not me) want to represent a Map<K,V>
    in XML by using something like
       <map>
         <k1>v1</k1>
         <k2>v2</k2>
         ...
       </map>
    with ki from K and vi from V. Apart from the obvious limitation
    for K's values, I feel that this is somehow violating the spirit
    of XML. But this is not a list for XML, and I don't want to risk
    a red or yellow card.

    So, more specifically: Doesn't such a "structure" complicate the writing
    of XSLT constructs? Aren't there any statements or expressions that
    won't be usable at all? (I don't need an exhaustive list of what isn't
    possible - I'm more interested in a general judgment.)

Michael Kay wrote:
I agree: in general it's a poor way of using XML, and it makes it more
difficult to process using XSLT.

The exception is when the set (k1, k2, k3, ...) is very predictable and
unlikely to change. There's room for debate about this. A schema used
in the XMark benchmark uses the names of continents as element names
(<Asia>, <Europe>, etc). That feels wrong to me, just as it would feel
wrong to use the names of continents as Java variable names. There are
many borderline cases: should one use <home-phone>, <work-phone>, and
<mobile-phone>, or should one use <phone role="home"> etc? Probably
the latter, because it makes it easier to change the set of roles and
easier to process all phone numbers in a generic way. But in the end,
drawing the line between data and metadata is subjective.
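
To make the "generic processing" point concrete, here is a minimal
sketch using only the JDK's XPath 1.0 API (not XSLT, and not JAXB).
The two sample documents and the class name PhoneSelect are invented
for illustration. With role attributes, one fixed expression selects
every phone number; with role-specific element names, the expression
has to enumerate the names and must be edited whenever a role is
added.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PhoneSelect {
    // Hypothetical sample: the role is encoded in the element name.
    static final String BY_NAME =
        "<person><name>N</name><home-phone>111</home-phone>"
        + "<work-phone>222</work-phone><mobile-phone>333</mobile-phone></person>";

    // Hypothetical sample: the role is encoded in an attribute.
    static final String BY_ROLE =
        "<person><name>N</name><phone role='home'>111</phone>"
        + "<phone role='work'>222</phone><phone role='mobile'>333</phone></person>";

    // Must list every role-specific element name explicitly.
    static final String NAME_UNION =
        "/person/home-phone | /person/work-phone | /person/mobile-phone";

    // Stays the same no matter how many roles exist.
    static final String ALL_PHONES = "/person/phone";

    /** Returns the number of nodes that expr selects in the given document. */
    static int count(String xml, String expr) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expr, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("by name: " + count(BY_NAME, NAME_UNION));
        System.out.println("by role: " + count(BY_ROLE, ALL_PHONES));
    }
}
```

Both calls select the same three elements; the difference is only in
how much the expression has to know about the set of roles.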

The objection about XSLT processing can in principle be overcome if
the elements in the set are declared in a schema as having either a
common type or as members of a substitution group; doing this gives
you a handle in a schema-aware stylesheet to define match patterns
that match all elements in the set, or path expressions that select
them all. But fixing the set of values in a schema in some ways
compounds the error, because it makes it even more difficult to change
the value set over time as requirements evolve.

SGML old-timers at this point will start reminiscing about
architectural forms. I mention this only to point out that it's an
issue that has been around for a long time.
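
As for the "obvious limitation for K's values" mentioned in the
question quoted above: a key used as an element name must be a legal
XML name, whereas a key stored in an attribute can be any string. The
following is a minimal sketch with the JDK's DOM API rather than JAXB
itself; the class MapToXml and its method names are made up for this
illustration, and it assumes the default DOM implementation, which
checks names on createElement.

```java
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class MapToXml {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds <map><entry key="k">v</entry>...</map>; any string works as a key. */
    static Element asEntries(Map<String, String> map) {
        Document doc = newDocument();
        Element root = doc.createElement("map");
        doc.appendChild(root);
        for (Map.Entry<String, String> e : map.entrySet()) {
            Element entry = doc.createElement("entry");
            entry.setAttribute("key", e.getKey());
            entry.setTextContent(e.getValue());
            root.appendChild(entry);
        }
        return root;
    }

    /** Builds <map><k>v</k>...</map>; throws DOMException if a key is not an XML name. */
    static Element asKeyElements(Map<String, String> map) {
        Document doc = newDocument();
        Element root = doc.createElement("map");
        doc.appendChild(root);
        for (Map.Entry<String, String> e : map.entrySet()) {
            Element el = doc.createElement(e.getKey()); // the key becomes the element name
            el.setTextContent(e.getValue());
            root.appendChild(el);
        }
        return root;
    }

    /** Reports whether every key of the map can serve as an element name. */
    static boolean keyElementsPossible(Map<String, String> map) {
        try {
            asKeyElements(map);
            return true;
        } catch (DOMException ex) {
            return false;
        }
    }
}
```

A key such as "first name" (with a space) is fine for asEntries but
makes asKeyElements fail, which is the limitation in a nutshell.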

Best
-W