UIX Developer's Guide

7. Internationalization

This chapter describes how to develop international applications in UIX, including how to write UIX Components and uiXML pages that are automatically localized at runtime, how to deliver pages in the correct locale and character sets, and how to support bidirectional languages.

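This chapter contains the following sections:

  Introduction
  Localization: Translating UIX Components and uiXML
  Character Encodings
  Bidirectional Languages
  Conclusion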

Introduction

It should come as no surprise that you need to deploy your application to many users in many different languages. If you're experienced with Java, you'll know some of the good news about making servlets "NLS (National Language Support) compliant". Java internally works with Unicode strings, so string handling is generally compliant. And the Java ResourceBundle mechanism is a robust standard for accessing translated text. But servlets present a number of problems that traditional Java applications haven't had to worry about.

First, a classic Java application might support multiple Locales - but it usually only had to run against one at a time. With a web application, a single web server needs to support many users, each in their own Locale. You'll learn how to write UIX Components or uiXML pages so that a single page can run in multiple languages. You'll also learn how UIX automatically detects the user's preferred Locale, and how to override that choice.

Second, while your Java server might be happy working with nothing but Unicode strings, a user's browser can run in any number of "character encodings". A "character encoding" is a mapping of text - be it numbers, English letters, or Kanji glyphs - into a stream of bytes. While Java's Unicode support is convenient, it (or to be precise, the UTF-16 character encoding it uses) requires two or four bytes for each character, and that chews up bandwidth. The default HTML encoding, the Western European ISO-8859-1, only uses one byte for each character, but it only supports a tiny fraction of Unicode characters. We'll explain how UIX handles character set encodings for you, and how you can override its behavior if needed.
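To make the coverage limit of ISO-8859-1 concrete, here is a small sketch in plain Java. It uses only the standard java.nio.charset classes and is not part of UIX; the class and method names are made up for illustration.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a UIX class.
final class EncodingCoverage {

    /** Returns true if every character of text can be represented in ISO-8859-1. */
    static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e: Western European text fits.
        System.out.println(fitsLatin1("caf\u00e9"));     // prints true
        // Two Kanji characters: outside the ISO-8859-1 repertoire.
        System.out.println(fitsLatin1("\u6f22\u5b57"));  // prints false
    }
}
```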

Third, you'll see how to write your pages to support bidirectional languages, like Arabic or Hebrew. With UIX Components and uiXML, it's pretty easy!

Along the way, you'll learn about a few tasks that web application developers have needed to work hard to implement correctly - and that we handle automatically and transparently.

Localization: Translating UIX Components and uiXML

The UIX LocaleContext API

The oracle.cabo.share.nls.LocaleContext API is the key to localization in UIX. It contains the information you need to properly translate your pages: a Locale, the reading direction (left-to-right, or right-to-left), and a TimeZone. It also has a cache of ResourceBundle objects, saving you repeated calls to the expensive ResourceBundle.getBundle() Java API.

You can get a LocaleContext off of almost any of our other context objects. The UIX Components RenderingContext and UIX Controller BajaContext interfaces both have getLocaleContext methods. Even better, our default implementations of RenderingContext and BajaContext automatically examine the user's request and guess at the desired Locale using the Accept-Language HTTP header. So, for most applications, you don't need to lift a finger to pick a Locale; we'll find the correct one for you. A little later, you'll learn how to override our choice in case you need to, but let's see now how to take advantage of LocaleContext.
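If you're curious what guessing a Locale from the Accept-Language header involves, the following sketch shows one way to do it with the standard java.util.Locale API (Java 8 or later). It is an illustration only and makes no claim about how UIX implements its own detection; the class name, the method name, and the list of supported locales are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not a UIX class.
final class AcceptLanguageSketch {

    /** Picks the best supported locale for a header value, or the fallback. */
    static Locale pickLocale(String acceptLanguage,
                             List<Locale> supported,
                             Locale fallback) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return fallback;
        }
        try {
            // Parses entries such as "fr-CH, fr;q=0.9, en;q=0.8",
            // ordered by descending weight.
            List<Locale.LanguageRange> ranges =
                Locale.LanguageRange.parse(acceptLanguage);
            Locale match = Locale.lookup(ranges, supported);
            return match != null ? match : fallback;
        } catch (IllegalArgumentException malformedHeader) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        List<Locale> supported =
            Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN);
        System.out.println(
            pickLocale("fr-CH, fr;q=0.9, en;q=0.8", supported, Locale.ENGLISH));
    }
}
```

The lookup falls back from "fr-CH" to "fr" before trying the next entry, which is why a Swiss French user still gets the French translation here.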

Translating UIX Components

One approach to translating UIX Components is obvious. Create your UIX Components beans anew on each request, and set the correctly translated text:

  Locale locale = localeContext.getLocale();
  ResourceBundle bundle =
    ResourceBundle.getBundle(_BUNDLE_CLASS_NAME, locale);
                             
  ButtonBean button = new ButtonBean();
  button.setText(bundle.getString("buttonText"));

But this means that you either need to create the button - and every other bit of your page - on every request, or you need to save one copy for every locale. This is expensive! It'd be much better if you could create the bean once, and ask it to get the localized text on its own. UIX Components supports this, and supports it the same way it supports any dynamic output - data binding. (If you haven't read the Data Binding chapter, you might want to now.)

It all happens with a simple convenience class: oracle.cabo.ui.data.bind.BundleBoundValue. This class takes a ResourceBundle name and a key, and gives you a BoundValue that will get the translated text automatically.

  ButtonBean button = new ButtonBean();
  button.setTextBinding(new BundleBoundValue(_BUNDLE_CLASS_NAME,
                                            "buttonText"));

Done! This button will automatically get the translated text at runtime.
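The idea behind this kind of binding can be sketched in plain Java, without any UIX classes. In the sketch below, the LocalizedValue interface and the in-memory catalog are hypothetical stand-ins for BoundValue and for real ResourceBundles. The point is only that the value object is created once and the text is looked up at render time, per locale.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Concept sketch only; none of these types exist in UIX.
final class LazyTextSketch {

    /** Stand-in for a bound value: resolved when rendering, not when created. */
    interface LocalizedValue {
        String resolve(Locale locale);
    }

    /** Hypothetical catalog: language code -> (key -> translated text). */
    static final Map<String, Map<String, String>> CATALOG = new HashMap<>();
    static {
        Map<String, String> en = new HashMap<>();
        en.put("buttonText", "Submit");
        Map<String, String> fr = new HashMap<>();
        fr.put("buttonText", "Envoyer");
        CATALOG.put("en", en);
        CATALOG.put("fr", fr);
    }

    /** Creates a value that looks up key in the catalog for the given locale. */
    static LocalizedValue bundleValue(final String key) {
        return locale -> {
            Map<String, String> texts = CATALOG.get(locale.getLanguage());
            if (texts == null) {
                texts = CATALOG.get("en"); // fall back to the base language
            }
            return texts.get(key);
        };
    }

    public static void main(String[] args) {
        // Created once, then shared by every request.
        LocalizedValue buttonText = bundleValue("buttonText");
        System.out.println(buttonText.resolve(Locale.ENGLISH));
        System.out.println(buttonText.resolve(Locale.FRENCH));
    }
}
```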

Translating uiXML

Like UIX Components, one approach to translating uiXML is obvious. Create your uiXML page in English, then hand the file as a whole to a translation team. uiXML is better than a lot of file formats - JSPs, for instance, are a major nightmare to translators - but even if you can persuade translators to handle this task, you've still added a great deal of redundancy to your system. Not only do all these files have to be loaded at runtime - which wastes memory - but anytime you tweak a page, you have to tweak every copy of it. And anytime a customer wants to customize a page, that customer has to customize every copy of it. There should be a better way, and there is.

uiXML users can use an approach similar to that used by UIX Components developers, but the preferred approach uses the <dataScope> element, and a new element we'll introduce here: <bundle>.

Let's review the basics of the <dataScope> element. This element exposes sources of data to all the elements it contains. Inside its <provider> element, you add a series of <data> elements, each of which has a single "name" attribute:

 <dataScope>
   <provider>
     <data name="...">
        ... a data provider ...
     </data>

     <data name="...">
        ... a second data provider ...
     </data>
   </provider>
   <contents>
       ... Everything in here can use this data ...
   </contents>
 </dataScope>

The <bundle> element is just another data provider element, like <inline> and <method>. But <bundle> serves up ResourceBundles. It takes a single attribute: "class", which is the full Java class name of the ResourceBundle. Once defined, it exposes all of its contents as key/value pairs:


 <dataScope>
   <provider>
     <data name="nls">
       <bundle class="your.Bundle"/>
     </data>
   </provider>
   <contents>

      <button data:text="buttonText@nls"/>

   </contents>
 </dataScope>

That's all there is to it! Bind your translated text to this new "nls" data provider, and it will all be automatically displayed at runtime. Of course, you can bind to multiple ResourceBundles using multiple <bundle> elements (each inside its own <data> element). And, generally, your page will have its <dataScope> element at the very top, so you can declare your <bundle> element once for the whole page.
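For reference, a ResourceBundle class of the kind the "class" attribute points at might look like the sketch below. It assumes the bundle is written as a standard java.util.ListResourceBundle; the class name SampleBundle and its entries are placeholders, not part of UIX.

```java
import java.util.ListResourceBundle;

// Hypothetical bundle. In a real application, declare it public, in its own
// file and package, so that ResourceBundle.getBundle() can load it by name.
class SampleBundle extends ListResourceBundle {

    private static final Object[][] CONTENTS = {
        {"buttonText", "Submit"},
        {"textFormat", "From {0} to {1}"},
    };

    @Override
    protected Object[][] getContents() {
        return CONTENTS;
    }

    public static void main(String[] args) {
        System.out.println(new SampleBundle().getString("buttonText"));
    }
}
```

Following the standard Java naming convention, a French translation would live in a sibling class named SampleBundle_fr with the same keys.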

Translating DataObjects

And what about DataObjects? Your data sources, which are cleanly separated from your uiXML and UIX Components code, all get a RenderingContext when they run - so they have access to the same LocaleContext as the rest of your page. If they need to present translated text, they can do so:

public class YourDataObject implements DataObject
{
  public Object getValue(RenderingContext context, Object select)
  {
    if (select.equals(_TRANSLATED_KEY))
    {
      ResourceBundle bundle =
       context.getLocaleContext().getBundle(_BUNDLE_NAME);
      return bundle.getObject(_SOME_KEY);
    }

    ...
  }
}

Overriding the Locale

For many clients, UIX's Locale defaulting will be entirely sufficient for their needs - but you may want to override this choice. In particular, some web applications make the choice of a language a preference that users can set.

If you're developing with UIX Controller, you'll just need to add a few lines of code to your PageBroker implementation:

public class YourPageBroker extends UIXPageBroker
{
  public void requestStarted(BajaContext context)
  {
    Locale locale = ...;
    context.setLocaleContext(new LocaleContext(locale));
  }

}

Pretty simple. UIX Controller will automatically pass the locale on to the UIX Components RenderingContext for you. If you're using UIX Components directly, you'll just need to call the ServletRenderingContext.setLocaleContext() method.
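How you obtain the overriding Locale is up to your application. As one possible sketch, assuming the user's preference is stored as a language tag such as "pt-BR" (an assumption, as are the class and method names), you could turn it into a Locale with the standard Java API and keep the detected Locale when no usable preference exists:

```java
import java.util.Locale;

// Hypothetical helper, not a UIX class.
final class LocalePreferenceSketch {

    /** Returns the user's preferred locale, or the detected one if unset. */
    static Locale chooseLocale(String preferredTag, Locale detected) {
        if (preferredTag == null || preferredTag.trim().isEmpty()) {
            return detected;
        }
        Locale preferred = Locale.forLanguageTag(preferredTag.trim());
        // forLanguageTag yields an empty language for tags it cannot parse.
        return preferred.getLanguage().isEmpty() ? detected : preferred;
    }

    public static void main(String[] args) {
        System.out.println(chooseLocale("pt-BR", Locale.ENGLISH).toLanguageTag());
        System.out.println(chooseLocale(null, Locale.ENGLISH).toLanguageTag());
    }
}
```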

Multilingual Pages

Every UIX Components page runs in only a single Locale. But this doesn't prevent you from displaying strings in languages other than the default. UIX Components will automatically escape characters for you so they can be correctly read by the browser.

There is one very important caveat. You have to use a browser that can properly display multilingual pages - Netscape 4 is very poor at this, for instance.
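The escaping mentioned above is commonly done with HTML numeric character references. The sketch below shows the general technique in plain Java; it is not UIX's implementation, and the class and method names are made up. Characters that the page's encoding cannot represent are written as &#NNNN; references, which the browser turns back into the original characters.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a UIX class.
final class CharacterReferenceSketch {

    /**
     * Replaces characters that charset cannot encode with numeric character
     * references. Markup characters such as '<' and '&' are left untouched.
     */
    static String escapeUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String ch = new String(Character.toChars(codePoint));
            if (encoder.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append("&#").append(codePoint).append(';');
            }
            i += Character.charCount(codePoint);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // An accented e (encodable in ISO-8859-1) and one Kanji (not encodable).
        String mixed = "caf\u00e9 \u6f22";
        System.out.println(escapeUnencodable(mixed, StandardCharsets.ISO_8859_1));
        System.out.println(escapeUnencodable(mixed, StandardCharsets.US_ASCII));
    }
}
```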

Message Formatting

UIX Components and uiXML both offer integrated support for the java.text.MessageFormat class - and a faster replacement that we supply, oracle.cabo.share.util.FastMessageFormat.

These classes are, again, all built around the data binding architecture and the BoundValue interface. In UIX Components, we supply the oracle.cabo.ui.data.bind.MessageFormatBoundValue class. This class will dynamically get the message format mask from one BoundValue, and get each object to substitute into that mask from an array of BoundValues. For an example, suppose our ResourceBundle has one entry:

  ...
  {"textFormat", "From {0} to {1}"}
  ...

Then this code will automatically grab two strings and insert them into that format mask - only it will get the correct format for the current locale:

  StyledTextBean styledText = new StyledTextBean();
  // Get the format mask from a ResourceBundle
  BoundValue formatMask = new BundleBoundValue(_BUNDLE_CLASS_NAME,
                                               "textFormat");
  // And get each of the two pieces that go into the mask
  // from DataObjects:
  BoundValue firstEntry = new DataBoundValue("firstSelectKey");
  BoundValue secondEntry = new DataBoundValue("secondSelectKey");
  
  // And wrap it all up in a MessageFormatBoundValue
  styledText.setTextBinding(new MessageFormatBoundValue(
                              formatMask,
                              new BoundValue[]{firstEntry,
                                               secondEntry}));

This example is starting to get a bit complicated. But it is powerful!

You can get at the same functionality from inside uiXML using the <messageFormat> element:


 <dataScope>
   <provider>
     <data name="nls">
       <bundle class="your.Bundle"/>
     </data>
   </provider>
   <contents>
     <styledText>
       <boundAttribute name="text">
         <messageFormat data:format="textFormat@nls">
           <dataObject select="firstSelectKey"/>
           <dataObject select="secondSelectKey"/>
         </messageFormat>
       </boundAttribute>
     </styledText>
   </contents>
 </dataScope>

That's a lot of text for a small bit of functionality, but the key piece to focus on here is the <boundAttribute> element and everything inside it.
  1. Recall that if you want to bind an attribute to something other than a DataObject, you need to use <boundAttribute>.
  2. The <messageFormat> element defines the general bound value. We've chosen to get the format dynamically from the resource bundle, though you can also set the "format" attribute directly.
  3. The two children each define their own BoundValues, each of which gets one piece of data to merge into the message format.

As for that FastMessageFormat class we mentioned at the top of this section: this is the formatting class we use by default. It only allows for simple index-based replacement, namely:

    some{1}text{0}here{2}andthere

...as well as escaping using single quotes. The Java built-in formatting class is very expensive, and should be avoided if possible, but both the MessageFormatBoundValue class and the UIX <messageFormat> object offer "fast" attributes. Set "fast" to false, and we'll use the more feature-rich, but slower implementation.
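The two features described here, index-based replacement and single-quote escaping, behave the same way in the standard java.text.MessageFormat class, so they can be tried out without UIX. The sketch below uses only that JDK class; the class name and sample strings are illustrative.

```java
import java.text.MessageFormat;

// Illustration using the JDK class only; FastMessageFormat is not used here.
final class MessageFormatSketch {

    /** Substitutes from and to into the {0} and {1} slots of pattern. */
    static String fromTo(String pattern, String from, String to) {
        return MessageFormat.format(pattern, from, to);
    }

    public static void main(String[] args) {
        // The format mask from the ResourceBundle example.
        System.out.println(fromTo("From {0} to {1}", "Paris", "Rome"));
        // prints: From Paris to Rome

        // A translated mask is free to reorder the arguments.
        System.out.println(fromTo("{1} <- {0}", "Paris", "Rome"));
        // prints: Rome <- Paris

        // Single quotes escape a placeholder so it is printed literally.
        System.out.println(
            MessageFormat.format("Use '{0}' to insert {0}", "Paris"));
        // prints: Use {0} to insert Paris
    }
}
```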

Character Encodings

As explained earlier, character encodings define the mapping of text into a stream of bytes.

If you use UIX Components or uiXML alone, without UIX Controller, you'll need to handle a few extra tasks on your own. You'll need to ask the ServletResponse for a Writer, and it's your responsibility to tell the Servlet API what encoding to use. It's also your responsibility to make sure the ServletRequest parameters are properly decoded for the correct character set.
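To see why request decoding matters, the sketch below decodes the same URL-encoded parameter value twice using the standard java.net.URLDecoder class: once with the encoding the browser actually used, and once with the wrong one. This only illustrates the problem; it is not how UIX or any particular servlet engine decodes parameters, and the class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, not a UIX class.
final class ParameterDecodingSketch {

    /** Decodes a URL-encoded value using the named character encoding. */
    static String decode(String rawValue, String encoding) {
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
        }
    }

    public static void main(String[] args) {
        // One Kanji character (U+6F22) as a browser would submit it in UTF-8.
        String raw = "%E6%BC%A2";

        // Right encoding: a single character comes back.
        System.out.println(decode(raw, "UTF-8").length());       // prints 1

        // Wrong encoding: each byte becomes its own, unrelated character.
        System.out.println(decode(raw, "ISO-8859-1").length());  // prints 3
    }
}
```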

If you're using UIX Controller, though, UIX can automate all of this. UIX's preferred, default character encoding is "UTF-8". This is a particular Unicode encoding. Like UTF-16 (the encoding used for Java strings), it supports all the characters of the Unicode character set. Unlike UTF-16, it doesn't use two or four bytes for every character. Instead, it uses between one and four bytes per character, and only one byte for ASCII characters (0 to 127). Since the content of HTML pages is largely ASCII - because all HTML tags are ASCII - this encoding uses substantially less space.
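The size difference is easy to measure with the standard Java charset classes. The following sketch counts the bytes needed for a few sample strings in each encoding; the class name and samples are illustrative, and UTF-16BE is used so that no byte-order mark is added to the count.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a UIX class.
final class EncodingSizeSketch {

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    static int utf16Length(String s) {
        // Big-endian UTF-16 without a byte-order mark.
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String markup = "<b>Hello</b>";      // 12 ASCII characters
        String kanji = "\u6f22";             // one Kanji character
        String emoji = "\ud83d\ude00";       // one character outside the BMP

        System.out.println(utf8Length(markup) + " vs " + utf16Length(markup)); // prints 12 vs 24
        System.out.println(utf8Length(kanji) + " vs " + utf16Length(kanji));   // prints 3 vs 2
        System.out.println(utf8Length(emoji) + " vs " + utf16Length(emoji));   // prints 4 vs 4
    }
}
```

Individual non-ASCII characters can cost more in UTF-8 than in UTF-16, but the ASCII markup surrounding them usually dominates the page.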

To override this default in UIX Controller, you'll just need to add a few lines of code to your PageBroker implementation:

public class YourPageBroker extends UIXPageBroker
{
  public void requestStarted(BajaContext context)
  {
    String desiredEncoding = ...;
    context.getPageDecoder().setRequestCharacterEncoding(desiredEncoding);
  }
}

Again, that's it. It's critical that you make this call inside requestStarted().

One caveat: some servlet engines support automatically decoding the servlet request. For example, the Servlet 2.3 spec defines a ServletRequest.setCharacterEncoding() method. If you're using a Servlet 2.3 engine (like Tomcat 4.0), you should change that code to:

public class YourPageBroker extends UIXPageBroker
{
  // Servlet 2.3 code only!
  public void requestStarted(BajaContext context)
  {
    String desiredEncoding = ...;
    context.getServletRequest().setCharacterEncoding(desiredEncoding);
  }
}

And some servlet engines let you explicitly set the encoding with proprietary extensions of the servlet specification - OJSP for example. If you're using one of these engines, you should still override requestStarted() and call setRequestCharacterEncoding() - but pass in null, which will signal UIX Controller not to decode a second time.

It's unfortunate that this has to be complicated, but until the Servlet 2.3 specification provides consistent behavior across browsers, there's little that can be done to automatically account for the nonstandard behavior of servlet engines.

But once a character encoding has been set, all the following things happen automatically:

Note that to see the decoded characters correctly, you must use PageEvent, not ServletRequest. In a Servlet 2.3 engine, the ServletRequest will be correct, but you should still use PageEvent. It also includes parameters submitted with file upload, and performs additional decoding needed for UIX's mobile device support. And, perhaps most importantly, it's a layer of abstraction that can receive further tweaks and enhancements.

Bidirectional Languages

Using the UIX Components stack greatly simplifies working with bidirectional languages such as Arabic and Hebrew. We do require that the user's browser support bidirectional languages. Internet Explorer 5 does, for instance, as do Netscape 6 and Mozilla.

UIX Components gets the default "reading direction" from LocaleUtils, using its getReadingDirection() method. LocaleUtils in turn defaults the reading direction from the locale - Arabic and Hebrew are right-to-left, all other languages are left-to-right. The signal to a browser that a page should be drawn right-to-left is the dir="rtl" attribute on the <HTML> tag. If you're using uiXML through the UIX Controller, we handle this for you. If you're using UIX Components or uiXML directly, you can use our DocumentBean or <document> element to handle this.

Once the page has been notified of direction, all you'll need to do is consistently use the correct horizontal alignment constants. In addition to the usual three supported by HTML - "left", "right", and "center" - UIX Components supports "start" and "end". These will act like "left" and "right" by default, but flip automatically in right-to-left mode. All the built-in UIX Components CSS styles will automatically flip for you, thanks to UIX Styles's XSS technology.
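The flipping rule for "start" and "end" can be written down in a few lines. The sketch below is a plain-Java illustration of the behavior described above, not the code UIX uses; the class and method names are hypothetical, and the right-to-left test covers only Arabic and Hebrew, as stated in the text.

```java
import java.util.Locale;

// Hypothetical helper, not a UIX class.
final class AlignmentSketch {

    /** Arabic and Hebrew read right-to-left; everything else left-to-right. */
    static boolean isRightToLeft(Locale locale) {
        String language = locale.getLanguage();
        // Older JDKs report Hebrew under its legacy code "iw".
        return language.equals("ar")
            || language.equals("he")
            || language.equals("iw");
    }

    /** Maps "start" and "end" to "left" or "right"; other values pass through. */
    static String resolveAlign(String align, Locale locale) {
        boolean rtl = isRightToLeft(locale);
        if ("start".equals(align)) {
            return rtl ? "right" : "left";
        }
        if ("end".equals(align)) {
            return rtl ? "left" : "right";
        }
        return align;
    }

    public static void main(String[] args) {
        System.out.println(resolveAlign("start", Locale.ENGLISH));               // prints left
        System.out.println(resolveAlign("start", Locale.forLanguageTag("ar")));  // prints right
        System.out.println(resolveAlign("center", Locale.forLanguageTag("ar"))); // prints center
    }
}
```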

If you're extending our built-in XSS (see Customization), you should be careful to use the StartTextAlign and EndTextAlign properties instead of explicitly specifying text-align:

<styleSheetDocument xmlns="http://xmlns.oracle.com/uix/style">

  <import href="blaf.xss"/>

  <styleSheet>
    <!-- Bad -->
    <style name="YourBadStyle">
      <property name="text-align">right</property>
       ....
    </style>

    <!-- Good -->
    <style name="YourGoodStyle">
      <includeStyle name="EndTextAlign"/>
       ....
    </style>

  </styleSheet>

</styleSheetDocument>

Conclusion

While UIX handles many internationalization tasks for you, there are some that you still have to handle yourself:

And of course, you still have to write your ResourceBundles and get them translated.