javax.mail.internet
Class MimeUtility

java.lang.Object
  |
  +--javax.mail.internet.MimeUtility

public class MimeUtility
extends java.lang.Object

This is a utility class that provides various MIME related functionality.

There is a set of methods to encode and decode MIME headers as per RFC 2047. A brief description of how such headers are handled is given below:

RFC 822 mail headers must contain only US-ASCII characters. Headers that contain non US-ASCII characters must be encoded so that they contain only US-ASCII characters. This process uses either the BASE64 or the quoted-printable (QP) encoding for the affected characters. RFC 2047 describes this in detail.

In Java, Strings contain (16-bit) Unicode characters. ASCII is a subset of Unicode (and occupies the range 0 - 127). A String that contains only ASCII characters is already mail-safe. If the String contains non US-ASCII characters, it must be encoded. An additional complexity in this step is that, since Unicode is not yet a widely used charset, one might want to first charset-encode the String into another charset and then do the transfer-encoding.
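
The two steps described above (charset-encode, then transfer-encode) can be sketched with the JDK alone. This is only an illustration, not the MimeUtility implementation: the class EncodedWordSketch and its methods are hypothetical names, the sketch always uses the "B" encoding, and it ignores the length limit that RFC 2047 places on an encoded-word.

```java
import java.nio.charset.Charset;
import java.util.Base64;

// Hypothetical helper for illustration; not part of JavaMail.
final class EncodedWordSketch {

    // True if every char lies in the US-ASCII range 0 - 127.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    // Step 1: charset-encode the String into bytes.
    // Step 2: transfer-encode those bytes with BASE64 and wrap them
    // in the RFC 2047 "=?charset?B?data?=" form.
    static String encodeB(String text, Charset charset) {
        if (isAscii(text)) {
            return text; // already mail-safe, returned as-is
        }
        byte[] raw = text.getBytes(charset);
        String b64 = Base64.getEncoder().encodeToString(raw);
        return "=?" + charset.name() + "?B?" + b64 + "?=";
    }
}
```

Calling EncodedWordSketch.encodeB(text, StandardCharsets.UTF_8) leaves an ASCII-only String untouched and wraps anything else in a single encoded-word.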

Note that to get the actual bytes of a mail-safe String (say, for sending over SMTP), one must do


	byte[] bytes = string.getBytes("iso-8859-1");	

 

The setHeader() and addHeader() methods on MimeMessage and MimeBodyPart assume that the given header values are Unicode strings that contain only US-ASCII characters. Hence the callers of those methods must ensure that the values they pass do not contain non US-ASCII characters. The methods in this class help do this.

The getHeader() family of methods on MimeMessage and MimeBodyPart returns raw header values. These values might be encoded as per RFC 2047 and, if so, must be decoded into Unicode Strings. The methods in this class help to do this.


Field Summary
static int ALL
           
 
Method Summary
static java.io.InputStream decode(java.io.InputStream is, java.lang.String encoding)
          Decode the given input stream.
static java.lang.String decodeText(java.lang.String etext)
          Decode "unstructured" headers, that is, headers that are defined as '*text' as per RFC 822.
static java.lang.String decodeWord(java.lang.String eword)
          The string is parsed using the rules in RFC 2047 for parsing an "encoded-word".
static java.io.OutputStream encode(java.io.OutputStream os, java.lang.String encoding)
          Wrap an encoder around the given output stream.
static java.lang.String encodeText(java.lang.String text)
          Encode an RFC 822 "text" token into mail-safe form as per RFC 2047.
static java.lang.String encodeText(java.lang.String text, java.lang.String charset, java.lang.String encoding)
          Encode an RFC 822 "text" token into mail-safe form as per RFC 2047.
static java.lang.String encodeWord(java.lang.String word)
          Encode an RFC 822 "word" token into mail-safe form as per RFC 2047.
static java.lang.String encodeWord(java.lang.String word, java.lang.String charset, java.lang.String encoding)
          Encode an RFC 822 "word" token into mail-safe form as per RFC 2047.
static java.lang.String getDefaultJavaCharset()
          Get the default charset corresponding to the system's current default locale.
static java.lang.String getEncoding(DataSource ds)
          Get the content-transfer-encoding that should be applied to the input stream of this datasource, to make it mail-safe.
static java.lang.String javaCharset(java.lang.String charset)
          Convert a MIME charset name into a valid Java charset name.
static java.lang.String mimeCharset(java.lang.String charset)
          Convert a Java charset into its MIME charset name.
static java.lang.String quote(java.lang.String word, java.lang.String specials)
          A utility method to quote a word, if the word contains any characters from the specified 'specials' list.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ALL

public static final int ALL
Method Detail

getEncoding

public static java.lang.String getEncoding(DataSource ds)
Get the content-transfer-encoding that should be applied to the input stream of this datasource, to make it mail-safe.

The algorithm used here is:

Parameters:
ds - DataSource
Returns:
the encoding. This is either "7bit", "quoted-printable" or "base64"

decode

public static java.io.InputStream decode(java.io.InputStream is,
                                         java.lang.String encoding)
                                  throws MessagingException
Decode the given input stream. The input stream returned is the decoded input stream. All the encodings defined in RFC 2045 are supported here. They include "base64", "quoted-printable", "7bit", "8bit", and "binary". In addition, "uuencode" is also supported.
Parameters:
is - input stream
encoding - the encoding of the stream.
Returns:
decoded input stream.
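
The idea of a decoding wrapper around an input stream can be illustrated with the JDK's own java.util.Base64 class. This sketch does not call MimeUtility.decode(); it only shows the same wrapping pattern for the "base64" case, and the class DecodeStreamSketch and its method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; uses the JDK, not JavaMail.
final class DecodeStreamSketch {

    // Reads base64 text through a decoding stream and returns the
    // decoded bytes interpreted as UTF-8.
    static String readBase64(String encoded) {
        InputStream raw = new ByteArrayInputStream(
                encoded.getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        // The wrapper decodes on the fly as bytes are read from it.
        try (InputStream decoded = Base64.getMimeDecoder().wrap(raw)) {
            byte[] buf = new byte[256];
            int n;
            while ((n = decoded.read(buf)) != -1) {
                plain.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(plain.toByteArray(), StandardCharsets.UTF_8);
    }
}
```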

encode

public static java.io.OutputStream encode(java.io.OutputStream os,
                                          java.lang.String encoding)
                                   throws MessagingException
Wrap an encoder around the given output stream. All the encodings defined in RFC 2045 are supported here. They include "base64", "quoted-printable", "7bit", "8bit" and "binary". In addition, "uuencode" is also supported.
Parameters:
os - output stream
encoding - the encoding of the stream.
Returns:
output stream that applies the specified encoding.
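
The mirror image, an encoder wrapped around an output stream, can be sketched the same way with java.util.Base64. Again this is a JDK-only illustration of the pattern for "base64", not a call to MimeUtility.encode(); EncodeStreamSketch and writeBase64 are hypothetical names. Note that the wrapper has to be closed so that the final padding is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration; uses the JDK, not JavaMail.
final class EncodeStreamSketch {

    // Writes the UTF-8 bytes of text through an encoding stream and
    // returns the base64 characters that reached the underlying sink.
    static String writeBase64(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Closing the wrapper flushes the last group and its padding.
        try (OutputStream encoder = Base64.getMimeEncoder().wrap(sink)) {
            encoder.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```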

encodeText

public static java.lang.String encodeText(java.lang.String text)
                                   throws java.io.UnsupportedEncodingException
Encode an RFC 822 "text" token into mail-safe form as per RFC 2047.

The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the platform's default charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.

Note that this method should be used to encode only "unstructured" RFC 822 headers.

Example of usage:


  MimePart part = ...
  String rawvalue = "FooBar Mailer, Japanese version 1.1";
  try {
    // If we know for sure that rawvalue contains only US-ASCII 
    // characters, we can skip the encoding part
    part.setHeader("X-mailer", MimeUtility.encodeText(rawvalue));
  } catch (UnsupportedEncodingException e) {
    // encoding failure
  } catch (MessagingException me) {
    // setHeader() failure
  }

 

Parameters:
text - Unicode string
Returns:
Unicode string containing only US-ASCII characters
Throws:
java.io.UnsupportedEncodingException - if the encoding fails

encodeText

public static java.lang.String encodeText(java.lang.String text,
                                          java.lang.String charset,
                                          java.lang.String encoding)
                                   throws java.io.UnsupportedEncodingException
Encode an RFC 822 "text" token into mail-safe form as per RFC 2047.

The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the specified charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.

Note that this method should be used to encode only "unstructured" RFC 822 headers.

Parameters:
text - the header value
charset - the charset. If this parameter is null, the platform's default charset is used.
encoding - the encoding to be used. Currently supported values are "B" and "Q". If this parameter is null, then the "Q" encoding is used if most of the characters to be encoded are in the ASCII charset, otherwise the "B" encoding is used.
Returns:
Unicode string containing only US-ASCII characters
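
The choice made when the encoding parameter is null can be pictured as a simple count. The sketch below is an assumption-laden illustration, not the library's code: it reads "most" as "more than half", and the class EncodingChoiceSketch and its method are hypothetical names.

```java
// Hypothetical helper for illustration; not part of JavaMail.
final class EncodingChoiceSketch {

    // Returns "Q" when more than half of the characters are ASCII,
    // otherwise "B". The "more than half" threshold is an assumption.
    static String choose(String text) {
        int ascii = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) < 128) {
                ascii++;
            }
        }
        return (ascii * 2 > text.length()) ? "Q" : "B";
    }
}
```

The rationale is that "Q" keeps mostly-ASCII text readable in its encoded form, while "B" is more compact when most characters need encoding.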

decodeText

public static java.lang.String decodeText(java.lang.String etext)
                                   throws java.io.UnsupportedEncodingException
Decode "unstructured" headers, that is, headers that are defined as '*text' as per RFC 822.

The string is decoded using the algorithm specified in RFC 2047, Section 6.1.1. If the charset-conversion fails for any sequence, an UnsupportedEncodingException is thrown. If the String is not an RFC 2047 style encoded header, it is returned as-is.

Example of usage:


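  MimePart part = ...
  String rawvalue = null;
  String value = null;
  try {
    // getHeader() returns null if the header is not present
    String[] headers = part.getHeader("X-mailer");
    if (headers != null && headers.length > 0) {
      rawvalue = headers[0];
      value = MimeUtility.decodeText(rawvalue);
    }
  } catch (UnsupportedEncodingException e) {
    // Charset conversion failed; fall back to the raw value
    value = rawvalue;
  } catch (MessagingException me) {
    // getHeader() failure; value stays null
  }

  return value;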

 

Parameters:
etext - the possibly encoded value
Throws:
java.io.UnsupportedEncodingException - if the charset conversion failed.

encodeWord

public static java.lang.String encodeWord(java.lang.String word)
                                   throws java.io.UnsupportedEncodingException
Encode an RFC 822 "word" token into mail-safe form as per RFC 2047.

The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the platform's default charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.

This method is meant to be used when creating RFC 822 "phrases". The InternetAddress class, for example, uses this to encode its 'phrase' component.

Parameters:
word - Unicode string
Returns:
Unicode string containing only US-ASCII characters
Throws:
java.io.UnsupportedEncodingException - if the encoding fails

encodeWord

public static java.lang.String encodeWord(java.lang.String word,
                                          java.lang.String charset,
                                          java.lang.String encoding)
                                   throws java.io.UnsupportedEncodingException
Encode an RFC 822 "word" token into mail-safe form as per RFC 2047.

The given Unicode string is examined for non US-ASCII characters. If the string contains only US-ASCII characters, it is returned as-is. If the string contains non US-ASCII characters, it is first character-encoded using the specified charset, then transfer-encoded using either the B or Q encoding. The resulting bytes are then returned as a Unicode string containing only ASCII characters.

Parameters:
word - Unicode string
charset - the MIME charset
encoding - the encoding to be used. Currently supported values are "B" and "Q". If this parameter is null, then the "Q" encoding is used if most of the characters to be encoded are in the ASCII charset, otherwise the "B" encoding is used.
Returns:
Unicode string containing only US-ASCII characters
Throws:
java.io.UnsupportedEncodingException - if the encoding fails

decodeWord

public static java.lang.String decodeWord(java.lang.String eword)
                                   throws ParseException,
                                          java.io.UnsupportedEncodingException
The string is parsed using the rules in RFC 2047 for parsing an "encoded-word". If the parse fails, a ParseException is thrown. Otherwise, it is transfer-decoded, and then charset-converted into Unicode. If the charset-conversion fails, an UnsupportedEncodingException is thrown.
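
The parse, transfer-decode, charset-convert sequence can be sketched with the JDK alone. This is a simplified illustration, not the MimeUtility implementation: DecodeWordSketch is a hypothetical class, it signals both parse and charset failures with unchecked IllegalArgumentException subclasses instead of ParseException and UnsupportedEncodingException, and it accepts only a single, complete encoded-word.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of JavaMail.
final class DecodeWordSketch {

    // "=?" charset "?" encoding "?" encoded-text "?="
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?([BbQq])\\?([^?]*)\\?=");

    static String decodeWord(String eword) {
        Matcher m = ENCODED_WORD.matcher(eword);
        if (!m.matches()) {
            // Stands in for ParseException in this sketch.
            throw new IllegalArgumentException("not an encoded-word: " + eword);
        }
        // Throws an IllegalArgumentException subclass for unknown charsets.
        Charset charset = Charset.forName(m.group(1));
        String payload = m.group(3);
        // Transfer-decode first, then convert the bytes to Unicode.
        byte[] raw = m.group(2).equalsIgnoreCase("B")
                ? Base64.getDecoder().decode(payload)
                : decodeQ(payload);
        return new String(raw, charset);
    }

    // "Q" decoding: '_' stands for a space, "=XX" for the byte 0xXX.
    private static byte[] decodeQ(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }
}
```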

Parameters:
eword - the possibly encoded value
Throws:
ParseException - if the string is not an encoded-word as per RFC 2047.
java.io.UnsupportedEncodingException - if the charset conversion failed.

quote

public static java.lang.String quote(java.lang.String word,
                                     java.lang.String specials)
A utility method to quote a word, if the word contains any characters from the specified 'specials' list.

The HeaderTokenizer class defines two special sets of delimiters - MIME and RFC 822.

This method is typically used during the generation of RFC 822 and MIME header fields.
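
A minimal sketch of the quoting idea follows. It is not the MimeUtility implementation and covers only part of what a real quoting routine must handle: QuoteSketch is a hypothetical class, and the rules it applies (quote when the word is empty or contains a special, backslash-escape embedded quotes and backslashes) are assumptions made for illustration.

```java
// Hypothetical helper for illustration; not part of JavaMail.
final class QuoteSketch {

    // Returns word unchanged unless it is empty, contains a character
    // from specials, or contains '"' or '\', in which case it is
    // wrapped in double quotes with '"' and '\' backslash-escaped.
    static String quote(String word, String specials) {
        boolean needsQuotes = word.isEmpty();
        StringBuilder sb = new StringBuilder("\"");
        for (char c : word.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\'); // escape before copying the char
                needsQuotes = true;
            } else if (specials.indexOf(c) >= 0) {
                needsQuotes = true;
            }
            sb.append(c);
        }
        return needsQuotes ? sb.append('"').toString() : word;
    }
}
```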

Parameters:
word - word to be quoted
specials - the set of special characters
Returns:
the possibly quoted word
See Also:
HeaderTokenizer.MIME, HeaderTokenizer.RFC822

javaCharset

public static java.lang.String javaCharset(java.lang.String charset)
Convert a MIME charset name into a valid Java charset name.

Parameters:
charset - the MIME charset name
Returns:
the Java charset equivalent. If a suitable mapping is not available, the passed in charset is itself returned.
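
The "return the input when no mapping exists" contract can be illustrated with the JDK's own alias resolution. This swaps in java.nio.charset.Charset lookup for whatever mapping MimeUtility uses internally, so it is an analogy rather than the same behaviour; CharsetNameSketch and canonicalName are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical helper for illustration; uses the JDK, not JavaMail.
final class CharsetNameSketch {

    // Resolves a charset name or alias to the JDK's canonical name.
    // If the JDK does not know the name, the input is returned as-is.
    static String canonicalName(String name) {
        try {
            return Charset.forName(name).name();
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return name;
        }
    }
}
```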

mimeCharset

public static java.lang.String mimeCharset(java.lang.String charset)
Convert a Java charset into its MIME charset name.

Note that a future version of the JDK (post 1.2) might provide this functionality, in which case this method may be deprecated.

Parameters:
charset - the JDK charset
Returns:
the MIME/IANA equivalent. If a mapping is not possible, the passed in charset itself is returned.
Since:
JavaMail 1.1

getDefaultJavaCharset

public static java.lang.String getDefaultJavaCharset()
Get the default charset corresponding to the system's current default locale.

Returns:
the default charset of the system's default locale, as a Java charset. (NOT a MIME charset)
Since:
JavaMail 1.1