users@jersey.java.net

Re: [Jersey] Hello World! and Welcome to jersey-multipart

From: Paul Sandoz <Paul.Sandoz_at_Sun.COM>
Date: Tue, 04 Nov 2008 11:44:31 +0100

On Nov 4, 2008, at 9:06 AM, Craig McClanahan wrote:
>> BTW, I just discovered this related discussion:
>> http://forums.sun.com/thread.jspa?threadID=5333195
>> This, in turn, led me to javax.mail.internet.ContentDisposition and
>> more
>> importantly javax.mail.internet.HeaderTokenizer. We should be able
>> to use
>> these fairly easily to implement the aforementioned APIs.
>>
>>
> Ugh ... that (parsing MIME headers) is something else that I didn't
> have to worry about when I was "constructively lazy" and depended on
> JavaMail for parsing. We'd definitely need to build something on
> top of the MultipartStream from Commons FileUpload, since that only
> gives you body parts as a whole, and doesn't deal with the headers.

Yes, for the @FormParam multipart support in Jersey (which we really
need to clean up and defer to the mail contribs support) I had to
parse the content disposition:

     private static Map<String, BodyPart> getFormData(MimeMultipart mm) throws Exception {
         Map<String, BodyPart> m = new HashMap<String, BodyPart>();

         for (int i = 0; i < mm.getCount(); i++) {
             BodyPart b = mm.getBodyPart(i);
             if (b.getDisposition() != null &&
                     b.getDisposition().equalsIgnoreCase("form-data")) {
                 String name = getName(b.getHeader("content-disposition")[0]);
                 if (name != null)
                     m.put(name, b);
             }
         }
         return m;
     }

     private static String getName(String disposition) throws ParseException {
         HttpHeaderReader reader = new HttpHeaderReaderImpl(disposition);
         // Skip any white space
         reader.hasNext();

         // Get the "form-data"
         reader.nextToken();

         while (reader.hasNext()) {
             reader.nextSeparator(';');

             // Ignore a ';' with no parameters
             if (!reader.hasNext())
                 break;

             // Get the parameter name
             String name = reader.nextToken();
             reader.nextSeparator('=');
             // Get the parameter value
             String value = reader.nextTokenOrQuotedString();
             if (name.equalsIgnoreCase("name")) {
                 return value;
             }
         }
         return null;
     }

I am reusing the HTTP header parsing code to parse the MIME header.
HTTP headers and MIME headers are closely similar, but there may also
be subtle differences that result in failures for edge cases.
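
For anyone who wants to experiment with the parse without the Jersey
internals, here is a minimal sketch that uses only the JDK. The names
ContentDispositionSketch and parameters are hypothetical and exist only
for illustration. It handles tokens and quoted strings (including
backslash escapes), but it makes no attempt at MIME-specific features
such as RFC 2231 encoded or continued parameters, which is exactly the
kind of edge case where a real MIME parser would be needed.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; not part of Jersey or JavaMail. */
final class ContentDispositionSketch {

    /**
     * Parses the parameters of a value such as
     * form-data; name="file"; filename="a.txt"
     * into a map keyed by lower-cased parameter name.
     */
    static Map<String, String> parameters(String header) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        // Everything before the first ';' is the disposition type.
        int i = header.indexOf(';');
        while (i >= 0 && i < header.length()) {
            i++; // step over the ';'
            int eq = header.indexOf('=', i);
            if (eq < 0) {
                break; // trailing ';' or nothing left that has a value
            }
            int semi = header.indexOf(';', i);
            if (semi >= 0 && semi < eq) {
                i = semi; // parameter without '=', skip it
                continue;
            }
            String name = header.substring(i, eq).trim().toLowerCase(Locale.ROOT);
            int pos = eq + 1;
            while (pos < header.length() && Character.isWhitespace(header.charAt(pos))) {
                pos++;
            }
            StringBuilder value = new StringBuilder();
            if (pos < header.length() && header.charAt(pos) == '"') {
                pos++; // opening quote
                while (pos < header.length() && header.charAt(pos) != '"') {
                    char c = header.charAt(pos++);
                    if (c == '\\' && pos < header.length()) {
                        c = header.charAt(pos++); // quoted-pair: take the next char literally
                    }
                    value.append(c);
                }
                pos++; // closing quote (or past the end if it was missing)
                i = header.indexOf(';', pos);
            } else {
                int next = header.indexOf(';', pos);
                int end = (next < 0) ? header.length() : next;
                value.append(header.substring(pos, end).trim());
                i = next;
            }
            params.put(name, value.toString());
        }
        return params;
    }
}
```

Calling parameters(...).get("name") on a part's content-disposition
value gives the form field name, and the same map also exposes
"filename" when it is present.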


>
>> Craig McClanahan wrote:
>>
>>> Dealing with the configuration seems solvable fairly easily,
>>> although
>>> the interesting bit is to find a solution that works for both a
>>> servlet
>>> deployment (where there could be multiple independent apps
>>> deployed in
>>> the same server, so system properties don't work well) and a non-
>>> servlet
>>> deployment (where we don't have any access to servlet APIs).
>>>
>>>
>> Why do you need to support non-servlet deployments?
>>
>>
> Two reasons:
>
> * On the server side, JAX-RS explicitly declares support for non-
> servlet deployments
> (although lots of details are left as "exercise for the
> implementor" in the 1.0 spec).
> There is no a priori reason I can think of that we should restrict
> multipart support to
> only work for servlet based server side deployments.
>
> * A lot of my use cases involve *client* applications using RESTful
> web services.
> Jersey has a very nice client API that can leverage much of the
> JAX-RS infrastructure
> on the client side as well, and it would be really weird to
> "import javax.servlet.*" into
> an applet, or a RIA app based on Swing or Java FX.
>

We could do the following:

1) Have a class in jersey core called FeaturesAndProperties;

2) The ClientConfig and ResourceConfig extend this class.

3) FeaturesAndProperties can be injected.

Then message body readers/writers requiring access to config
information are independent of client/server.
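
A rough sketch of that shape, to make the idea concrete. Every name
ending in "Sketch" is a hypothetical stand-in rather than the actual
Jersey API, the property key is made up, and constructor injection
stands in for whatever injection mechanism Jersey would really use:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical common base holding features and properties. */
class FeaturesAndPropertiesSketch {
    private final Map<String, Boolean> features = new HashMap<String, Boolean>();
    private final Map<String, Object> properties = new HashMap<String, Object>();

    Map<String, Boolean> getFeatures() { return features; }

    Map<String, Object> getProperties() { return properties; }

    /** An unset feature is treated as disabled. */
    boolean getFeature(String name) {
        Boolean enabled = features.get(name);
        return enabled != null && enabled.booleanValue();
    }

    Object getProperty(String name) { return properties.get(name); }
}

/** Stand-ins for the client-side and server-side configuration classes. */
class ClientConfigSketch extends FeaturesAndPropertiesSketch { }

class ResourceConfigSketch extends FeaturesAndPropertiesSketch { }

/** A reader that only sees the shared base type, so it works on both sides. */
class MultipartReaderSketch {
    // Hypothetical property key and default, chosen only for this example.
    static final String THRESHOLD = "multipart.bufferThreshold";
    static final int DEFAULT_THRESHOLD = 4096;

    private final FeaturesAndPropertiesSketch config;

    // Constructor injection stands in for container-managed injection.
    MultipartReaderSketch(FeaturesAndPropertiesSketch config) {
        this.config = config;
    }

    int bufferThreshold() {
        Object value = config.getProperty(THRESHOLD);
        return (value instanceof Integer) ? ((Integer) value).intValue() : DEFAULT_THRESHOLD;
    }
}
```

The reader never names ClientConfigSketch or ResourceConfigSketch, so
the same provider can be registered on either side.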


>> Craig McClanahan wrote:
>>
>>> Probably the best bet is an optional properties file loaded via
>>> the thread
>>> context class loader (if any; otherwise the class loader that
>>> loaded the
>>> jersey-multipart classes) that can be used to set configuration
>>> stuff
>>> like this.
>>>
>>>
>> As soon as we start talking about class loaders things get ugly
>> real quick.
>> Ideally I'd like to cheat by simplifying the requirements if
>> possible ;)
>>
>>
>>
> I know that pain :-), having been pretty heavily involved in the
> servlet container inside Tomcat from version 4.x -- the basic
> architecture today is pretty similar although quite refined. But
> let me state a requirement I think we must satisfy in a different
> way: it must be possible to deploy two or more different server
> apps, each using jersey-multipart, in the same instance of a servlet
> container, with different configuration settings for things like the
> threshold size before a body part gets spooled to a disk file.
>
> The algorithm I described above is quite typical of the way your
> average web framework (including Struts) locates application
> specific resources. Fortunately, we can make it work transparently
> in a non-servlet world too, by using things like
> ClassLoader.getResourceAsStream() instead of
> ServletContext.getResourceAsStream().
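
The lookup described above could be sketched roughly as follows. The
class name MultipartConfigSketch, the resource name
jersey-multipart.properties, and the bufferThreshold key and default
are all hypothetical, picked only for illustration. In a servlet
container the thread context class loader is normally the web
application's own loader, which is how each deployed app would pick
up its own copy of the file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

/** Hypothetical loader for optional jersey-multipart settings. */
final class MultipartConfigSketch {
    // Resource name, key, and default are assumptions made for this example.
    static final String RESOURCE = "jersey-multipart.properties";
    static final String THRESHOLD_KEY = "bufferThreshold";
    static final String THRESHOLD_DEFAULT = "4096";

    /** Loads the optional resource; built-in defaults apply when it is absent. */
    static Properties load(String resource) {
        Properties defaults = new Properties();
        defaults.setProperty(THRESHOLD_KEY, THRESHOLD_DEFAULT);
        Properties props = new Properties(defaults);

        // Prefer the thread context class loader, if any.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Otherwise use the loader that loaded this class.
            loader = MultipartConfigSketch.class.getClassLoader();
        }
        InputStream in = (loader != null)
                ? loader.getResourceAsStream(resource)
                : ClassLoader.getSystemResourceAsStream(resource);
        if (in == null) {
            return props; // the file is optional
        }
        try {
            try {
                props.load(in);
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("Cannot read " + resource, e);
        }
        return props;
    }
}
```

Nothing here touches javax.servlet, so the same code would run in an
applet or a Swing client.
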
>> Craig McClanahan wrote:
>>
>>> Regarding the need for the cleanup call, I'm open to suggestion
>>> for how
>>> to improve this. I've started looking at Jersey filters (which
>>> would
>>> impose a Jersey-specific implementation dependency), but haven't
>>> settled
>>> on anything yet.
>>>
>>>
>> FileUpload has a background thread that monitors when its File
>> instances get
>> garbage-collected (Guice does something similar) and when this
>> happens it
>> goes and deletes the files off the hard-drive automatically. It
>> also uses a
>> Servlet Filter to hook servlet shutdown to force immediate deletion
>> of all
>> outstanding files. At least, that's what I understood from their
>> Javadoc.
>>
>>
> I'm not usually a fan of "background thread cleanup" type solutions
> because they are (a) asynchronous, and therefore consume resources
> for longer than they should be consumed, and (b) they encourage
> sloppy coding -- "someone else will clean up my mess, so I don't
> have to worry about it." It might not be quite so bad in this use
> case, although we'd need to again deal separately with the servlet
> versus non-servlet case (servlet based temp files should be cleaned
> up when the app is undeployed, not just when the container is shut
> down), but that might be feasible.
>
> Unless performance considerations dictate otherwise, I'm much more
> into synchronously cleaning up after myself after each request. But
> it's definitely not optimal to make the app developer responsible
> for ensuring that this happens.


I have made changes to the life-cycle management of Jersey to support
@PostConstruct and @PreDestroy of resource classes (plan to do the
same for providers). What we could support is a way for the mail
support to register the multi-part instance it creates for clean up
when the request/response is done. One way to do this is to support a
@PerInstance life-cycle and the mail support asks Jersey to create an
instance of the MultiPart which will get destroyed when the request
goes out of scope. It should also work for Jersey clients working
within the server environment.

There is an issue with isolated clients though, and maybe that is
where a thread cleanup solution is appropriate; we may be able to
hide that functionality behind different life-cycle implementation
support.

This does of course mean that the mail support is dependent on Jersey
specific APIs.
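
To illustrate the kind of registration I have in mind, here is a
minimal JDK-only sketch. RequestScopeSketch and SpooledPartSketch are
hypothetical names, not existing Jersey or JavaMail types, and the
real life-cycle hooks would look different; the point is only that
whoever owns the request scope closes everything registered with it,
so the application developer does not have to.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical per-request scope that releases whatever was registered with it. */
final class RequestScopeSketch implements Closeable {
    private final Deque<Closeable> resources = new ArrayDeque<Closeable>();

    /** Registers a resource for cleanup and returns it for convenience. */
    <T extends Closeable> T register(T resource) {
        resources.push(resource);
        return resource;
    }

    /** Closes registered resources in reverse order of registration. */
    public void close() {
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                // Keep going so one failure does not leak the remaining resources.
            }
        }
    }
}

/** Hypothetical body part that has been spooled to a temporary file. */
final class SpooledPartSketch implements Closeable {
    final File file;

    private SpooledPartSketch(File file) {
        this.file = file;
    }

    static SpooledPartSketch create() {
        try {
            return new SpooledPartSketch(File.createTempFile("part", ".tmp"));
        } catch (IOException e) {
            throw new IllegalStateException("Cannot create temp file", e);
        }
    }

    /** Deletes the backing temporary file. */
    public void close() {
        file.delete();
    }
}
```

The runtime would create one scope per request and call close() in a
finally block once the response has been written.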


>
>> Craig McClanahan wrote:
>>
>>> Separately, I took your suggestion to look at Commons FileUpload.
>>> It
>>> turns out that there *is* a single important class
>>> (org.apache.commons.fileupload.MultipartStream) -- plus a couple of
>>> small helpers -- that performs the only real task I'm currently
>>> delegating to JavaMail. That's the actual parsing of the
>>> multipart/*
>>> input stream. I need to do some experiments, and take heed of the
>>> potential concerns in the javadocs about nested multipart/* body
>>> parts,
>>> but it may well be that we could incorporate a variant of just this
>>> class and not even need the entire Commons FileUpload package (or
>>> its
>>> dependence on Commons IO).
>>>
>>>
>> The only reason I suggested looking at FileUpload is I thought it
>> was an
>> active project that handles temporary files and configuration
>> under the
>> hood for us. Now I'm not so sure anymore. If you believe that
>> adding their
>> configuration on top of Javamail isn't much hassle I think I'd
>> prefer that
>> to relying on their code. I say this because I noticed that
>> Javamail has
>> been open-sourced and is now part of Glassfish and I trust those
>> guys a heck
>> of a lot more than I do your average Apache project.
>>
>>
> I'm afraid I'm biased both ways so can't help you much in
> determining who to trust more.
>
> I work for Sun (originally in the J2EE (now Java EE) group where
> Glassfish now comes from), and have also been heavily involved in
> many Apache projects -- most particularly Tomcat and Struts, the
> latter being where many of the Commons projects got a lot of initial
> developers and initial codebases.
>
> I trust them both :-).
>> My main concern now is stability and simplicity. I guess I no
>> longer care
>> about the underlying JAR size ;)
>>
>>
> +1 for stability and simplicity. But this is a case where the
> "reuse" argument might tend towards fork-and-copy-a-few-classes
> (with appropriate attributions, of course, to satisfy the relevant
> open source licenses), rather than importing reasonably large
> packages when we're only using a few classes from them.
>
> On the other hand, the JavaMail based support for parsing multipart
> messages "just works" ...

I would be inclined to build on JavaMail and take the fork-and-copy
approach. It should be possible to extend JavaMail for
multipart/form-data; then the Jersey layers above that become easier
to support.

Paul.