users@grizzly.java.net

Re: Upload a large file without oom with Grizzly

From: Ryan Lubke <ryan.lubke_at_oracle.com>
Date: Wed, 28 Aug 2013 11:10:18 -0700

I'll be reviewing the PR today, thanks again!

Regarding the OOM: as it stands now, for each new buffer that is passed
to the SSLFilter, we allocate a buffer twice that size in order to
accommodate the encrypted result. So there is an inherent memory increase.
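For a sense of scale: the JDK's own SSLEngine reports how much larger one encrypted record can be than the plaintext it wraps, so the 2x allocation above is a conservative upper bound on that overhead. A quick way to inspect the sizes (plain JDK, no Grizzly involved — just a sketch for illustration):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLSession;

public class SslBufferSizes {
    public static void main(String[] args) throws Exception {
        // A default engine is enough to query the session's buffer sizing hints.
        SSLEngine engine = SSLContext.getDefault().createSSLEngine();
        SSLSession session = engine.getSession();

        int app = session.getApplicationBufferSize();  // max plaintext per record
        int packet = session.getPacketBufferSize();    // max encrypted record size

        // The encrypted record is larger than the plaintext it carries,
        // but the per-record overhead is only the difference between the two.
        System.out.println("application buffer:  " + app);
        System.out.println("packet buffer:       " + packet);
        System.out.println("per-record overhead: " + (packet - app));
    }
}
```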

Depending on the socket configurations of both endpoints, and how fast
the remote is reading data, it could be that the write queue is growing
too large. We do have a way to detect this situation, but I'm pretty sure
the Grizzly internals are currently shielded here. I will see what I can
do to allow users to leverage this.
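The kind of detection described above amounts to a high-water mark on pending bytes. A toy sketch of the idea (the class and method names here are illustrative only, not Grizzly's actual, currently internal API):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model of an async write queue with a high-water mark. A producer
// should stop feeding once canWrite() returns false and resume after the
// consumer has drained enough bytes. Illustrative only; not Grizzly's API.
public class ToyWriteQueue {
    private final Deque<byte[]> queue = new ArrayDeque<>();
    private final int maxPendingBytes;
    private int pendingBytes;

    public ToyWriteQueue(int maxPendingBytes) {
        this.maxPendingBytes = maxPendingBytes;
    }

    public boolean canWrite() {
        return pendingBytes < maxPendingBytes;
    }

    public void offer(byte[] chunk) {
        queue.addLast(chunk);
        pendingBytes += chunk.length;
    }

    public byte[] poll() {
        byte[] chunk = queue.pollFirst();
        if (chunk != null) {
            pendingBytes -= chunk.length;
        }
        return chunk;
    }

    public static void main(String[] args) {
        ToyWriteQueue q = new ToyWriteQueue(1024);
        int skipped = 0;
        for (int i = 0; i < 10; i++) {
            if (q.canWrite()) {
                q.offer(new byte[256]);  // feed only while under the watermark
            } else {
                skipped++;               // in real code: wait for a notification
            }
        }
        System.out.println("pending=" + q.pendingBytes + " skipped=" + skipped);
    }
}
```

Without such a check, a fast producer keeps appending records (each already doubled by the SSL wrap) and the queue itself becomes the OOM source.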



Sébastien Lorber wrote:
> Hello,
>
> I've made my pull request.
> https://github.com/AsyncHttpClient/async-http-client/pull/367
>
> With my use case it works; the file is uploaded as before.
>
>
>
> But I didn't notice a big memory improvement.
>
> Is it possible that SSL doesn't allow streaming the body, or something
> like that?
>
>
>
> In memory, I have a lot of:
> - HeapByteBuffer
> which are held by SSLUtils$3,
> which are held by BufferBuffers,
> which are held by WriteResult,
> which are held by AsyncWriteQueueRecord
>
>
> Here is an example of the OOM stack trace:
>
> java.lang.OutOfMemoryError: Java heap space
> at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
> at java.nio.ByteBuffer.allocate(ByteBuffer.java:331)
> at org.glassfish.grizzly.ssl.SSLUtils.allocateOutputBuffer(SSLUtils.java:342)
> at org.glassfish.grizzly.ssl.SSLBaseFilter$2.grow(SSLBaseFilter.java:117)
> at org.glassfish.grizzly.ssl.SSLConnectionContext.ensureBufferSize(SSLConnectionContext.java:392)
> at org.glassfish.grizzly.ssl.SSLConnectionContext.wrap(SSLConnectionContext.java:272)
> at org.glassfish.grizzly.ssl.SSLConnectionContext.wrapAll(SSLConnectionContext.java:227)
> at org.glassfish.grizzly.ssl.SSLBaseFilter.wrapAll(SSLBaseFilter.java:404)
> at org.glassfish.grizzly.ssl.SSLBaseFilter.handleWrite(SSLBaseFilter.java:319)
> at org.glassfish.grizzly.ssl.SSLFilter.accurateWrite(SSLFilter.java:255)
> at org.glassfish.grizzly.ssl.SSLFilter.handleWrite(SSLFilter.java:143)
> at com.ning.http.client.providers.grizzly.GrizzlyAsyncHttpProvider$SwitchingSSLFilter.handleWrite(GrizzlyAsyncHttpProvider.java:2503)
> at org.glassfish.grizzly.filterchain.ExecutorResolver$8.execute(ExecutorResolver.java:111)
> at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:288)
> at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:206)
> at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:136)
> at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:114)
> at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77)
> at org.glassfish.grizzly.filterchain.FilterChainContext.write(FilterChainContext.java:853)
> at org.glassfish.grizzly.filterchain.FilterChainContext.write(FilterChainContext.java:720)
> at com.ning.http.client.providers.grizzly.FeedableBodyGenerator.flushQueue(FeedableBodyGenerator.java:132)
> at com.ning.http.client.providers.grizzly.FeedableBodyGenerator.feed(FeedableBodyGenerator.java:101)
> at com.ning.http.client.providers.grizzly.MultipartBodyGeneratorFeeder$FeedBodyGeneratorOutputStream.write(MultipartBodyGeneratorFeeder.java:222)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
> at com.ning.http.multipart.FilePart.sendData(FilePart.java:179)
> at com.ning.http.multipart.Part.send(Part.java:331)
> at com.ning.http.multipart.Part.sendParts(Part.java:397)
> at com.ning.http.client.providers.grizzly.MultipartBodyGeneratorFeeder.feed(MultipartBodyGeneratorFeeder.java:144)
>
>
>
>
> Any idea?
>
>
>
> 2013/8/27 Ryan Lubke <ryan.lubke_at_oracle.com
> <mailto:ryan.lubke_at_oracle.com>>
>
> Excellent! Looking forward to the pull request!
>
>
> Sébastien Lorber wrote:
>> Thanks Ryan, it works fine. I'll make a pull request on AHC
>> tomorrow with better code, using the same Part classes that
>> already exist.
>>
>> I created an OutputStream that redirects to the BodyGenerator feeder.
>>
>> The problem I currently have is that the feeder feeds the queue
>> faster than the async thread polling it :)
>> I need to expose a limit on that queue size or something; I'll work
>> on that. It will be better than a thread sleep to slow down the
>> file-part reading.
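A bounded queue gives exactly that limit: with `java.util.concurrent.ArrayBlockingQueue`, `put()` blocks the feeding thread whenever the consumer falls behind, so no thread sleep is needed. A minimal self-contained sketch of the producer/consumer shape (not AHC's actual feeder code):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BoundedFeederDemo {
    public static void main(String[] args) throws InterruptedException {
        // At most 4 chunks in flight: put() blocks the producer when full.
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(4);

        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.take();     // drain one chunk
                    Thread.sleep(1);  // simulate a slow network write
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // Fast producer: it can never get more than 4 chunks ahead, so heap
        // usage stays bounded regardless of the size of the file being fed.
        for (int i = 0; i < 100; i++) {
            queue.put(new byte[8192]);
        }
        consumer.join();
        System.out.println("done, pending=" + queue.size());
    }
}
```

The blocking `put()` replaces both the sleep and any hand-rolled size polling.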
>>
>>
>> 2013/8/27 Ryan Lubke <ryan.lubke_at_oracle.com
>> <mailto:ryan.lubke_at_oracle.com>>
>>
>> Yes, something like that. I was going to tackle adding
>> something like this today. I'll follow up with something you
>> can test out.
>>
>>
>> Sébastien Lorber wrote:
>>> Ok thanks!
>>>
>>> I think I see what I could do, probably something like that:
>>>
>>>
>>> FeedableBodyGenerator bodyGenerator = new FeedableBodyGenerator();
>>> MultipartBodyGeneratorFeeder bodyGeneratorFeeder =
>>>     new MultipartBodyGeneratorFeeder(bodyGenerator);
>>>
>>> Request uploadRequest1 = new RequestBuilder("POST")
>>>     .setUrl("url")
>>>     .setBody(bodyGenerator)
>>>     .build();
>>>
>>> ListenableFuture<Response> asyncRes = asyncHttpClient
>>>     .prepareRequest(uploadRequest1)
>>>     .execute(new AsyncCompletionHandlerBase());
>>>
>>> bodyGeneratorFeeder.append("param1", "value1");
>>> bodyGeneratorFeeder.append("param2", "value2");
>>> bodyGeneratorFeeder.append("fileToUpload", fileInputStream);
>>> bodyGeneratorFeeder.end();
>>>
>>> Response uploadResponse = asyncRes.get();
>>>
>>>
>>> Does it seem ok to you?
>>>
>>> I guess it could be interesting to provide that
>>> MultipartBodyGeneratorFeeder class to AHC or Grizzly since
>>> some other people may want to achieve the same thing
>>>
>>>
>>>
>>>
>>>
>>> 2013/8/26 Ryan Lubke <ryan.lubke_at_oracle.com
>>> <mailto:ryan.lubke_at_oracle.com>>
>>>
>>>
>>>
>>> Sébastien Lorber wrote:
>>>
>>> Hello,
>>>
>>> I would like to know if it's possible to upload a
>>> file with AHC / Grizzly in streaming mode, I mean without
>>> loading the whole file's bytes into memory.
>>>
>>> The default behavior seems to allocate a byte[]
>>> which contains the whole file, so it means that my
>>> server can hit an OOM if too many users upload large
>>> files at the same time.
>>>
>>>
>>> I've tried with the Heap and ByteBuffer memory
>>> managers, with reallocate=true/false, but with no more
>>> success.
>>>
>>> It seems the whole file content is appended to the
>>> BufferOutputStream, and then the underlying buffer
>>> is written.
>>>
>>> At least this seems to be the case with AHC integration:
>>> https://github.com/AsyncHttpClient/async-http-client/blob/6faf1f316e5546110b0779a5a42fd9d03ba6bc15/providers/grizzly/src/main/java/org/asynchttpclient/providers/grizzly/bodyhandler/PartsBodyHandler.java
>>>
>>>
>>> So, is there a way to patch AHC to stream the file,
>>> so that I could consume only about 20 MB of heap
>>> while uploading a 500 MB file?
>>> Or is this simply impossible with Grizzly?
>>> I didn't notice anything related to this in the
>>> documentation.
>>>
>>> It's possible with the FeedableBodyGenerator. But if
>>> you're tied to using multipart uploads, you'd have to
>>> convert the multipart data to Buffers manually and send
>>> them using the FeedableBodyGenerator.
>>> I'll take a closer look to see if this area can be improved.
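The manual conversion amounts to reading the stream chunk by chunk so only one chunk is resident on the heap at a time. A sketch of that loop; `ChunkSink` here stands in for whatever wraps Buffer creation and the actual FeedableBodyGenerator feed call (it is a hypothetical interface for illustration, not an AHC/Grizzly type):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class ChunkedFeed {
    // Hypothetical sink; in real code this would wrap Buffer creation and
    // the FeedableBodyGenerator. Not an actual AHC/Grizzly interface.
    interface ChunkSink {
        void feed(byte[] chunk, boolean isLast) throws IOException;
    }

    static long feedStream(InputStream in, ChunkSink sink, int chunkSize)
            throws IOException {
        byte[] buf = new byte[chunkSize];
        long total = 0;
        int read;
        while ((read = in.read(buf)) != -1) {
            // Copy only what was read so each fed chunk is exact-sized.
            sink.feed(Arrays.copyOf(buf, read), false);
            total += read;
        }
        sink.feed(new byte[0], true);  // signal end of the body
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100_000];  // stand-in for a large file
        long[] fed = {0};
        long total = feedStream(new ByteArrayInputStream(data),
                (chunk, last) -> fed[0] += chunk.length,
                8192);
        System.out.println(total + " bytes fed, one 8 KB chunk on the heap at a time");
    }
}
```

Peak heap usage is then one chunk (plus whatever the write queue holds), independent of the total body size.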
>>>
>>>
>>> Btw, in my case it is a file upload. I receive a file
>>> with CXF and have to transmit it to a storage server
>>> (like S3). CXF doesn't consume memory because it
>>> streams large file uploads to the file system,
>>> and then provides an input stream on that file.
>>>
>>> Thanks
>>>
>>>
>>>
>>
>