users@grizzly.java.net

Re: OutOfMemoryError with tunnel sample

From: Sam Crawford <samcrawford_at_gmail.com>
Date: Wed, 30 Nov 2011 11:48:25 +0000

Hi Alexey,

Apologies for the slow reply.

I have repeated my tests using the latest TunnelServer and
TunnelFilter code supplied (I don't think it had changed) and also
Grizzly 2.2-SNAPSHOT. I have been able to reproduce the issue with
2.2-SNAPSHOT, although I note throughput has improved significantly
over 2.1.7.

One thing I missed in my previous tests is that after the OOM Error
occurs, the iperf consumer continues receiving data from Grizzly (all
of the data that is still sitting in the buffer).

I have also attached an example of the simple TCP repeater (it's very
crude) and the results that show it's able to achieve ~690Mbit/s on
both legs (client > tunnel, tunnel > server).
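For anyone who can't open the attachment: the repeater is roughly along
the lines of the sketch below (this is illustrative only, not the
attached code - class and method names are mine), using plain blocking
sockets and one thread per direction:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class SimpleRepeater {

    // Copy bytes from in to out until EOF, using a 32KB read buffer
    // (the same size mentioned above). Returns the byte count.
    static long pump(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[32 * 1024];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        int listenPort = Integer.parseInt(args[0]);
        String targetHost = args[1];
        int targetPort = Integer.parseInt(args[2]);

        try (ServerSocket server = new ServerSocket(listenPort)) {
            while (true) {
                Socket client = server.accept();
                Socket target = new Socket(targetHost, targetPort);
                // One thread per direction:
                // client -> target and target -> client.
                new Thread(() -> repeat(client, target)).start();
                new Thread(() -> repeat(target, client)).start();
            }
        }
    }

    private static void repeat(Socket from, Socket to) {
        try {
            pump(from.getInputStream(), to.getOutputStream());
        } catch (IOException ignored) {
            // crude: any I/O error just tears the pair down
        } finally {
            try { from.close(); to.close(); } catch (IOException ignored) { }
        }
    }
}
```

Run as: java SimpleRepeater <listenPort> <targetHost> <targetPort>.
Crude as it is, the blocking reads give natural backpressure: the
repeater can never buffer more than one 32KB read per direction.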

Interestingly, I tried setting the read/write buffer on the
TCPNIOTransport to 32KB and found that throughput was equal on both
legs and much higher (610 Mbit/s), and the OOM error did not occur.
Setting it to 64KB reduced throughput slightly, but it remained equal
and there was no OOM. Setting it to 16KB resulted in the asymmetric
throughput and the OOM occurring. Note that I did not manually set
the TCP send/receive buffers in my simple socket-based example, but I
do use a rather large (32KB) buffer when reading from the socket.
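(To see what the OS defaults actually are on a given host, a quick
standalone snippet - not part of the attached code - is enough; note
the OS is free to round a requested size up or down:)

```java
import java.net.Socket;

public class BufferSizes {
    public static void main(String[] args) throws Exception {
        try (Socket s = new Socket()) {
            // Defaults are OS-dependent; a Windows host and a Linux host
            // can report quite different values, which is the asymmetry
            // suspected above.
            System.out.println("default SO_RCVBUF = " + s.getReceiveBufferSize());
            System.out.println("default SO_SNDBUF = " + s.getSendBufferSize());

            // Explicitly request 32KB; the OS may adjust the value,
            // so read it back rather than assuming it was applied as-is.
            s.setReceiveBufferSize(32 * 1024);
            s.setSendBufferSize(32 * 1024);
            System.out.println("requested SO_RCVBUF = " + s.getReceiveBufferSize());
            System.out.println("requested SO_SNDBUF = " + s.getSendBufferSize());
        }
    }
}
```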

I suspect that different default TCP send/receive buffers on the
Windows host (Host B) coupled with the asynchronous write queue
behaviour you mentioned previously will be the cause. Of course, I'm
not advocating changing or using static values for any of these buffer
sizes - the appropriate size will depend entirely on the network and
type of traffic in use.

Thanks,

Sam


On 29 November 2011 16:24, Oleksiy Stashok <oleksiy.stashok_at_oracle.com> wrote:
> Hi Sam,
>
> can you pls. check the latest TunnelServer code on the git repo:
> http://java.net/projects/grizzly/sources/git/show/samples/framework-samples/src/main/java/org/glassfish/grizzly/samples/tunnel/
>
> (though, I'm not sure if we made any changes there since svn->git migration)
> Also can you pls. try to run iperf test using latest Grizzly 2.2-SNAPSHOT?
>
> I ran the iperf bm test on localhost and didn't see OOM error, the transfer
> rate is 2 - 2.5 Gbit/sec.
>
> Thanks.
>
> WBR,
> Alexey.
>
>
> On 11/29/2011 12:50 AM, Sam Crawford wrote:
>>
>> Hi Alexey,
>>
>> Thanks for the quick reply and your suggestion on the AsyncWriteQueue
>> - I will definitely need that.
>>
>> I think there may still be an issue here though:
>>
>> The client is indeed sending data to the tunnel server at a high rate
>> - around 700Mbit/s. The iperf consumer is only receiving data at about
>> 150Mbit/s from the Grizzly tunnel before the OOMError. As you say,
>> this demonstrates that there is a clear bottleneck somewhere between
>> the tunnel server and iperf consumer.
>>
>> In your reply you mention that the "consumers <snip> are not able
>> to process data at that rate". However, if I replace the
>> Grizzly tunnel with a simple sockets-based tunnel (with a couple of
>> threads for repeating data in each direction) I am able to comfortably
>> hit 700Mbit/s between the tunnel server and the iperf consumer. This
>> suggests to me that the bottleneck is the outbound Grizzly connection,
>> and not the consumer.
>>
>> Thanks,
>>
>> Sam
>>
>>
>>
>> On 28 November 2011 22:37, Oleksiy Stashok<oleksiy.stashok_at_oracle.com>
>>  wrote:
>>>
>>> Hi Sam,
>>>
>>> On 11/28/2011 10:24 PM, Sam Crawford wrote:
>>>>
>>>> Hello,
>>>>
>>>> I'm attempting to run a basic throughput benchmark of the TunnelServer
>>>> sample
>>>>
>>>> (http://java.net/projects/grizzly/sources/svn/content/branches/2dot0/code/samples/framework-samples/src/main/java/org/glassfish/grizzly/samples/tunnel/TunnelServer.java)
>>>> and I'm running into an OutOfMemoryError after ~5 seconds.
>>>>
>>>> I'm running the TunnelServer sample (unmodified, apart from the
>>>> host/port) in Eclipse, with JVM parameters -Xmx1024m
>>>> -XX:MaxPermSize=256m. I'm using Grizzly 2.1.7
>>>>
>>>> The test is being conducted with iperf across a gigabit network.
>>>> Commands used are below:
>>>>
>>>> Server: iperf -s -i 1
>>>> Client: iperf -t 90 -i 1 -c grizzlytunnel.example.com
>>>>
>>>> Any suggestions would be appreciated.
>>>
>>> Most probably you're sending lots of data to the Tunnel server very
>>> fast. The consumers the server forwards data to are not able to
>>> process data at that rate, so the asynchronous write queue on the
>>> TunnelServer connections grows constantly and finally eats all the
>>> available memory.
>>>
>>> The easy way to fix this is to limit the maximum asynchronous write
>>> queue size of the TunnelServer <-> Consumer connections (by default
>>> there is no limit) like:
>>>
>>>        final AsyncQueueWriter<SocketAddress> asyncQueueWriter =
>>>                transport.getAsyncQueueIO().getWriter();
>>>        asyncQueueWriter.setMaxPendingBytesPerConnection(queueLimit);
>>>
>>> With the limit set, Grizzly will always check the asynchronous write
>>> queue size, and if it gets overloaded, connection.write(...) will
>>> throw an Exception. So you'll need to decide what to do with the
>>> connections which are not able to operate at the required rate:
>>> close them, or store the received data to a file and forward it once
>>> the async write queue is able to accept more data.
>>>
>>> You can register an async write queue monitor to get updates on
>>> async write queue size changes, something like [1] (probably we can
>>> make this API clearer).
>>>
>>>
>>> Pls. let us know if it helped.
>>>
>>> Thanks.
>>>
>>> WBR,
>>> Alexey.
>>>
>>> [1]
>>>        final int bytesWeWantToWrite = <NN>;
>>>        final int maxAsyncWriteQueueSize =
>>>                asyncQueueWriter.getMaxPendingBytesPerConnection();
>>>
>>>        final TaskQueue taskQueue =
>>>                ((NIOConnection) c).getAsyncWriteQueue();
>>>
>>>        monitor = new TaskQueue.QueueMonitor() {
>>>
>>>            @Override
>>>            public boolean shouldNotify() {
>>>                // Async write queue size was changed; check if it
>>>                // has enough free space for us
>>>                return (maxAsyncWriteQueueSize - taskQueue.spaceInBytes())
>>>                        >= bytesWeWantToWrite;
>>>            }
>>>
>>>            @Override
>>>            public void onNotify() throws IOException {
>>>                // Async write queue is ready to accept bytesWeWantToWrite
>>>                ................ WRITE HERE .............
>>>            }
>>>        };
>>>
>>>        taskQueue.setQueueMonitor(monitor);
>>>
>>>
>>>> Thanks,
>>>>
>>>> Sam
>>>
>>>
>