Hi Oleksiy,
finally I got some time to test 2.2.19.
Here is what I get:
org.glassfish.grizzly.PendingWriteQueueLimitExceededException: Max
queued data limit exceeded: 2999452>47440
at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.checkQueueSize(AbstractNIOAsyncQueueWriter.java:619)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.writeQueueRecord(AbstractNIOAsyncQueueWriter.java:279)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.write(AbstractNIOAsyncQueueWriter.java:219)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.nio.transport.TCPNIOTransportFilter.handleWrite(TCPNIOTransportFilter.java:127)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.TransportFilter.handleWrite(TransportFilter.java:191)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.ExecutorResolver$8.execute(ExecutorResolver.java:111)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:265)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:200)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:134)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:78)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.FilterChainContext.write(FilterChainContext.java:652)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.glassfish.grizzly.filterchain.FilterChainContext.write(FilterChainContext.java:568)
[grizzly-framework-2.2.19-SNAPSHOT.jar:2.2.19-SNAPSHOT]
at org.dcache.xdr.GrizzlyXdrTransport.send(GrizzlyXdrTransport.java:47)
[xdr-2.4.0-SNAPSHOT.jar:2.4.0-SNAPSHOT]
Shall I add a push-back handler to retry the write operation?
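
If it helps, below is roughly what I have in mind. The trace shows ~3 MB of
replies queued against the 47440-byte limit, so instead of failing I would
just re-queue the reply once the write queue drains. The sketch is written
from memory against what I think the 2.2.x push-back API looks like
(PushBackHandler / PushBackContext and the write(message, completionHandler,
pushBackHandler) overload), so please correct me if the names or signatures
are different in 2.2.19:

import java.io.IOException;

import org.glassfish.grizzly.Connection;
import org.glassfish.grizzly.asyncqueue.PushBackContext;
import org.glassfish.grizzly.asyncqueue.PushBackHandler;
import org.glassfish.grizzly.asyncqueue.WritableMessage;
import org.glassfish.grizzly.filterchain.FilterChainContext;

public class RetryingSend {

    // Sketch only: names/signatures are from memory, not checked
    // against the 2.2.19 javadoc.
    private static final PushBackHandler RETRY_WHEN_POSSIBLE =
            new PushBackHandler() {

        @Override
        public void onAccept(Connection connection, WritableMessage message) {
            // the async write queue accepted the message -- nothing to do
        }

        @Override
        public void onPushBack(Connection connection, WritableMessage message,
                PushBackContext pushBackContext) {
            // queue is over its limit: ask Grizzly to re-queue the message
            // once there is space again (my understanding is that this is
            // called instead of PendingWriteQueueLimitExceededException
            // being thrown -- correct me if that's wrong)
            pushBackContext.retryWhenPossible();
        }
    };

    public void send(FilterChainContext ctx, Object reply) throws IOException {
        ctx.write(reply, null, RETRY_WHEN_POSSIBLE);
    }
}

The alternative would be to simply raise the pending-write limit on the async
queue writer, but retrying on push-back looks cleaner to me.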
Tigran
On Thu, Aug 23, 2012 at 3:37 PM, Oleksiy Stashok
<oleksiy.stashok_at_oracle.com> wrote:
> Hi Tigran,
>
>> Here is a fragment of the handler code (the complete code can be
>> found at
>> http://code.google.com/p/nio-jrpc/source/browse/oncrpc4j-core/src/main/java/org/dcache/xdr/RpcMessageParserTCP.java):
>>
>> @Override
>> public NextAction handleRead(FilterChainContext ctx) throws
>> IOException {
>>
>> Buffer messageBuffer = ctx.getMessage();
>> if (messageBuffer == null) {
>> return ctx.getStopAction();
>> }
>>
>> if (!isAllFragmentsArrived(messageBuffer)) {
>> return ctx.getStopAction(messageBuffer);
>> }
>>
>> ctx.setMessage(assembleXdr(messageBuffer));
>>
>> final Buffer reminder = messageBuffer.hasRemaining()
>> ? messageBuffer.split(messageBuffer.position()) : null;
>>
>> return ctx.getInvokeAction(reminder);
>> }
>>
>>
>> Up to now I was sure that if there is more data to process (reminder
>> != null) or more incoming data available, Grizzly will process it.
>
> Right, it will, but only after current filterchain processing is finished.
>
>
>
>> The difference between SameThread and WorkerThread strategies is only in
>> the way of processing.
>
> The difference is that SameThreadStrategy will do FilterChain processing in the
> Grizzly core thread (the selector thread for NIO), and WorkerThreadStrategy will
> run FilterChain processing in the Transport's worker thread.
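> For example, the switch is just the IOStrategy you pass to the transport
> builder. A minimal sketch (class names as in 2.2.x, please double-check
> against the version you use):
>
> import java.io.IOException;
> import org.glassfish.grizzly.nio.transport.TCPNIOTransport;
> import org.glassfish.grizzly.nio.transport.TCPNIOTransportBuilder;
> import org.glassfish.grizzly.strategies.SameThreadIOStrategy;
> import org.glassfish.grizzly.strategies.WorkerThreadIOStrategy;
>
> public class StrategyConfig {
>
>     public static TCPNIOTransport buildTransport(boolean useWorkerThreads)
>             throws IOException {
>         return TCPNIOTransportBuilder.newInstance()
>                 .setIOStrategy(useWorkerThreads
>                         // FilterChain runs in the transport's worker thread pool
>                         ? WorkerThreadIOStrategy.getInstance()
>                         // FilterChain runs in the core (selector) thread
>                         : SameThreadIOStrategy.getInstance())
>                 .build();
>     }
> }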
>
>
>> For example, if for some reason processing takes
>> too long, then I can read the message and drop it without processing, as the
>> client will retry and will ignore a late reply.
>
> If that's the case - then please try the approach I suggested in the previous email.
>
> WBR,
> Alexey.
>
>
>>
>> Anyway, now I know that this was a wrong assumption.
>>
>> Tigran.
>>
>>
>>> I'm not saying this will work faster, but it will really parallelize
>>> request
>>> processing.
>>>
>>> Thanks.
>>>
>>> WBR,
>>> Alexey.
>>>
>>>> Regards,
>>>> Tigran.
>>>>
>>>>> Thanks.
>>>>>
>>>>> WBR,
>>>>> Alexey.
>>>>>
>>>>>
>>>>> On 08/22/2012 02:16 PM, Tigran Mkrtchyan wrote:
>>>>>>
>>>>>> Hi Alexey,
>>>>>>
>>>>>> On Wed, Aug 22, 2012 at 11:37 AM, Oleksiy Stashok
>>>>>> <oleksiy.stashok_at_oracle.com> wrote:
>>>>>>>
>>>>>>> Hi Tigran,
>>>>>>>
>>>>>>>
>>>>>>> On 08/22/2012 08:03 AM, Tigran Mkrtchyan wrote:
>>>>>>>>
>>>>>>>> The result is: 2.1 performs ~15% better than 2.2 and 2.3:
>>>>>>>>
>>>>>>>> grizzly 2.2 http://hammercloud.cern.ch/hc/app/atlas/test/20009667/
>>>>>>>> grizzly 2.3 http://hammercloud.cern.ch/hc/app/atlas/test/20009668/
>>>>>>>> grizzly 2.1 http://hammercloud.cern.ch/hc/app/atlas/test/20009669/
>>>>>>>>
>>>>>>>> (look at the mean of the rightmost graph)
>>>>>>>>
>>>>>>>> The only difference in the code is attached. Maybe the problem is
>>>>>>>> there.
>>>>>>>
>>>>>>> Thank you for the info. I don't see any problem in your code. Just FYI,
>>>>>>> we're deprecating the PushBack mechanism in Grizzly 2.3 (it will be
>>>>>>> removed in Grizzly 3). It will still be possible to check the async
>>>>>>> write queue status (whether it's overloaded or not) and to register a
>>>>>>> listener, which will be notified once you can write.... But the
>>>>>>> important difference is that the async write queue will keep accepting
>>>>>>> data even if it's overloaded (no exception thrown). We think this
>>>>>>> behavior is easier to implement on the Grizzly side (it will perform
>>>>>>> better) and it actually offers the same/similar functionality as the
>>>>>>> push-back mechanism.
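>>>>>>> Roughly, the 2.3 way would look like this (a sketch -- method names
>>>>>>> such as canWrite()/notifyCanWrite() should be checked against the
>>>>>>> 2.3-SNAPSHOT javadoc):
>>>>>>>
>>>>>>> import java.io.IOException;
>>>>>>> import org.glassfish.grizzly.Connection;
>>>>>>> import org.glassfish.grizzly.WriteHandler;
>>>>>>> import org.glassfish.grizzly.filterchain.FilterChainContext;
>>>>>>>
>>>>>>> public class WriteWhenPossible {
>>>>>>>
>>>>>>>     public void send(final FilterChainContext ctx, final Object reply)
>>>>>>>             throws IOException {
>>>>>>>         final Connection connection = ctx.getConnection();
>>>>>>>         if (connection.canWrite()) {
>>>>>>>             // queue is not overloaded -- write straight away
>>>>>>>             ctx.write(reply);
>>>>>>>         } else {
>>>>>>>             // queue is overloaded: it would still accept the data, but
>>>>>>>             // to avoid unbounded growth we wait until it drains
>>>>>>>             connection.notifyCanWrite(new WriteHandler() {
>>>>>>>                 @Override
>>>>>>>                 public void onWritePossible() throws Exception {
>>>>>>>                     ctx.write(reply);
>>>>>>>                 }
>>>>>>>
>>>>>>>                 @Override
>>>>>>>                 public void onError(Throwable t) {
>>>>>>>                     // drop the reply; as you said, the client will
>>>>>>>                     // retry and ignore a late answer
>>>>>>>                 }
>>>>>>>             });
>>>>>>>         }
>>>>>>>     }
>>>>>>> }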
>>>>>>
>>>>>> I have added a push-back handler as we started to see rejections. It's
>>>>>> good to hear that you will push a default implementation into the async
>>>>>> write queue.
>>>>>>
>>>>>>> As for the results you ran, I wanted to check:
>>>>>>>
>>>>>>> 1) All three runs use the same I/O strategy (WorkerThreadStrategy) ?
>>>>>>
>>>>>> Yes, the only difference is the patch which was attached.
>>>>>>
>>>>>>> 2) Are you configuring Grizzly worker thread pool in your code?
>>>>>>
>>>>>> No, we use whatever Grizzly has by default.
>>>>>>
>>>>>>> I'll run a simple echo test and will try to reproduce the problem;
>>>>>>> I will let you know.
>>>>>>
>>>>>> Just to remind you: the clients are 16 physical hosts doing NFS IO to a
>>>>>> single server.
>>>>>> Each client may (and does) send 16 requests (just a coincidence with the
>>>>>> number of hosts). In total, the server has to process 256 requests at
>>>>>> any point and reply with 16x1MB messages.
>>>>>> The server host has 24 cores and 32 GB of RAM.
>>>>>>
>>>>>> Regards,
>>>>>> Tigran.
>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> WBR,
>>>>>>> Alexey.
>>>>>>>
>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Tigran.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Aug 17, 2012 at 4:19 PM, Tigran Mkrtchyan
>>>>>>>> <tigran.mkrtchyan_at_desy.de> wrote:
>>>>>>>>>
>>>>>>>>> Hi Alexey,
>>>>>>>>>
>>>>>>>>> We had SameThreadStrategy. Now I switched to WorkerThreadStrategy.
>>>>>>>>>
>>>>>>>>> Tigran.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 17, 2012 at 4:12 PM, Oleksiy Stashok
>>>>>>>>> <oleksiy.stashok_at_oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Tigran,
>>>>>>>>>>
>>>>>>>>>> thanks a lot for the info! Would be great if you can confirm these
>>>>>>>>>> results
>>>>>>>>>> next week.
>>>>>>>>>> Just interesting, are you using SameThreadStrategy in your tests?
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> WBR,
>>>>>>>>>> Alexey.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 08/17/2012 03:31 PM, Tigran Mkrtchyan wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Alexey,
>>>>>>>>>>>
>>>>>>>>>>> the 2.3-SNAPSHOT is comparable with 2.1.11.
>>>>>>>>>>> The 2.2.9 is ~5% slower in my simple test. We can run
>>>>>>>>>>> more production level tests next week as they take ~5-6 hours per
>>>>>>>>>>> run
>>>>>>>>>>> and require special setup.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Tigran.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 17, 2012 at 2:10 PM, Tigran Mkrtchyan
>>>>>>>>>>> <tigran.mkrtchyan_at_desy.de> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> NP. give me an hour.
>>>>>>>>>>>>
>>>>>>>>>>>> Tigran.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Aug 17, 2012 at 12:46 PM, Oleksiy Stashok
>>>>>>>>>>>> <oleksiy.stashok_at_oracle.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tigran, before you test the releases (2.1.7; 2.2.9), if you planned
>>>>>>>>>>>>> to, I wanted to ask if you can try Grizzly 2.3-SNAPSHOT first?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>
>>>>>>>>>>>>> WBR,
>>>>>>>>>>>>> Alexey.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 08/17/2012 11:51 AM, Oleksiy Stashok wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Tigran,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thank you for the info.
>>>>>>>>>>>>>> We'll investigate that!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We'd appreciate any help which will let us narrow down the
>>>>>>>>>>>>>> problem; for example, if you have some time to try other
>>>>>>>>>>>>>> releases (2.1.7 < release < 2.2.9), it would help a lot.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> WBR,
>>>>>>>>>>>>>> Alexey.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 08/17/2012 11:36 AM, Tigran Mkrtchyan wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After a lot of time spent on debugging, we found that the
>>>>>>>>>>>>>>> change from grizzly-2.1.7 to grizzly-2.2.9
>>>>>>>>>>>>>>> drops the performance of our server by 10%.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the profiling results can be found at:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://www.dcache.org/grizzly-2-1.xml
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> http://www.dcache.org/grizzly-2-2.xml
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> we run the same application against the server (just to
>>>>>>>>>>>>>>> remind you, this is an NFSv4.1 server written in Java).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me know if you need more info.
>>>>>>>>>>>>>>> For now we will roll back to the 2.1.7 version.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Tigran.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>