dev@grizzly.java.net

Re: grizzly 2.2 performance issue

From: Tigran Mkrtchyan <tigran.mkrtchyan_at_desy.de>
Date: Thu, 23 Aug 2012 13:13:30 +0200

Hi Oleksiy,

it looks like I have a fundamental misunderstanding of how Grizzly works.

On Thu, Aug 23, 2012 at 11:34 AM, Oleksiy Stashok
<oleksiy.stashok_at_oracle.com> wrote:
> Hi Tigran,
>
>
>>> 4) In your test you don't multiplex (not sure if NFS allows that) over
>>> single connection? In other words you don't try to write to a single
>>> Connection from several threads simultaneously?
>>>
>> I hope we multiplex (I hope grizzly does it for me!). My
>> understanding of grizzly is that even with a single client I can get
>> multiple threads processing several requests simultaneously, as the
>> client sends multiple of them in a row, even within a single TCP packet.
>
> Well, it's something Grizzly can't do itself, because you have to parse
> the request(s) first.
> Once you've parsed one, it's up to you whether you want to process it in
> the same or a separate thread.
>
>
>> I think grizzly passes requests from all connections to a single
>> worker thread pool, and they are processed independently of each other
>> by the next free worker thread. Am I wrong?
>
> Yes, incoming requests from different connections are processed (if you use
> the worker thread strategy) in worker threads independently.
> But requests coming on a single connection can't be multiplexed by Grizzly
> itself.
>
> You may want to switch back to SameThreadStrategy, so you can parse requests
> in the selector thread and, once a request is parsed, dispatch it to a worker
> thread. That way all requests (even those that came on the same connection)
> will be processed in parallel. Schematically it may look like:
>
> NfsCodecFilter {
>
>     handleRead(FilterChainContext ctx) {
>         Connection connection = ctx.getConnection();
>         Buffer b = ctx.getMessage();
>
>         Object nfsRequest;
>
>         // dispatch every complete request to the worker thread pool
>         while ((nfsRequest = parseRequest(b)) != null) {
>             workerThreadPool.execute(new ProcessorTask(connection, nfsRequest));
>         }
>
>         // keep any unparsed tail as the remainder for the next read
>         Buffer remainder;
>         if (!b.hasRemaining()) {
>             remainder = null;
>         } else {
>             remainder = b.split(b.position());
>         }
>         b.tryDispose();
>
>         return ctx.getStopAction(remainder);
>     }
> }
>

Here is a fragment of the handler code (the complete code can be
found at http://code.google.com/p/nio-jrpc/source/browse/oncrpc4j-core/src/main/java/org/dcache/xdr/RpcMessageParserTCP.java):

    @Override
    public NextAction handleRead(FilterChainContext ctx) throws IOException {

        Buffer messageBuffer = ctx.getMessage();
        if (messageBuffer == null) {
            return ctx.getStopAction();
        }

        if (!isAllFragmentsArrived(messageBuffer)) {
            return ctx.getStopAction(messageBuffer);
        }

        ctx.setMessage(assembleXdr(messageBuffer));

        final Buffer reminder = messageBuffer.hasRemaining()
                ? messageBuffer.split(messageBuffer.position()) : null;

        return ctx.getInvokeAction(reminder);
    }


Up to now I was sure that if there is more data to process (reminder
!= null), or more incoming data available, grizzly will process it. The
difference between the SameThread and WorkerThread strategies is only in
the way of processing. For example, if for some reason processing takes
too long, then I can read the message and drop it without processing, as
the client will retry and will ignore a late reply.
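For reference, the remainder bookkeeping that both snippets in this thread rely on (pull every complete record out of the buffer, keep the partial tail for the next read) can be sketched without Grizzly as a plain length-prefixed parse loop over a `ByteBuffer`. This is only an illustrative stand-in, not Grizzly code: the class name, the 4-byte length framing, and the `parseAll` helper are all assumptions made for the sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a codec filter: a parse loop that extracts
// complete length-prefixed records from a buffer and leaves any partial
// record in place as the remainder.
public class ParseLoopSketch {

    // Extracts every complete [4-byte length][payload] record from buf.
    // On return, buf's position marks the start of the unparsed remainder.
    static List<String> parseAll(ByteBuffer buf) {
        List<String> messages = new ArrayList<>();
        while (buf.remaining() >= 4) {
            buf.mark();
            int len = buf.getInt();
            if (buf.remaining() < len) {
                buf.reset();  // incomplete record: rewind, keep as remainder
                break;
            }
            byte[] payload = new byte[len];
            buf.get(payload);
            messages.add(new String(payload, StandardCharsets.US_ASCII));
        }
        return messages;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        byte[] a = "hello".getBytes(StandardCharsets.US_ASCII);
        buf.putInt(a.length).put(a);           // one complete record
        buf.putInt(10);                        // length prefix of a record...
        buf.put("part".getBytes(StandardCharsets.US_ASCII)); // ...still incomplete
        buf.flip();

        List<String> msgs = parseAll(buf);
        System.out.println(msgs);              // [hello]
        System.out.println(buf.remaining());   // 8 (partial record kept)
    }
}
```

In Grizzly terms, each string coming out of `parseAll` would be handed to a worker thread, and the bytes left in the buffer would be passed back via `ctx.getStopAction(remainder)` so they are prepended to the next read.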

Anyway, now I know that this was a wrong assumption.

Tigran.


>
> I'm not saying this will work faster, but it will really parallelize request
> processing.
>
> Thanks.
>
> WBR,
> Alexey.
>
>>
>> Regards,
>> Tigran.
>>
>>> Thanks.
>>>
>>> WBR,
>>> Alexey.
>>>
>>>
>>> On 08/22/2012 02:16 PM, Tigran Mkrtchyan wrote:
>>>>
>>>> Hi Alexey,
>>>>
>>>> On Wed, Aug 22, 2012 at 11:37 AM, Oleksiy Stashok
>>>> <oleksiy.stashok_at_oracle.com> wrote:
>>>>>
>>>>> Hi Tigran,
>>>>>
>>>>>
>>>>> On 08/22/2012 08:03 AM, Tigran Mkrtchyan wrote:
>>>>>>
>>>>>> The result is: 2.1 performs ~15% better than 2.2 and 2.3:
>>>>>>
>>>>>> grizzly 2.2 http://hammercloud.cern.ch/hc/app/atlas/test/20009667/
>>>>>> grizzly 2.3 http://hammercloud.cern.ch/hc/app/atlas/test/20009668/
>>>>>> grizzly 2.1 http://hammercloud.cern.ch/hc/app/atlas/test/20009669/
>>>>>>
>>>>>> (look at the mean of the rightmost graph)
>>>>>>
>>>>>> The only difference in the code is attached. Maybe the problem is there.
>>>>>
>>>>> Thank you for the info. I don't see any problem in your code. Just FYI,
>>>>> we're deprecating the PushBack mechanism in Grizzly 2.3 (it will be
>>>>> removed in Grizzly 3). It will still be possible to check the async
>>>>> write queue status (whether it's overloaded or not) and to register a
>>>>> listener, which will be notified once you can write. But the important
>>>>> difference is that the async write queue will keep accepting data even
>>>>> if it's overloaded (no exception thrown). We think this behavior is
>>>>> easier to implement on the Grizzly side (it will perform better), and
>>>>> it actually offers the same/similar functionality as the push-back
>>>>> mechanism.
>>>>
>>>> I have added a push-back handler as we started to see rejections. It's
>>>> good to hear that you will push a default implementation into the async
>>>> write queue.
>>>>
>>>>> As for the results you ran, wanted to check
>>>>>
>>>>> 1) All three runs use the same I/O strategy (WorkerThreadStrategy) ?
>>>>
>>>> Yes, the only difference is the patch which was attached.
>>>>
>>>>> 2) Are you configuring Grizzly worker thread pool in your code?
>>>>
>>>> No, we use whatever grizzly has by default.
>>>>
>>>>> I'll run a simple echo test and try to reproduce the problem; I'll
>>>>> let you know.
>>>>
>>>> Just to remind you: the clients are 16 physical hosts doing NFS IO to a
>>>> single server.
>>>> Each client may (and does) send 16 (just a coincidence with the number
>>>> of hosts) requests. In total, the server has to process 256 requests at
>>>> any point and reply with 16x1MB messages.
>>>> The server host has 24 cores and 32 GB of RAM.
>>>>
>>>> Regards,
>>>> Tigran.
>>>>
>>>>> Thanks!
>>>>>
>>>>> WBR,
>>>>> Alexey.
>>>>>
>>>>>
>>>>>> Regards,
>>>>>> Tigran.
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 17, 2012 at 4:19 PM, Tigran Mkrtchyan
>>>>>> <tigran.mkrtchyan_at_desy.de> wrote:
>>>>>>>
>>>>>>> Hi Alexey,
>>>>>>>
>>>>>>> We had SameThreadStrategy. Now I switched to WorkerThreadStrategy.
>>>>>>>
>>>>>>> Tigran.
>>>>>>>
>>>>>>> On Fri, Aug 17, 2012 at 4:12 PM, Oleksiy Stashok
>>>>>>> <oleksiy.stashok_at_oracle.com> wrote:
>>>>>>>>
>>>>>>>> Hi Tigran,
>>>>>>>>
>>>>>>>> thanks a lot for the info! It would be great if you could confirm
>>>>>>>> these results next week.
>>>>>>>> Just interesting, are you using SameThreadStrategy in your tests?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> WBR,
>>>>>>>> Alexey.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 08/17/2012 03:31 PM, Tigran Mkrtchyan wrote:
>>>>>>>>>
>>>>>>>>> Hi Alexey,
>>>>>>>>>
>>>>>>>>> the 2.3-SNAPSHOT is comparable with 2.1.11.
>>>>>>>>> The 2.2.9 is ~5% slower in my simple test. We can run
>>>>>>>>> more production level tests next week as they take ~5-6 hours per
>>>>>>>>> run
>>>>>>>>> and require special setup.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Tigran.
>>>>>>>>>
>>>>>>>>> On Fri, Aug 17, 2012 at 2:10 PM, Tigran Mkrtchyan
>>>>>>>>> <tigran.mkrtchyan_at_desy.de> wrote:
>>>>>>>>>>
>>>>>>>>>> NP. give me an hour.
>>>>>>>>>>
>>>>>>>>>> Tigran.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 17, 2012 at 12:46 PM, Oleksiy Stashok
>>>>>>>>>> <oleksiy.stashok_at_oracle.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Tigran, before you test the releases (2.1.7; 2.2.9), if you
>>>>>>>>>>> planned to, I wanted to ask if you could try Grizzly
>>>>>>>>>>> 2.3-SNAPSHOT first?
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> WBR,
>>>>>>>>>>> Alexey.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 08/17/2012 11:51 AM, Oleksiy Stashok wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Tigran,
>>>>>>>>>>>>
>>>>>>>>>>>> thank you for the info.
>>>>>>>>>>>> We'll investigate that!
>>>>>>>>>>>>
>>>>>>>>>>>> We'd appreciate any help that lets us narrow down the problem;
>>>>>>>>>>>> for example, if you have some time to try other releases
>>>>>>>>>>>> (2.1.7 < release < 2.2.9), it would help a lot.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> WBR,
>>>>>>>>>>>> Alexey.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 08/17/2012 11:36 AM, Tigran Mkrtchyan wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> After a lot of time spent on debugging, we found that the
>>>>>>>>>>>>> change from grizzly-2.1.7 to grizzly-2.2.9 dropped the
>>>>>>>>>>>>> performance of our server by 10%.
>>>>>>>>>>>>>
>>>>>>>>>>>>> the profiling results can be found at:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://www.dcache.org/grizzly-2-1.xml
>>>>>>>>>>>>> and
>>>>>>>>>>>>> http://www.dcache.org/grizzly-2-2.xml
>>>>>>>>>>>>>
>>>>>>>>>>>>> We ran the same application against the server (just to
>>>>>>>>>>>>> remind you, this is an NFSv4.1 server written in Java).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know if you need more info.
>>>>>>>>>>>>> For now we will rollback to 2.1.7 version.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Tigran.
>>>>>>>>>>>>
>>>>>>>>>>>>
>