users@grizzly.java.net

Re: Performance test failing due to race condition problems

From: Jeanfrancois Arcand <Jeanfrancois.Arcand_at_Sun.COM>
Date: Wed, 12 Mar 2008 16:36:22 -0400

Simon Trudeau wrote:
> If I understand correctly what you are saying, I just need to clear the
> byteBuffer once a complete message has been received and everything
> should be OK; I won't need to set continuousExecution to false?

Right.
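
To make the point concrete, here is a minimal sketch that uses only java.nio.ByteBuffer (no Grizzly classes). The class name BufferClearSketch, the method positionAfterReads, and the use of put() as a stand-in for a channel read are assumptions made for illustration. The 9-byte message length comes from the test described later in this thread, and the 8192-byte capacity is inferred from the debug output quoted below (remaining plus position). It shows the position growing by 9 on every read when the buffer is never cleared, and returning to 0 when clear() is called after each complete message.

```java
import java.nio.ByteBuffer;

public class BufferClearSketch {

    // Packet size used in the performance test discussed in this thread.
    static final int MESSAGE_LENGTH = 9;

    /**
     * Simulates `reads` successive reads of one complete message each into
     * the same buffer and returns the buffer position afterwards.
     * Hypothetical helper: bb.put(packet) stands in for channel.read(bb).
     */
    public static int positionAfterReads(int reads, boolean clearAfterMessage) {
        ByteBuffer bb = ByteBuffer.allocate(8192); // capacity is an assumption
        byte[] packet = new byte[MESSAGE_LENGTH];
        for (int i = 0; i < reads; i++) {
            bb.put(packet); // position advances by MESSAGE_LENGTH
            boolean messageComplete = bb.position() >= MESSAGE_LENGTH;
            if (messageComplete && clearAfterMessage) {
                // The message would be parsed here; clear() then resets
                // position to 0 and limit to capacity for the next read.
                bb.clear();
            }
        }
        return bb.position();
    }

    public static void main(String[] args) {
        System.out.println("without clear(): " + positionAfterReads(4, false)); // 36
        System.out.println("with clear():    " + positionAfterReads(4, true));  // 0
    }
}
```

Four uncleared reads leave the position at 36, which is consistent with one of the positions in the debug output below. In a real filter the clear() call would go wherever the message is known to be fully parsed, for example in Filter.postExecute() as suggested in the quoted reply.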

A+

-- Jeanfrancois


>
>
>
> Simon
>
> -----Original Message-----
> From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
> Sent: March-12-08 4:25 PM
> To: users_at_grizzly.dev.java.net
> Subject: Re: Performance test failing due to race condition problems
>
>
>
> Simon Trudeau wrote:
>> Thanks for taking time during your conference to reply to my post. I
>> really appreciate it, since I need to be done with the TCP connection
>> part of my application very soon (I have to move on to other parts).
>>
>> So let's recap:
>>
>> Clients c1...c2500 concurrently send (in batches of 25, since the
>> sending operation is put inside a Runnable with a 25-thread pool) a
>> 9-byte packet to server S, which echoes it back (using an EchoFilter)
>> to the corresponding c. There, c's selectorHandler's ReadFilter (common
>> to all clients) executes twice or more!
>>
>> Looking at the buffer position each time the ReadFilter on c's
>> selectorHandler executes, it keeps getting incremented, sometimes
>> reaching position 1600!!!
>
> Right. If you don't invoke bb.clear(), the framework by default will not
> clear the bb for you. Inside your Filter.postExecute(), you should make
> sure bb.clear() is called (assuming you have successfully parsed the
> message you are looking at).
>
>
>> Since I share one selectorHandler's protocolChain among 2500 clients, I
>> guess I am always reading bytes, and the filter doesn't distinguish
>> which client they come from, which is why it doesn't release the
>> thread. What do you think?
>
> It should release the thread when continuousExecution is set to false
> and your Filter.execute() returns true.
>
>
>> Also, I'm a bit concerned about my ReadFilter: its ByteBuffer doesn't
>> get released between reads. I assume that this is the job of the
>> ParserProtocolFilter?
>
> Released? Do you mean cleared?
>
>
>> Here's a sample debug output from my application to give you an idea
>> of the ByteBuffer position:
>>
>> PerformanceTestFilter hasRemainig: true amount: 7436 position: 756
>> PerformanceTestFilter hasRemainig: true amount: 7436 position: 756
>> PerformanceTestFilter hasRemainig: true amount: 8156 position: 36
>> PerformanceTestFilter hasRemainig: true amount: 8156 position: 36
>> PerformanceTestFilter hasRemainig: true amount: 7931 position: 261
>> PerformanceTestFilter hasRemainig: true amount: 7931 position: 261
>> PerformanceTestFilter hasRemainig: true amount: 8050 position: 142
>> PerformanceTestFilter hasRemainig: true amount: 8050 position: 142
>> PerformanceTestFilter hasRemainig: true amount: 8174 position: 18
>> PerformanceTestFilter hasRemainig: true amount: 8174 position: 18
>>
>> Ok. That being said, I guess that setting continuousExecution to
>> false is a must for me, but I still don't really understand why. How
>> come, when continuousExecution is set to true, it executes twice with
>> the same data? (I guess that is what happens... I'm a bit confused...)
>
> Right, because you need to call bb.clear(), which should fix the double
> invocation. If it is called twice, that means you have a new message
> available, so you can parse it.
>
> A+
>
> -- Jeanfrancois
>
>
>>
>>
>> Simon
>>
>> -----Original Message-----
>> From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
>> Sent: March-12-08 1:32 PM
>> To: users_at_grizzly.dev.java.net
>> Subject: Re: Performance test failing due to race condition problems
>>
>>
>> Salut,
>>
>> In between conference sessions :-)
>>
>> Simon Trudeau wrote:
>>>> I don't think that can happen, as it's the TCPSelectorHandler that
>>>> accepts the connection and owns the SelectionKey. One way to find out
>>>> is to do a System.out of Context.getSelectorHandler() inside your
>>>> Filter to see if this is the right SelectorHandler that invokes your
>>>> Filter.
>>> The right selectorHandler (the one from my client) is invoking my
>>> Filter. So that is ok.
>>>
>>>
>>>> I need more information :-) Can you check which TCPSelectorHandler is
>>>> invoking your Filter? Is the ProtocolChain shared between your
>>>> TCPSelectorHandlers, or do you have two instances, one for each?
>>> ProtocolChains are not shared between TCPSelectorHandlers; two
>>> instances, one for the client and one for the server, have been
>>> created.
>>>
>>>> Could it be continuousExecution = true that produces the double
>>>> invocation? I suspect that's the problem.
>>> Bingo! That was the issue: setting the ReadFilter on my client to
>>> continuousExecution = false solved the problem. Could you explain to
>>> me what just happened here?
>> With continuousExecution set to true, the ProtocolChain will re-invoke
>> its ProtocolFilter after the last Filter.postExecute(). That avoids
>> having to release the thread, go back to the SelectorHandler, and
>> re-invoke the ProtocolChain again.
>>
>> If your ProtocolFilter was called twice, it means some bytes have been
>> read again, so you might have missed some reads...
>>
>> Setting it to false doesn't re-invoke the ProtocolChain automatically.
>> One test you can do is to print the bytebuffer.position() when your
>> ProtocolFilter is re-invoked to see how many bytes have been read.
>>
>> A+
>>
>> -- Jeanfrancois
>>
>>
>>> What does it all mean? What are the impacts (on other filters, for
>>> instance) of setting continuousExecution = false?
>>>
>>> Thanks,
>>>
>>> Simon
>>>
>>> -----Original Message-----
>>> From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
>>> Sent: March-10-08 8:56 PM
>>> To: users_at_grizzly.dev.java.net
>>> Subject: Re: Performance test failing due to race condition problems
>>>
>>> Hi Simon,
>>>
>>> Simon Trudeau wrote:
>>>> After investigating further, I guess you are right: the server seems
>>>> to terminate before the connect method has a chance to complete. It is
>>>> very interesting because it only happens on machines with 2 or more
>>>> cores... I can't reproduce it on my single-core machine.
>>>>
>>>> My best bet so far is that it is related to the CountDownLatch I use
>>>> to control my test termination and the shutdown of the Controller.
>>>> From my point of view, it looks like the client selectorHandler's
>>>> FilterChain gets run twice for each message sent to the server. That
>>>> would explain why the application always crashes when I reach almost
>>>> half my number of connections.
>>> Could it be continuousExecution = true that produces the double
>>> invocation? I suspect that's the problem.
>>>
>>>
>>>> My guess is that since both the client and server selectorHandlers
>>>> are registered with the same Controller on the same machine, and since
>>>> the client selectorHandler is a TCPSelectorHandler that is not bound
>>>> to a specific port like the server TCPSelectorHandler, then it must
>>>> get invoked both when the server receives a message and when the
>>>> client receives a message! Does that make sense?
>>> I don't think that can happen, as it's the TCPSelectorHandler that
>>> accepts the connection and owns the SelectionKey. One way to find out
>>> is to do a System.out of Context.getSelectorHandler() inside your
>>> Filter to see if this is the right SelectorHandler that invokes your
>>> Filter.
>>>
>>>
>>>> So, if all my previous assumptions are right, how do I bind a
>>>> selectorHandler on the client side to a specific set of ports (only
>>>> the ones used by my clients to receive responses)? How can that be
>>>> done if the receiving port only gets dynamically assigned once the
>>>> client connection has been established?
>>> I don't think this is the problem :-)
>>>
>>>
>>>> From our previous discussions, if possible and recommended, I would
>>>> like to share one stateless selectorHandler for all clients (see
>>>> previously sent source code).
>>>>
>>>> Would using one controller for all clients and a different controller
>>>> for the server change anything?
>>> No.
>>>
>>>> What do you think?
>>> I need more information :-) Can you check which TCPSelectorHandler is
>>> invoking your Filter? Is the ProtocolChain shared between your
>>> TCPSelectorHandlers, or do you have two instances, one for each?
>>>
>>> Thanks!
>>>
>>> -- jeanfrancois
>>>
>>>
>>>> Simon
>>>>
>>>> -----Original Message-----
>>>> From: Jeanfrancois.Arcand_at_Sun.COM [mailto:Jeanfrancois.Arcand_at_Sun.COM]
>>>> Sent: March-07-08 2:37 PM
>>>> To: users_at_grizzly.dev.java.net
>>>> Subject: Re: Performance test failing due to race condition problems
>>>>
>>>> Salut,
>>>>
>>>> Simon Trudeau wrote:
>>>>> I am trying to concurrently connect to 2500 servers at a time using
>>>>> my client application. Unfortunately, I run into all sorts of
>>>>> instability issues:
>>>>>
>>>> Which platform are you on? Those clients all try to connect to a
>>>> remote server, right (no local server)?
>>>>
>>>>>
>>>>>
>>>>> After connecting my 1268th client (it may vary) to the server, I get
>>>>> the following exception:
>>>>>
>>>>> Exception in thread "pool-1-thread-10" java.nio.channels.NotYetConnectedException
>>>>>     at com.sun.grizzly.TCPConnectorHandler.write(TCPConnectorHandler.java:387)
>>>>>     ...
>>>>>
>>>>> java.nio.channels.ClosedChannelException
>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
>>>>>     at com.sun.grizzly.TCPConnectorHandler.finishConnect(TCPConnectorHandler.java:565)
>>>>>     at client.BtNIOClient$Connector$ClientCallBackHandler.onConnect(BtNIOClient.java:214)
>>>>>
>>>>>
>>>>> Running the same test, I also obtain:
>>>> It seems your server closes the connection before you have a chance
>>>> to finish the connect method.
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>> After connecting my 1253rd client (it may vary) to the server, I get
>>>>> the following exception:
>>>>>
>>>>>
>>>>>
>>>>> java.nio.channels.ClosedChannelException
>>>>>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(Unknown Source)
>>>>>     at sun.nio.ch.SocketChannelImpl.write(Unknown Source)
>>>>>     at com.sun.grizzly.TCPConnectorHandler.write(TCPConnectorHandler.java:403)
>>>>>     ...
>>>>>
>>>>> 7-Mar-2008 1:48:04 PM com.sun.grizzly.TCPConnectorHandler configureChannel
>>>>> WARNING: setTcpNoDelay exception
>>>>> java.net.SocketException: Connection reset by peer: sun.nio.ch.Net.setIntOption
>>>>>     at sun.nio.ch.Net.setIntOption0(Native Method)
>>>>>     at sun.nio.ch.Net.setIntOption(Unknown Source)
>>>>>     at sun.nio.ch.SocketChannelImpl$1.setInt(Unknown Source)
>>>>>     at sun.nio.ch.SocketOptsImpl.setBoolean(Unknown Source)
>>>>>     at sun.nio.ch.SocketOptsImpl$IP$TCP.noDelay(Unknown Source)
>>>>>     at sun.nio.ch.OptionAdaptor.setTcpNoDelay(Unknown Source)
>>>>>     at sun.nio.ch.SocketAdaptor.setTcpNoDelay(Unknown Source)
>>>>>     at com.sun.grizzly.TCPConnectorHandler.configureChannel(TCPConnectorHandler.java:596)
>>>>>     at com.sun.grizzly.TCPConnectorHandler.finishConnect(TCPConnectorHandler.java:567)
>>>>>     at client.BtNIOClient$Connector$ClientCallBackHandler.onConnect(BtNIOClient.java:214)
>>>>>     at com.sun.grizzly.CallbackHandlerContextTask.doCall(CallbackHandlerContextTask.java:66)
>>>>>     at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:57)
>>>>>     at com.sun.grizzly.util.WorkerThreadImpl.run(WorkerThreadImpl.java:179)
>>>> That one should not be the issue. How many file descriptors does your
>>>> machine allow right now:
>>>>
>>>> % ulimit -n
>>>>
>>>>>
>>>>>
>>>>> I also get for the same test
>>>>>
>>>>>
>>>>>
>>>>> After connecting my 1253rd client (it may vary) to the server, I get
>>>>> the following exception:
>>>>>
>>>>>
>>>>>
>>>>> Exception in thread "pool-1-thread-5" java.lang.IllegalStateException: SelectorHandler not yet started
>>>>>     at com.sun.grizzly.TCPSelectorHandler.acquireConnectorHandler(TCPSelectorHandler.java:778)
>>>>>     at client.BtNIOClient$Connector.initConnector(BtNIOClient.java:181)
>>>>>
>>>>>
>>>>>
>>>>> What's weird is that it always happens around my 1250th-1260th client...
>>>>>
>>>>>
>>>>> I am testing with both client and server using the same Controller.
>>>>> Tests are run on a dual-core Intel machine. You need to run the
>>>>> performance test (ClientPerformanceTest.java performanceTest1()) a
>>>>> few times to get the exceptions; they don't occur on each run...
>>>>> which is weird but also consistent with the race condition problems
>>>>> I am encountering.
>>>>>
>>>>>
>>>>>
>>>>> I have attached my full source code with tests to this mail so you
>>>>> can run my test, and maybe some of you can help me figure out what I
>>>>> did wrong.
>>>>>
>>>>>
>>>>> To run the test, just include the latest grizzly-framework on the
>>>>> classpath and use Java 6.
>>>> OK, will try to take a look today....
>>>>
>>>> -- Jeanfrancois
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Simon
>>>>>
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
>>>>> For additional commands, e-mail: users-help_at_grizzly.dev.java.net