users@grizzly.java.net

Re: do select spin on linux

From: John Franey <jjfraney_at_gmail.com>
Date: Wed, 15 Apr 2009 22:09:28 -0400

On Tue, Apr 14, 2009 at 9:11 PM, Jeanfrancois Arcand <
Jeanfrancois.Arcand_at_sun.com> wrote:

> Salut,
>
> John Franey wrote:
>
>> Jeanfrancois,
>>
>> I'm embarrassed to say, I have found a bug in my test case code. At this
>> point, I do not have a test case for this behavior.
>>
>> My apologies.
>>
>
> No problem. What was the issue with the test? I'm still interested to
> reproduce the issue as I suspect a JDK bug.
>

Ok, well I have a simple program now that demonstrates the problem. I also
boiled it down to be make it easier to build and run from command line.

The idea is to increase the probability of the event described in the
evaluation section of http://bugs.sun.com/view_bug.do?bug_id=6693490.

proto.TReceive will use grizzly to open and close a socket at high
frequency. It is using a ReadFilter to read from the socket and another
filter to count the number of reads. After sometime, TRecieve will sleep
for about 5 seconds and then continue.

proto.UDPSender simply sends a stream of packets at TReceives read socket.

The select spin is demonstrated quickly. Start TReciver first and note the
cpu utilization; there are no packets arriving and the utilization should be
on the low side, and zero during the sleep time. Then run UDPSender for
some time, then kill it. There would again be no packets arriving and
TReceiver's cpu utilization should be on the low side, and zero during the
sleep time. However, if there is a select spin, the cpu utilization would
be higher than before UDPSender was started. WIth a single thread spinning,
one of the cores would be 100% utilized.

Now, how long should you run UDPSender for? On a dual core cpu on my
laptop, the effect is demonstrated within 30 seconds of running UDPSender.
The spin may to be correlated to the appearance of this exception from
TReceiver:

Apr 16, 2009 1:57:50 AM com.sun.grizzly.Controller handleSelectException
SEVERE: Selector was unexpectedly closed.


So, here is how you can run it:

1) unjar the attachment and build using maven.
2) get the class path using maven: mvn dependency:build-classpath
3) run TReceiver using that classpath:
  java -cp <classpath from above>:target/classes proto.TReceiver
4) run UDPSender (no special classpath needed)
  java -cp target/classes proto.UDPSender

Regards,
John



> Thanks!
>
> -- Jeanfrancois
>
>
>> John
>>
>>
>> On Tue, Apr 14, 2009 at 10:14 AM, Jeanfrancois Arcand <
>> Jeanfrancois.Arcand_at_sun.com <mailto:Jeanfrancois.Arcand_at_sun.com>> wrote:
>>
>> Salut,
>>
>> back online
>>
>>
>> John Franey wrote:
>>
>> Jeanfrancois,
>>
>> Just a couple of clarifications:
>>
>> 1) Running from the command line needs the jar files too. Sorry
>> I gave the impression that it was too easy. I think its easiest
>> to run from IDE, especially since a debugger is needed. I hope
>> that is ok.
>>
>>
>> OK I will try to reproduce the issue but I usually don't use IDE (I
>> use vim :-)).
>>
>>
>>
>>
>> 2) Also, the BReceiver won't be idle during the rest sleeps.
>> Sorry about that. Instead, once the exception occurs, simply
>> kill the UDPSender process, then you will be sure the BReceiver
>> is not receiving any new packets.
>>
>>
>> OK. BTW I've talked to the NIO/JDK lead and nothing about spinning
>> selector has been reported on jdk 4/5/6 so we might need to closely
>> look at your code.
>>
>> Thanks
>>
>> -- Jeanfrancois
>>
>>
>>
>> Regards,
>> John
>>
>>
>>
>> On Thu, Apr 9, 2009 at 2:31 PM, John Franey <jjfraney_at_gmail.com
>> <mailto:jjfraney_at_gmail.com> <mailto:jjfraney_at_gmail.com
>>
>> <mailto:jjfraney_at_gmail.com>>> wrote:
>>
>> Jeanfrancois,
>>
>> I got lucky, here is a test case. (gmail didn't let me send
>> it as
>> zip, so its a jar).
>>
>> this is a maven project, I guess you can build and run.
>>
>> There are two main classes.
>>
>> A receiver, BReceiver, and a sender, UDPSender.
>>
>> I run these from eclipse. I guess your IDE will let your run
>> them
>> easily, but I guess from the command line something easy like:
>> java -cp target/classes grizzlyproto/BReceiver would run.
>>
>> BReciever cycles a single udp port through many connections.
>> UDPSender sends datagrams to that port at high rate.
>>
>> Periodically, BReceiver pauses at a time when there should be
>> no cpu
>> charges to that process. Not sure this helps, but the
>> intention was
>> that if the process is using cpu at this time, then the
>> selector is
>> spinning.
>>
>> Within a few moment, BReceiver will get an exception (I do
>> not know
>> if the exception is related to the problem). Then you can set
>> breakpoint on Controller.doSelect's return from
>> selectorHandler.select (at line 329) to inspect value of
>> readyKeys. Its empty.
>>
>>
>> Here are my system specificaitons (from ubuntu 8.10)
>> Linux isidore 2.6.27-11-generic #1 SMP Thu Jan 29 19:28:32
>> UTC 2009
>> x86_64 GNU/Linux
>>
>>
>> I get the same behavior on both of these:
>>
>> java version "1.6.0_12"
>> Java(TM) SE Runtime Environment (build 1.6.0_12-b04)
>> Java HotSpot(TM) 64-Bit Server VM (build 11.2-b01, mixed mode)
>>
>> and
>>
>> java version "1.6.0_03"
>> Java(TM) SE Runtime Environment (build 1.6.0_03-b05)
>> BEA JRockit(R) (build
>> R27.5.0-110-94909-1.6.0_03-20080204-1558-linux-x86_64,
>> compiled mode)
>>
>>
>> Regards,
>> John
>>
>>
>>
>> On Thu, Apr 9, 2009 at 2:00 PM, John Franey
>> <jjfraney_at_gmail.com <mailto:jjfraney_at_gmail.com>
>> <mailto:jjfraney_at_gmail.com <mailto:jjfraney_at_gmail.com>>> wrote:
>>
>> Thanks.
>>
>>
>> On Thu, Apr 9, 2009 at 1:02 PM, Jeanfrancois Arcand
>> <Jeanfrancois.Arcand_at_sun.com
>> <mailto:Jeanfrancois.Arcand_at_sun.com>
>> <mailto:Jeanfrancois.Arcand_at_sun.com
>> <mailto:Jeanfrancois.Arcand_at_sun.com>>> wrote:
>>
>> Salut,
>>
>>
>> John Franey wrote:
>>
>> Hi,
>>
>> I'm running stress tests on linux. At some
>> point, cpu
>> utilization goes high. I've verified using the
>> debugger
>> that Controller.doSelect gets zero keys back from
>> handler's select.
>>
>>
>> Yes this is an issue I've reported awhile ago to the JDK
>> team. This sounds like bugparade 6403933
>>
>>
>> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
>>
>> A connection is reset causes the file descriptor to be
>> selected with events POLLHUP and POLLERR but unless
>> there is
>> data waiting to be read then channel will not be
>> selected.
>> It's a awkward problem.
>>
>> This has been fixed in JDK 7 build 50 and will be
>> included
>> in JDK 6u16.
>>
>>
>> The last statement in the evaluation of this bug nails
>> it, and
>> the load pattern described in this bug matches mine:
>> http://bugs.sun.com/view_bug.do?bug_id=6693490
>>
>> It sounds like the same root cause?
>>
>>
>>
>> Google search led me to this page at apache mina
>> jira
>> DIRMINA-678
>> <https://issues.apache.org/jira/browse/DIRMINA-678
>> >.
>>
>> Please see especially, this comment from Emannual
>> Lecharmy (31/Mar/09 03:23 PM)
>> <
>> https://issues.apache.org/jira/browse/DIRMINA-678?focusedCommentId=12696711&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12696711
>> <
>> https://issues.apache.org/jira/browse/DIRMINA-678?focusedCommentId=12696711&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12696711
>> >
>> <
>> https://issues.apache.org/jira/browse/DIRMINA-678?focusedCommentId=12696711&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12696711
>> <
>> https://issues.apache.org/jira/browse/DIRMINA-678?focusedCommentId=12696711&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12696711
>> >>>.
>>
>>
>> So, I'm looking for this 'beautiful hack' in the
>> Controller source code for 1.9.11 and am not
>> seeing it.
>> I think maybe I'm blind :) or the hack is in
>> grizzly
>> 2.0? If its in 2.0, do you think I can make it
>> fit into
>> 1.9.11? Or do you advise I try my application on
>> 2.0 trunk?
>>
>>
>> I didn't put the hack there (just for Comet) as I
>> never saw
>> it on the Controller. Can you share a test case?
>>
>>
>> I don't have a test case and am working on one. I would be
>> surprised if I can get it to happen without a lot of work.
>>
>> Is there a code sample from comet that has this hack in it
>> to
>> see if I can make the same change for my app?
>>
>> Thanks
>>
>> -- Jeanfrancois
>>
>>
>>
>> Thanks,
>> John
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> users-unsubscribe_at_grizzly.dev.java.net
>> <mailto:users-unsubscribe_at_grizzly.dev.java.net>
>> <mailto:users-unsubscribe_at_grizzly.dev.java.net
>> <mailto:users-unsubscribe_at_grizzly.dev.java.net>>
>>
>> For additional commands, e-mail:
>> users-help_at_grizzly.dev.java.net
>> <mailto:users-help_at_grizzly.dev.java.net>
>> <mailto:users-help_at_grizzly.dev.java.net
>> <mailto:users-help_at_grizzly.dev.java.net>>
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
>> <mailto:users-unsubscribe_at_grizzly.dev.java.net>
>> For additional commands, e-mail: users-help_at_grizzly.dev.java.net
>> <mailto:users-help_at_grizzly.dev.java.net>
>>
>>
>>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe_at_grizzly.dev.java.net
> For additional commands, e-mail: users-help_at_grizzly.dev.java.net
>
>