dev@glassfish.java.net

Re: Server is up, but asadmin restart-domain times out

From: Ming Zhang <ming.zhang_at_oracle.com>
Date: Wed, 30 Jun 2010 10:58:49 -0700

I have verified the fix with the latest continuous build and integrated
RestartDomainTests into QL.
Thanks,
Ming

On 6/29/2010 11:29 PM, Jane Young wrote:
> Hi Ming,
>
> Please commit the fix in QL. Byron has fixed the issue.
>
> Thanks,
> Jane
>
>
> Jane Young wrote:
>> Thanks for fixing QL.
>> Let's wait until the issue is resolved.
>>
>> Thanks,
>> Jane
>>
>>
>> Ming Zhang wrote:
>>> Hi Jane,
>>>
>>> I have transferred the restart-server targets to restartDomainTest
>>> in QL. But if I integrate the test now, QL will fail on it and the
>>> Hudson continuous build will be red. Should I wait until issue
>>> 12420 is resolved? Please let me know.
>>>
>>> Thanks,
>>> Ming
>>>
>>> On 6/29/2010 2:49 PM, Ming Zhang wrote:
>>>> I have filed issue 12420 for the problem related to "asadmin
>>>> restart-domain" command.
>>>>
>>>> The current "restart-server-unix" or "restart-server-windows"
>>>> targets in QL are not tests since they don't report status. They
>>>> were checked in without my review. Next time, please let me know
>>>> when anyone checks in targets to the top level build scripts since
>>>> they affect the whole QL. Meanwhile, I'll try to create a test for
>>>> restart-domain.
>>>>
>>>> Thanks,
>>>> Ming
>>>>
>>>> On 6/29/2010 2:06 PM, Jane Young wrote:
>>>>> Sahoo,
>>>>>
>>>>> Ming is looking at fixing the QL test so that it fails when the
>>>>> restart-domain command fails.
>>>>> If he commits the fix for QL, I will revert the HK2 1.0.26
>>>>> integration, since QL tests will start failing.
>>>>>
>>>>> Thanks,
>>>>> Jane
>>>>>
>>>>>
>>>>> Amy Roh wrote:
>>>>>> I've seen this when running QL as well. Web devtests [1] fail about
>>>>>> 50% of the time because the restart is reported as failed. However,
>>>>>> when I check, the server is actually running.
>>>>>>
>>>>>> startDomainUnix:
>>>>>> [echo] Starting DAS, ENABLE_REPLICATION=false
>>>>>> [exec] Error starting domain: domain1. It didn't start in 600 seconds
>>>>>> [exec] Waiting for the server to start
>>>>>>
>>>>>> [exec] Command start-domain failed.
>>>>>>
>>>>>> [1] http://hudson.sfbay.sun.com/job/webtier-dev-tests-v3
>>>>>>
>>>>>> Sanjeeb Sahoo wrote:
>>>>>>> This is interesting. My QL test failed to detect that the server had
>>>>>>> restarted fine. Given below is the QL output...
>>>>>>>
>>>>>>> restart-server-unix:
>>>>>>> [echo] restarting server
>>>>>>> [exec] Timed out waiting for the server to restart
>>>>>>> [exec] Command restart-domain failed.
>>>>>>> [exec] Result: 1
>>>>>>> [exec] Waiting for the domain to stop .
>>>>>>> [exec] Command stop-domain executed successfully.
>>>>>>>
>>>>>>>
>>>>>>> While it was waiting for the server to restart, I ran a jps and
>>>>>>> found the following Java processes running:
>>>>>>>
>>>>>>> ss141213_at_Sahoo:/space/ss141213/WS/gf/v3$ jps
>>>>>>> 23093 Jps
>>>>>>> 20153 DerbyControl
>>>>>>> 20033 Launcher
>>>>>>> 22489 admin-cli.jar
>>>>>>> 10312 Main
>>>>>>> 22538 ASMain
>>>>>>>
>>>>>>> What surprised me was that the server was actually up. I could
>>>>>>> load the admin console and run admin commands. For some reason,
>>>>>>> restart-domain failed to detect this. I checked the pid file
>>>>>>> in domain1/config/ and it contained the right value. The jstack
>>>>>>> output for admin-cli.jar is shown below:
>>>>>>>
>>>>>>> ss141213_at_Sahoo:/space/ss141213/WS/gf/v3$ jstack 22489
>>>>>>> 2010-06-30 01:19:55
>>>>>>> Full thread dump Java HotSpot(TM) Server VM (14.2-b01 mixed mode):
>>>>>>>
>>>>>>> "Attach Listener" daemon prio=10 tid=0x085d1c00 nid=0x5a57 waiting on condition [0x00000000]
>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>
>>>>>>> "Low Memory Detector" daemon prio=10 tid=0x7fd15c00 nid=0x57e7 runnable [0x00000000]
>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>
>>>>>>> "CompilerThread1" daemon prio=10 tid=0x7fd13800 nid=0x57e6 waiting on condition [0x00000000]
>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>
>>>>>>> "CompilerThread0" daemon prio=10 tid=0x7fd12000 nid=0x57e5 waiting on condition [0x00000000]
>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>
>>>>>>> "Signal Dispatcher" daemon prio=10 tid=0x7fd10800 nid=0x57e4 runnable [0x00000000]
>>>>>>> java.lang.Thread.State: RUNNABLE
>>>>>>>
>>>>>>> "Finalizer" daemon prio=10 tid=0x7fd00800 nid=0x57e3 in Object.wait() [0x7fe96000]
>>>>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>> - waiting on <0x845b4780> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>>>>>>> - locked <0x845b4780> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>>>>>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>>>>>>>
>>>>>>> "Reference Handler" daemon prio=10 tid=0x08343400 nid=0x57e2 in Object.wait() [0x7fee7000]
>>>>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>>>>> at java.lang.Object.wait(Native Method)
>>>>>>> - waiting on <0x845b4808> (a java.lang.ref.Reference$Lock)
>>>>>>> at java.lang.Object.wait(Object.java:485)
>>>>>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>>>>>>> - locked <0x845b4808> (a java.lang.ref.Reference$Lock)
>>>>>>>
>>>>>>> "main" prio=10 tid=0x082c3000 nid=0x57de waiting on condition [0xb6aea000]
>>>>>>> java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>>>> at java.lang.Thread.sleep(Native Method)
>>>>>>> at com.sun.enterprise.admin.cli.LocalServerCommand.waitForRestart(LocalServerCommand.java:307)
>>>>>>> at com.sun.enterprise.admin.cli.RestartDomainCommand.doCommand(RestartDomainCommand.java:87)
>>>>>>> at com.sun.enterprise.admin.cli.StopDomainCommand.executeCommand(StopDomainCommand.java:130)
>>>>>>> at com.sun.enterprise.admin.cli.CLICommand.execute(CLICommand.java:255)
>>>>>>> at com.sun.enterprise.admin.cli.AsadminMain.executeCommand(AsadminMain.java:229)
>>>>>>> at com.sun.enterprise.admin.cli.AsadminMain.main(AsadminMain.java:167)
>>>>>>>
>>>>>>> "VM Thread" prio=10 tid=0x0833f400 nid=0x57e1 runnable
>>>>>>>
>>>>>>> "GC task thread#0 (ParallelGC)" prio=10 tid=0x082ca000 nid=0x57df runnable
>>>>>>>
>>>>>>> "GC task thread#1 (ParallelGC)" prio=10 tid=0x082cb400 nid=0x57e0 runnable
>>>>>>>
>>>>>>> "VM Periodic Task Thread" prio=10 tid=0x7fd17c00 nid=0x57e8 waiting on condition
>>>>>>>
>>>>>>> JNI global references: 1199
>>>>>>>
>>>>>>> Has anyone noticed such behavior?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sahoo
>>>>>>>
>>>>>>> ---------------------------------------------------------------------
>>>>>>>
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe_at_glassfish.dev.java.net
>>>>>>> For additional commands, e-mail: dev-help_at_glassfish.dev.java.net
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>
>
>
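
For readers following the thread: the jstack output above shows the CLI sleeping inside a wait-for-restart loop, and an earlier message notes that the old restart targets did not report status. The sketch below illustrates that general pattern only. It is not the GlassFish LocalServerCommand.waitForRestart code, which this thread does not show. The class name RestartWait, the helper names, the TCP-connect probe, and the timeout values are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of GlassFish or QL.
final class RestartWait {

    /**
     * Polls isUp until it returns true or timeoutMillis has elapsed.
     * The probe is always tried at least once, even with a zero timeout.
     */
    static boolean waitFor(BooleanSupplier isUp, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isUp.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without seeing the server
            }
            Thread.sleep(pollMillis);
        }
    }

    /** Returns true if a TCP connection to host:port succeeds within connectMillis. */
    static boolean portOpen(String host, int port, int connectMillis) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), connectMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length != 2) {
            System.out.println("usage: java RestartWait <host> <port>");
            return;
        }
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        // Assumed values: wait up to 60 s, probing every 500 ms.
        boolean up = waitFor(() -> portOpen(host, port, 1000), 60_000, 500);
        System.out.println(up ? "server is up" : "timed out waiting for the server");
        // A non-zero exit status lets a calling build target report failure.
        System.exit(up ? 0 : 1);
    }
}
```

A build target could run this after issuing the restart, passing the host and admin port of the domain under test, and fail the test on a non-zero exit status. A successful TCP connect is only a rough liveness signal; a real test would more likely run an admin command against the server.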