dev@glassfish.java.net

Re: Server is up, but asadmin restart-domain times out

From: Ming Zhang <ming.zhang_at_oracle.com>
Date: Tue, 29 Jun 2010 18:02:53 -0700

Hi Jane,

I have transferred the restart-server targets to restartDomainTest in
QL. But if I integrate the test now, QL will fail on it and the Hudson
continuous build will be red. Should I wait until issue 12420 is
resolved? Please let me know.
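
For illustration only, here is a minimal sketch of the kind of check a
restart-domain test could make: run the command, then confirm
independently that the server is reachable rather than relying on the
command's exit status alone. This is not the actual restartDomainTest
code, nor how asadmin's waitForRestart works. The class RestartCheck,
its method names, the timeout values, and the use of admin port 4848 on
localhost are all assumptions.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of GlassFish or the QL sources.
final class RestartCheck {

    /** Polls host:port until a TCP connect succeeds or the timeout elapses. */
    public static boolean waitForPort(String host, int port, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        do {
            try (Socket socket = new Socket()) {
                // 1 second connect timeout per attempt (assumed value).
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException notUpYet) {
                Thread.sleep(200); // back off briefly before retrying
            }
        } while (System.currentTimeMillis() < deadline);
        return false;
    }

    /** Runs a command, forwarding its output, and returns its exit code. */
    public static int run(String... command)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Assumes asadmin is on the PATH and the domain is named domain1.
        int exit = run("asadmin", "restart-domain", "domain1");
        // Assumes the admin listener is on the default port 4848.
        boolean up = waitForPort("localhost", 4848, 60_000);
        System.out.println("exit=" + exit + " adminPortUp=" + up);
        // Report status so a build script can fail on either condition.
        if (exit != 0 || !up) {
            System.exit(1);
        }
    }
}
```

Reporting the exit status and the reachability check separately would
also show the case described in this thread, where the command reports
a timeout although the server is in fact up.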

Thanks,
Ming

On 6/29/2010 2:49 PM, Ming Zhang wrote:
> I have filed issue 12420 for the problem related to "asadmin
> restart-domain" command.
>
> The current "restart-server-unix" or "restart-server-windows" targets
> in QL are not tests since they don't report status. They were checked
> in without my review. Next time, please let me know when anyone checks
> in targets to the top level build scripts since they affect the whole
> QL. Meanwhile, I'll try to create a test for restart-domain.
>
> Thanks,
> Ming
>
> On 6/29/2010 2:06 PM, Jane Young wrote:
>> Sahoo,
>>
>> Ming is looking at fixing the QL test so that it fails when the
>> restart-domain command fails.
>> If he commits the fix for QL, I will revert the HK2 1.0.26 integration,
>> since QL tests will start failing.
>>
>> Thanks,
>> Jane
>>
>>
>> Amy Roh wrote:
>>> I've seen this running QL also. Web devtests [1] fail ~50% due to
>>> failing to restart. However, when I check, the server is actually
>>> running.
>>>
>>> startDomainUnix:
>>> [echo] Starting DAS, ENABLE_REPLICATION=false
>>> [exec] Error starting domain: domain1. It didn't start in 600
>>> seconds
>>> [exec] Waiting for the server to start
>>> ......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
>>>
>>> [exec] Command start-domain failed.
>>>
>>> [1] http://hudson.sfbay.sun.com/job/webtier-dev-tests-v3
>>>
>>> Sanjeeb Sahoo wrote:
>>>> This is interesting. My QL test failed to detect that the server had
>>>> restarted fine. Given below is the QL output...
>>>>
>>>> restart-server-unix:
>>>> [echo] restarting server
>>>> [exec] Timed out waiting for the server to restart
>>>> [exec] Command restart-domain failed.
>>>> [exec] Result: 1
>>>> [exec] Waiting for the domain to stop .
>>>> [exec] Command stop-domain executed successfully.
>>>>
>>>>
>>>> While it was waiting for the server to restart, I ran a jps and
>>>> found the following Java processes running:
>>>>
>>>> ss141213_at_Sahoo:/space/ss141213/WS/gf/v3$ jps
>>>> 23093 Jps
>>>> 20153 DerbyControl
>>>> 20033 Launcher
>>>> 22489 admin-cli.jar
>>>> 10312 Main
>>>> 22538 ASMain
>>>>
>>>> What surprised me was that the server was actually up. I could load
>>>> the admin console and run admin commands. For some reason,
>>>> restart-domain failed to detect this. I checked the pid file in
>>>> domain1/config/ and it contained the right value. jstack output
>>>> for admin-cli.jar is shown below:
>>>>
>>>> ss141213_at_Sahoo:/space/ss141213/WS/gf/v3$ jstack 22489
>>>> 2010-06-30 01:19:55
>>>> Full thread dump Java HotSpot(TM) Server VM (14.2-b01 mixed mode):
>>>>
>>>> "Attach Listener" daemon prio=10 tid=0x085d1c00 nid=0x5a57 waiting
>>>> on condition [0x00000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>>
>>>> "Low Memory Detector" daemon prio=10 tid=0x7fd15c00 nid=0x57e7
>>>> runnable [0x00000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>>
>>>> "CompilerThread1" daemon prio=10 tid=0x7fd13800 nid=0x57e6 waiting
>>>> on condition [0x00000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>>
>>>> "CompilerThread0" daemon prio=10 tid=0x7fd12000 nid=0x57e5 waiting
>>>> on condition [0x00000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>>
>>>> "Signal Dispatcher" daemon prio=10 tid=0x7fd10800 nid=0x57e4
>>>> runnable [0x00000000]
>>>> java.lang.Thread.State: RUNNABLE
>>>>
>>>> "Finalizer" daemon prio=10 tid=0x7fd00800 nid=0x57e3 in
>>>> Object.wait() [0x7fe96000]
>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>> at java.lang.Object.wait(Native Method)
>>>> - waiting on <0x845b4780> (a java.lang.ref.ReferenceQueue$Lock)
>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>>>> - locked <0x845b4780> (a java.lang.ref.ReferenceQueue$Lock)
>>>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>>>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>>>>
>>>> "Reference Handler" daemon prio=10 tid=0x08343400 nid=0x57e2 in
>>>> Object.wait() [0x7fee7000]
>>>> java.lang.Thread.State: WAITING (on object monitor)
>>>> at java.lang.Object.wait(Native Method)
>>>> - waiting on <0x845b4808> (a java.lang.ref.Reference$Lock)
>>>> at java.lang.Object.wait(Object.java:485)
>>>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>>>> - locked <0x845b4808> (a java.lang.ref.Reference$Lock)
>>>>
>>>> "main" prio=10 tid=0x082c3000 nid=0x57de waiting on condition
>>>> [0xb6aea000]
>>>> java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>> at java.lang.Thread.sleep(Native Method)
>>>> at com.sun.enterprise.admin.cli.LocalServerCommand.waitForRestart(LocalServerCommand.java:307)
>>>> at com.sun.enterprise.admin.cli.RestartDomainCommand.doCommand(RestartDomainCommand.java:87)
>>>> at com.sun.enterprise.admin.cli.StopDomainCommand.executeCommand(StopDomainCommand.java:130)
>>>> at com.sun.enterprise.admin.cli.CLICommand.execute(CLICommand.java:255)
>>>> at com.sun.enterprise.admin.cli.AsadminMain.executeCommand(AsadminMain.java:229)
>>>> at com.sun.enterprise.admin.cli.AsadminMain.main(AsadminMain.java:167)
>>>>
>>>> "VM Thread" prio=10 tid=0x0833f400 nid=0x57e1 runnable
>>>>
>>>> "GC task thread#0 (ParallelGC)" prio=10 tid=0x082ca000 nid=0x57df
>>>> runnable
>>>>
>>>> "GC task thread#1 (ParallelGC)" prio=10 tid=0x082cb400 nid=0x57e0
>>>> runnable
>>>>
>>>> "VM Periodic Task Thread" prio=10 tid=0x7fd17c00 nid=0x57e8 waiting
>>>> on condition
>>>>
>>>> JNI global references: 1199
>>>>
>>>> Has anyone noticed such behavior?
>>>>
>>>> Thanks,
>>>> Sahoo
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe_at_glassfish.dev.java.net
>>>> For additional commands, e-mail: dev-help_at_glassfish.dev.java.net
>>>>