dev@glassfish.java.net

Re: Server is up, but asadmin restart-domain times out

From: Jane Young <jane.young_at_oracle.com>
Date: Tue, 29 Jun 2010 23:29:03 -0700

Hi Ming,

Please commit the fix in QL. Byron has fixed the issue.

Thanks,
Jane


Jane Young wrote:
> Thanks for fixing QL.
> Let's wait until the issue is resolved.
>
> Thanks,
> Jane
>
>
> Ming Zhang wrote:
>> Hi Jane,
>>
>> I have transferred the restart-server targets to restartDomainTest in
>> QL. But if I integrate the test now, QL will fail on it and the Hudson
>> continuous build will go red. Should I wait until issue 12420 is
>> resolved? Please let me know.
>>
>> Thanks,
>> Ming
>>
>> On 6/29/2010 2:49 PM, Ming Zhang wrote:
>>> I have filed issue 12420 for the problem related to "asadmin
>>> restart-domain" command.
>>>
>>> The current "restart-server-unix" or "restart-server-windows"
>>> targets in QL are not tests since they don't report status. They
>>> were checked in without my review. Next time, please let me know
>>> when anyone checks in targets to the top level build scripts since
>>> they affect the whole QL. Meanwhile, I'll try to create a test for
>>> restart-domain.
>>>
>>> Thanks,
>>> Ming
>>>
>>> On 6/29/2010 2:06 PM, Jane Young wrote:
>>>> Sahoo,
>>>>
>>>> Ming is looking at fixing the QL test so that it fails when the
>>>> restart-domain command fails.
>>>> If he commits that fix for QL, I will revert the HK2 1.0.26
>>>> integration, since QL tests will start failing.
>>>>
>>>> Thanks,
>>>> Jane
>>>>
>>>>
>>>> Amy Roh wrote:
>>>>> I've seen this running QL also. Web devtests [1] fail ~50% due to
>>>>> failing to restart. However, when I check, the server is actually
>>>>> running.
>>>>>
>>>>> startDomainUnix:
>>>>> [echo] Starting DAS, ENABLE_REPLICATION=false
>>>>> [exec] Error starting domain: domain1. It didn't start in 600 seconds
>>>>> [exec] Waiting for the server to start
>>>>> ........................ [long run of progress dots trimmed]
>>>>>
>>>>> [exec] Command start-domain failed.
>>>>>
>>>>> [1] http://hudson.sfbay.sun.com/job/webtier-dev-tests-v3
>>>>>
>>>>> Sanjeeb Sahoo wrote:
>>>>>> This is interesting. My QL test failed to detect that the server had
>>>>>> restarted fine. Given below is the QL output...
>>>>>>
>>>>>> restart-server-unix:
>>>>>> [echo] restarting server
>>>>>> [exec] Timed out waiting for the server to restart
>>>>>> [exec] Command restart-domain failed.
>>>>>> [exec] Result: 1
>>>>>> [exec] Waiting for the domain to stop .
>>>>>> [exec] Command stop-domain executed successfully.
>>>>>>
>>>>>>
>>>>>> While it was waiting for the server to restart, I ran a jps and
>>>>>> found the following Java processes running:
>>>>>>
>>>>>> ss141213_at_Sahoo:/space/ss141213/WS/gf/v3$ jps
>>>>>> 23093 Jps
>>>>>> 20153 DerbyControl
>>>>>> 20033 Launcher
>>>>>> 22489 admin-cli.jar
>>>>>> 10312 Main
>>>>>> 22538 ASMain
>>>>>>
>>>>>> What surprised me was that the server was actually up. I could
>>>>>> load admin console and run admin commands. For some reason,
>>>>>> restart-domain failed to detect the same. I checked the pid file
>>>>>> in domain1/config/ and that contained the right value. jstack
>>>>>> output for admin-cli.jar is shown below:
>>>>>>
>>>>>> ss141213_at_Sahoo:/space/ss141213/WS/gf/v3$ jstack 22489
>>>>>> 2010-06-30 01:19:55
>>>>>> Full thread dump Java HotSpot(TM) Server VM (14.2-b01 mixed mode):
>>>>>>
>>>>>> "Attach Listener" daemon prio=10 tid=0x085d1c00 nid=0x5a57 waiting on condition [0x00000000]
>>>>>>     java.lang.Thread.State: RUNNABLE
>>>>>>
>>>>>> "Low Memory Detector" daemon prio=10 tid=0x7fd15c00 nid=0x57e7 runnable [0x00000000]
>>>>>>     java.lang.Thread.State: RUNNABLE
>>>>>>
>>>>>> "CompilerThread1" daemon prio=10 tid=0x7fd13800 nid=0x57e6 waiting on condition [0x00000000]
>>>>>>     java.lang.Thread.State: RUNNABLE
>>>>>>
>>>>>> "CompilerThread0" daemon prio=10 tid=0x7fd12000 nid=0x57e5 waiting on condition [0x00000000]
>>>>>>     java.lang.Thread.State: RUNNABLE
>>>>>>
>>>>>> "Signal Dispatcher" daemon prio=10 tid=0x7fd10800 nid=0x57e4 runnable [0x00000000]
>>>>>>     java.lang.Thread.State: RUNNABLE
>>>>>>
>>>>>> "Finalizer" daemon prio=10 tid=0x7fd00800 nid=0x57e3 in Object.wait() [0x7fe96000]
>>>>>>     java.lang.Thread.State: WAITING (on object monitor)
>>>>>>     at java.lang.Object.wait(Native Method)
>>>>>>     - waiting on <0x845b4780> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>>>>>>     - locked <0x845b4780> (a java.lang.ref.ReferenceQueue$Lock)
>>>>>>     at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>>>>>>     at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>>>>>>
>>>>>> "Reference Handler" daemon prio=10 tid=0x08343400 nid=0x57e2 in Object.wait() [0x7fee7000]
>>>>>>     java.lang.Thread.State: WAITING (on object monitor)
>>>>>>     at java.lang.Object.wait(Native Method)
>>>>>>     - waiting on <0x845b4808> (a java.lang.ref.Reference$Lock)
>>>>>>     at java.lang.Object.wait(Object.java:485)
>>>>>>     at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>>>>>>     - locked <0x845b4808> (a java.lang.ref.Reference$Lock)
>>>>>>
>>>>>> "main" prio=10 tid=0x082c3000 nid=0x57de waiting on condition [0xb6aea000]
>>>>>>     java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>>>     at java.lang.Thread.sleep(Native Method)
>>>>>>     at com.sun.enterprise.admin.cli.LocalServerCommand.waitForRestart(LocalServerCommand.java:307)
>>>>>>     at com.sun.enterprise.admin.cli.RestartDomainCommand.doCommand(RestartDomainCommand.java:87)
>>>>>>     at com.sun.enterprise.admin.cli.StopDomainCommand.executeCommand(StopDomainCommand.java:130)
>>>>>>     at com.sun.enterprise.admin.cli.CLICommand.execute(CLICommand.java:255)
>>>>>>     at com.sun.enterprise.admin.cli.AsadminMain.executeCommand(AsadminMain.java:229)
>>>>>>     at com.sun.enterprise.admin.cli.AsadminMain.main(AsadminMain.java:167)
>>>>>>
>>>>>> "VM Thread" prio=10 tid=0x0833f400 nid=0x57e1 runnable
>>>>>>
>>>>>> "GC task thread#0 (ParallelGC)" prio=10 tid=0x082ca000 nid=0x57df runnable
>>>>>>
>>>>>> "GC task thread#1 (ParallelGC)" prio=10 tid=0x082cb400 nid=0x57e0 runnable
>>>>>>
>>>>>> "VM Periodic Task Thread" prio=10 tid=0x7fd17c00 nid=0x57e8 waiting on condition
>>>>>>
>>>>>> JNI global references: 1199
>>>>>>
>>>>>> Has anyone noticed such behavior?
>>>>>>
>>>>>> Thanks,
>>>>>> Sahoo
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>>
>>>>>> To unsubscribe, e-mail: dev-unsubscribe_at_glassfish.dev.java.net
>>>>>> For additional commands, e-mail: dev-help_at_glassfish.dev.java.net
>>>>>>
>
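
Editorial sketch, not part of the original thread: the jstack output above shows the asadmin main thread sleeping inside waitForRestart while the server was in fact reachable. One way to confirm independently that the server is up is to poll its admin port until a deadline. The code below is only an illustration of that idea under stated assumptions; it is not the logic asadmin uses. The class and method names (AdminPortProbe, isListening, waitUntilListening) are hypothetical, and port 4848 is assumed to be the admin port of a default domain1.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not GlassFish code: checks whether a TCP port accepts connections.
class AdminPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within connectTimeoutMs. */
    static boolean isListening(String host, int port, int connectTimeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable: treat as "not up".
            return false;
        }
    }

    /** Polls the port every pollIntervalMs until it accepts a connection or timeoutMs elapses. */
    static boolean waitUntilListening(String host, int port, long timeoutMs, long pollIntervalMs)
            throws InterruptedException {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMs * 1_000_000L;
        while (true) {
            if (isListening(host, port, 1000)) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            Thread.sleep(pollIntervalMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed defaults: local server, admin port 4848, 5 second overall timeout.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 4848;
        boolean up = waitUntilListening(host, port, 5_000, 500);
        System.out.println(host + ":" + port + (up ? " is accepting connections" : " is not reachable"));
    }
}
```

A build target that needs to report status, as discussed in the thread, could map the boolean result to a pass or fail outcome instead of only printing it. A successful TCP connect shows that something is listening on the port, not that the domain has finished starting, so this is a coarse check at best.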