dev@glassfish.java.net

Re: Timeout when stop the domain with --force=false after deployed the ejb application

From: Tom Mueller <Tom.Mueller_at_oracle.com>
Date: Fri, 03 May 2013 10:28:54 -0500

Yes, Jeremy has pointed out the difference. What this really means is
that when force=false, we are depending on the GlassFish.stop() method
to cause all non-daemon threads to terminate. What is probably
happening is that some code somewhere is creating a long-running
thread, perhaps in a thread pool, that is not a daemon thread.
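
To illustrate the mechanism, here is a minimal sketch. It is not GlassFish
code; the class name DaemonPoolSketch and the helper daemonFactory are
made up for this example. It shows that a pool built with the default
thread factory uses non-daemon workers, which keep the JVM alive until
shutdown() is called, while a pool built with a daemon thread factory
does not block JVM exit.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from GlassFish.
final class DaemonPoolSketch {

    /** Returns a factory whose threads are named and marked as daemon threads. */
    static ThreadFactory daemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger(1);
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.getAndIncrement());
            t.setDaemon(true); // daemon threads do not keep the JVM alive
            return t;
        };
    }

    public static void main(String[] args) throws Exception {
        // Default factory: the worker is a non-daemon thread. After the task
        // finishes, the worker stays alive waiting for more work.
        ExecutorService plain = Executors.newFixedThreadPool(1);
        plain.submit(() -> System.out.println(
                "default factory, daemon? " + Thread.currentThread().isDaemon())).get();
        // Without this shutdown(), the JVM would not exit when main() returns.
        plain.shutdown();
        plain.awaitTermination(5, TimeUnit.SECONDS);

        // Daemon factory: the idle worker does not prevent JVM exit,
        // so no explicit shutdown() is needed for the process to end.
        ExecutorService daemon = Executors.newFixedThreadPool(1, daemonFactory("sketch"));
        daemon.submit(() -> System.out.println(
                "daemon factory, daemon? " + Thread.currentThread().isDaemon())).get();
    }
}
```

Whether making a given thread a daemon is the right fix depends on the
service that owns it; the alternative is for the owner to shut its pool
down as part of stop().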

In the jstack.txt that Jeremy sent, the non-daemon thread is this one:

"Thread-30" prio=6 tid=0x34584800 nid=0xb28 in Object.wait() [0x33ddf000]
    java.lang.Thread.State: WAITING (on object monitor)
     at java.lang.Object.wait(Native Method)
     at java.lang.Object.wait(Object.java:503)
     at com.sun.corba.ee.impl.javax.rmi.CORBA.KeepAlive.run(Util.java:818)
     - locked <0x12130ba8> (a com.sun.corba.ee.impl.javax.rmi.CORBA.KeepAlive)


If I just run start-domain on a domain with no applications deployed, and
then run stop-domain --force=false, I'm seeing this non-daemon thread there:

"pool-9-thread-1" prio=5 tid=0x00007faf480ba000 nid=0xa203 waiting on
condition [0x00000001405d4000]
    java.lang.Thread.State: WAITING (parking)
         at sun.misc.Unsafe.park(Native Method)
         - parking to wait for <0x000000012dc504c0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:722)

In some cases, this thread does exit before stop-domain times out, but
in others it does not.
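
For anyone who wants to check for threads like these from inside the JVM
rather than reading a full jstack dump, here is a minimal sketch. The
class name NonDaemonThreadReport and its method are hypothetical and not
part of GlassFish. It only relies on Thread.getAllStackTraces(),
Thread.isAlive() and Thread.isDaemon() from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of GlassFish.
final class NonDaemonThreadReport {

    /** Returns the names of all live threads that are not daemon threads. */
    static List<String> liveNonDaemonThreads() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() returns a snapshot keyed by every live thread.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Any thread listed here, other than the one that is about to
        // finish, can keep the JVM running after shutdown was requested.
        for (String name : liveNonDaemonThreads()) {
            System.out.println("non-daemon: " + name);
        }
    }
}
```

Run in a server that has been asked to stop with force=false, a report
like this would be expected to list the same kind of threads as the
traces above.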

I suspect that these are two different problems. The Hello.jar problem
is probably related to some service that the application is using that
is causing this CORBA thread to be created. The "pool-9-thread-1"
problem is something else. I've created the following issue for this
second problem:
https://java.net/jira/browse/GLASSFISH-20463

Jeremy, can you provide more information about what is in Hello.jar so
that we can isolate this problem? If you like, please file a bug on
this and include the details about Hello.jar there.

Thanks.

Tom

On 5/3/13 8:51 AM, 吕宋平 wrote:
> Hi, Hong:
> I found a difference in the logic between --force=true
> and --force=false, as follows:
>
> StopServer.doExecute
>
>     protected final void doExecute(ServiceLocator habitat,
>             ServerEnvironment env, Logger logger, boolean force) {
>         try {
>             logger.info(localStrings.getLocalString("stop.domain.init",
>                     "Server shutdown initiated"));
>             // Don't shutdown GlassFishRuntime, as that can bring the OSGi framework down which is wrong
>             // when we are embedded inside an existing runtime. So, just stop the glassfish instance that
>             // we are supposed to stop. Leave any cleanup to some other code.
>
>             // get the GlassFish object - we have to wait in case startup is still in progress
>             // This is a temporary work-around until HK2 supports waiting for the service to
>             // show up in the ServiceLocator.
>             GlassFish gfKernel = habitat.getService(GlassFish.class);
>             while (gfKernel == null) {
>                 Thread.sleep(1000);
>                 gfKernel = habitat.getService(GlassFish.class);
>             }
>             // gfKernel is absolutely positively for-sure not null.
>             gfKernel.stop();
>         }
>         catch (Throwable t) {
>             // ignore
>         }
>
>         if(force)
>             System.exit(0);
>         else
>             deletePidFile(env);
>     }
>
>
> With the --force=true option, it executes System.exit(0), and the
> server or cluster stops as expected.
>
>
> Thanks
>
>
> Jeremy
>
>
>
>
> 2013/5/3 Hong Zhang <hong.hz.zhang_at_oracle.com>
>
> Hi, Jeremy
> I also see the stop-domain command hang when I use the option
> force=false. The stop-domain command executed successfully when
> I did not specify the force option.
>
> Tom: what's the difference between the --force option being false
> versus true? When would a user set the option to false?
> Should they just always stick with the default "true" value?
>
> Thanks,
>
> - Hong
>
>
> On 5/3/2013 2:41 AM, lvsongping wrote:
>>
>> Hi, Hong, Marina:
>>
>> Cc: Tom, dev:
>>
>> I have found a strange situation: stopping the DAS or an instance
>> with --force=false times out after I have deployed an EJB
>> application. Here are my steps to reproduce:
>>
>> 1). asadmin start-domain
>>
>> 2). asadmin deploy Hello.jar
>>
>> Application deployed with name Hello.
>>
>> Command deploy executed successfully.
>>
>> 3). asadmin stop-domain --force=false
>>
>> Waiting for the domain to stop
>> .......................................
>>
>> Timed out (60 seconds) waiting for the domain to stop.
>>
>> Command stop-domain failed.
>>
>> 4). jstack jvm_pid > jstack.txt (I have attached the jstack file).
>>
>> 5). asadmin start-domain
>>
>> Waiting for domain1 to start .Error starting domain domain1.
>>
>> The server exited prematurely with exit code 1.
>>
>> Before it died, it produced the following output:
>>
>> FATAL ERROR in native method: JDWP No transports initialized,
>> jvmtiError=AGENT_ERROR_TRANSPORT_INIT(197)
>>
>> ERROR: transport error 202: bind failed: Address already in use
>>
>> ERROR: JDWP Transport dt_socket failed to initialize,
>> TRANSPORT_INIT(510)
>>
>> JDWP exit error AGENT_ERROR_TRANSPORT_INIT(197): No transports
>> initialized [../../../src/share/back/debugInit.c:750]
>>
>> Command start-domain failed.
>>
>> After that, the domain can't be started normally. I think something
>> must be locking a file because the EJB application was deployed, but
>> I don't know the exact reason.
>>
>> BTW: <1>. The problem does not occur if we stop the domain
>> or cluster with the default value of --force.
>>
>> <2>. The problem does not occur if we stop the DAS and cluster
>> after deploying only a web application.
>>
>> Thanks
>>
>> -Jeremy
>>
>
>