dev@glassfish.java.net

Re: deployment devtests hang

From: Byron Nevins <byron.nevins_at_oracle.com>
Date: Fri, 19 Nov 2010 23:20:45 -0800

Could Murphy's Law be involved?


On 11/19/2010 7:42 PM, Bill Shannon wrote:
> With some debugging tips from Tim, I was able to track down the root
> cause of this problem. Some of you might find this amusing...
>
> I was having trouble getting the tests to run with my changes, so I had
> switched back to using an unchanged build of GlassFish. I have a cron
> job that does an "svn update" and build every night, so I just used that
> build.
>
> Well, it turns out my svn update was failing for some reason, perhaps
> because the certificate had expired:
>
> + svn update
> Error validating server certificate for 'https://svn.dev.java.net:443':
> - The certificate is not issued by a trusted authority. Use the
> fingerprint to validate the certificate manually!
> Certificate information:
> - Hostname: *.dev.java.net
> - Valid: from Sep 13 09:52:17 2010 GMT until Nov 15 14:38:09 2011 GMT
> - Issuer: Equifax Secure Certificate Authority, Equifax, US
> - Fingerprint:
> a3:69:1e:c2:66:f7:5b:3f:b5:63:da:5e:9e:09:cf:70:dd:51:b8:7b
> (R)eject, accept (t)emporarily or accept (p)ermanently? svn: PROPFIND
> request fa
> iled on '/svn/glassfish-svn/trunk/v3'
> svn: PROPFIND of '/svn/glassfish-svn/trunk/v3': Server certificate
> verification
> failed: issuer is not trusted (https://svn.dev.java.net)
>
> Ok, so my build was a few days old. No big deal, right?
>
> The build I was using included this change to
> NetUtils.getCanonicalHostName
> from Byron:
>
> r41933 | bnevins | 2010-10-21 11:32:18 -0700 (Thu, 21 Oct 2010) | 4 lines
>
> Made the obnoxiously long wait for getCanonicalHostname 3 seconds. If
> your DNS
> is munged up - it would take ~~30 seconds.
> If it takes more than 3 seconds - then it is not attempted in the
> future...
>
>
> That change caused getCanonicalHostName to use an ExecutorService, which
> created threads that weren't daemon threads, and thus prevented app
> clients
> and standalone clients from exiting when they were done.
>
> Argh!
>
> Fortunately, Byron came to his senses and removed this code in this
> commit
> a day later:
>
> r41991 | bnevins | 2010-10-22 12:38:14 -0700 (Fri, 22 Oct 2010) | 5 lines
>
> Totally weird. A thread was left running with yesterday's changes.
> Changed the technique. If you want to skip reverse-dns lookup then
> set this env
> . variable to true
> AS_NO_REVERSE_DNS
>
>
> Lucky me, my svn update started failing right in that window between
> those
> two changes! (Why? I don't know. The cert problem shouldn't have
> started
> then.)
>
> Argh! Argh! Argh!
>
>
>
> Bill Shannon wrote on 11/18/2010 02:22 PM:
>> I'm running the deployment devtests on my OpenSolaris x86 machine
>> using an unmodified 3.1 build from last night. I've tried several
>> build, they all hang.
>>
>> The tests hang running the statelesshello test. The ant output stops
>> here:
>>
>> runclient:
>> [javac]
>> /home/shannon/appserv-tests/devtests/deployment/ejb/statelesshello/build.xml:140:
>> warning:
>> 'includeantruntime' was not set, defaulting to
>> build.sysclasspath=last; set to
>> false for repeatable builds
>> [javac] Compiling 1 source file to
>> /home/shannon/appserv-tests/devtests/deployment/build/ejb/statelesshello
>> [javaWithResult] Test expects successful result
>> [javaWithResult] Nov 18, 2010 11:26:34 AM
>> com.sun.logging.LogDomains$1 log
>> [javaWithResult] INFO: Cannot find javadb client jar file, derby jdbc
>> driver
>> will not be available by default.
>> [javaWithResult] Nov 18, 2010 11:26:34 AM
>> com.sun.logging.LogDomains$1 log
>> [javaWithResult] INFO: enterprise_used_delegate_name
>> [javaWithResult] Looked up home!!
>> [javaWithResult] Narrowed home!!
>> [javaWithResult] Got the EJB!!
>> [javaWithResult] Statelesshello EJB - checked environment properties
>> for user scott
>> [javaWithResult] and for number which is = 22.25
>> [javaWithResult]
>> [javaWithResult] Client's sayStatelesshello() method succeeded
>> [javaWithResult]
>> [javaWithResult] Client now getting a User Defined Exception
>> [javaWithResult] Success! Caught StatelesshelloException
>> [javaWithResult] Client's getUserDefinedException() method succeeded
>> [javaWithResult]
>> [javaWithResult] Client is now trying to remove the session bean
>> [javaWithResult]
>> [javaWithResult] Session bean was successfully removed by the client.
>> [javaWithResult]
>>
>> The server.log file ends with this:
>>
>> [#|2010-11-18T11:26:38.758-0800|INFO|glassfish3.1|null|_ThreadID=16;_ThreadName=Thread-1;|Now
>>
>> going forward with reading the environment properties
>> |#]
>>
>> [#|2010-11-18T11:26:38.766-0800|INFO|glassfish3.1|null|_ThreadID=16;_ThreadName=Thread-1;|Statelesshello
>>
>> EJB - checked environment properties for user scott
>> and for number which is = 22.25
>> |#]
>>
>> [#|2010-11-18T11:26:38.778-0800|INFO|glassfish3.1|null|_ThreadID=16;_ThreadName=Thread-1;|Throwing
>>
>> a User Defined Exception|#]
>>
>>
>> The client seems to be hung. jstack shows this:
>>
>> 2010-11-18 14:09:48
>> Full thread dump Java HotSpot(TM) Client VM (17.0-b16 mixed mode):
>>
>> "Attach Listener" daemon prio=3 tid=0x08299000 nid=0x11 waiting on
>> condition
>> [0x00000000]
>> java.lang.Thread.State: RUNNABLE
>>
>> "DestroyJavaVM" prio=3 tid=0x0807d000 nid=0x2 waiting on condition
>> [0x00000000]
>> java.lang.Thread.State: RUNNABLE
>>
>> "SelectorThread" daemon prio=3 tid=0x08682000 nid=0xc runnable
>> [0xe6c23000]
>> java.lang.Thread.State: RUNNABLE
>> at sun.nio.ch.DevPollArrayWrapper.poll0(Native Method)
>> at sun.nio.ch.DevPollArrayWrapper.poll(DevPollArrayWrapper.java:170)
>> at sun.nio.ch.DevPollSelectorImpl.doSelect(DevPollSelectorImpl.java:68)
>> at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>> - locked <0xeed68338> (a sun.nio.ch.Util$1)
>> - locked <0xeed68348> (a java.util.Collections$UnmodifiableSet)
>> - locked <0xeed682f0> (a sun.nio.ch.DevPollSelectorImpl)
>> at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>> at
>> com.sun.corba.ee.impl.transport.SelectorImpl.run(SelectorImpl.java:307)
>>
>> "transaction-manager" daemon prio=3 tid=0x09b56800 nid=0xb in
>> Object.wait()
>> [0xe6ef9000]
>> java.lang.Thread.State: WAITING (on object monitor)
>> at java.lang.Object.wait(Native Method)
>> - waiting on <0xee1845f8> (a java.util.TaskQueue)
>> at java.lang.Object.wait(Object.java:485)
>> at java.util.TimerThread.mainLoop(Timer.java:483)
>> - locked <0xee1845f8> (a java.util.TaskQueue)
>> at java.util.TimerThread.run(Timer.java:462)
>>
>> "pool-1-thread-1" prio=3 tid=0x09b30000 nid=0xa waiting on condition
>> [0xe6f4e000]
>> java.lang.Thread.State: WAITING (parking)
>> at sun.misc.Unsafe.park(Native Method)
>> - parking to wait for <0xee184670> (a
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>> at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
>> at
>> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
>>
>>
>> at
>> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
>> at
>> java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
>> at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
>> at java.lang.Thread.run(Thread.java:619)
>>
>> "Low Memory Detector" daemon prio=3 tid=0x08115400 nid=0x8 runnable
>> [0x00000000]
>> java.lang.Thread.State: RUNNABLE
>>
>> "CompilerThread0" daemon prio=3 tid=0x08112000 nid=0x7 waiting on
>> condition
>> [0x00000000]
>> java.lang.Thread.State: RUNNABLE
>>
>> "Signal Dispatcher" daemon prio=3 tid=0x08110c00 nid=0x6 runnable
>> [0x00000000]
>> java.lang.Thread.State: RUNNABLE
>>
>> "Finalizer" daemon prio=3 tid=0x080fbc00 nid=0x5 in Object.wait()
>> [0xfc262000]
>> java.lang.Thread.State: WAITING (on object monitor)
>> at java.lang.Object.wait(Native Method)
>> - waiting on <0xed7802b8> (a java.lang.ref.ReferenceQueue$Lock)
>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>> - locked <0xed7802b8> (a java.lang.ref.ReferenceQueue$Lock)
>> at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>> at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>>
>> "Reference Handler" daemon prio=3 tid=0x080fa400 nid=0x4 in
>> Object.wait()
>> [0xfe46e000]
>> java.lang.Thread.State: WAITING (on object monitor)
>> at java.lang.Object.wait(Native Method)
>> - waiting on <0xed780340> (a java.lang.ref.Reference$Lock)
>> at java.lang.Object.wait(Object.java:485)
>> at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>> - locked <0xed780340> (a java.lang.ref.Reference$Lock)
>>
>> "VM Thread" prio=3 tid=0x080f7c00 nid=0x3 runnable
>>
>> "VM Periodic Task Thread" prio=3 tid=0x08117c00 nid=0x9 waiting on
>> condition
>>
>> JNI global references: 1535
>>
>>
>> None of the threads in the client are the "main" thread, nor is it
>> obvious that any of the threads are waiting for something from the
>> server.
>>
>> The server is still running and responds to other requests.
>>
>> These same tests work fine on my MacOS machine.
>>
>> Any idea what's going on here?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe_at_glassfish.dev.java.net
> For additional commands, e-mail: dev-help_at_glassfish.dev.java.net
>

-- 
Byron Nevins  -  Oracle Corporation
Home: 650-359-1290
Cell: 650-784-4123
Sierra: 209-295-2188