dev@glassfish.java.net

Re: repeated hudson problems with undeploy/shutdown

From: Jennifer Chou <jennifer.chou_at_oracle.com>
Date: Tue, 15 Jun 2010 13:40:34 +0100

I am going to check this in now for the Config config API. This should fix
the problem. I was seeing an NPE, I think in
FlashlightProbeProviderFactory, when monitoring-service is gone.
    @Element(required=true)
    @NotNull
    MonitoringService getMonitoringService();
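
For illustration only, here is a minimal plain-Java sketch of the failure mode, assuming the getter returns null when the element is missing from domain.xml. It uses no GlassFish or HK2 classes; the Config and MonitoringService interfaces, getLevel(), and both helper methods are hypothetical stand-ins, not the real config beans.

```java
import java.util.Optional;

public class MissingElementSketch {

    // Hypothetical stand-in for the monitoring-service config bean.
    interface MonitoringService {
        String getLevel();
    }

    // Hypothetical stand-in for the Config bean.
    interface Config {
        // Assumption: returns null when <monitoring-service> is absent.
        MonitoringService getMonitoringService();
    }

    // Unguarded read: throws NullPointerException when the element is missing.
    static String levelUnguarded(Config config) {
        return config.getMonitoringService().getLevel();
    }

    // Guarded read: falls back to a default value instead of failing.
    static String levelGuarded(Config config, String fallback) {
        return Optional.ofNullable(config.getMonitoringService())
                .map(MonitoringService::getLevel)
                .orElse(fallback);
    }

    public static void main(String[] args) {
        Config missing = () -> null;          // element removed from the file
        Config present = () -> () -> "HIGH";  // element present

        System.out.println(levelGuarded(missing, "OFF"));
        System.out.println(levelGuarded(present, "OFF"));

        try {
            levelUnguarded(missing);
        } catch (NullPointerException e) {
            System.out.println("NPE when the element is missing");
        }
    }
}
```

A guard like levelGuarded only protects one caller. Marking the element as required, as in the annotations above, is meant to address the problem at the config level so that individual callers do not each need such a check.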

On 6/15/2010 1:37 PM, Hong Zhang wrote:
> This seems to be another case where we need to annotate the empty
> element with the @NotNull annotation?
>
> On 6/15/2010 7:39 AM, Jane Young wrote:
>> Looks like if the following elements are removed from domain.xml,
>> domain will not start:
>>
>> <config name="server-config">
>>   ....
>>   <monitoring-service>
>>     <module-monitoring-levels />
>>   </monitoring-service>
>>   ....
>> </config>
>>
>> SmokeTests/CTS tests are failing due to this issue. This issue
>> started last Friday, June 11 evening in one of the following commits:
>> http://svnsearch.org/svnsearch/repos/GLASSFISH/search?from=37712&to=37715
>>
>> I downloaded the bundles from
>> http://hudson.glassfish.org/job/gf-trunk-build-continuous/4483/ and
>> SmokeTests passed with this build, while they failed with this build:
>> http://hudson.glassfish.org/job/gf-trunk-build-continuous/4485/.
>>
>> Instructions to run smoketests:
>> http://wiki.glassfish.java.net/Wiki.jsp?page=V3ReleaseProcess (Step 8)
>>
>> Filed issue: https://glassfish.dev.java.net/issues/show_bug.cgi?id=12252
>> Can't promote this week's 3.1 build due to this issue. Also, nightly
>> distros are not published either.
>>
>> Jerome, can you please take a look?
>>
>> Thanks,
>> Jane
>>
>>
>> Jerome Dochez wrote:
>>> On 6/14/10 8:49 AM, Hong Zhang wrote:
>>>> When I was looking into the NPE (which is related to the
>>>> <system-applications> element), I noticed the create-cluster
>>>> command somehow removes the <system-applications> element from the
>>>> domain.xml.
>>>>
>>>> I installed a freshly built server, and the <system-applications>
>>>> element was there as expected. Then I set the environment variable
>>>> and started the domain, and created a cluster:
>>>> export ENABLE_REPLICATION=true
>>>> asadmin start-domain
>>>> asadmin create-cluster cluster1
>>>>
>>>> I noticed after the last command, the <system-applications> element
>>>> was gone from the domain.xml which caused the NPE in server shutdown.
>>> Right, this is expected: if the system-applications list is empty, it
>>> will not be written out.
>>>
>>> But didn't you add a @NotNull that should fix the problem?
>>>
>>>>
>>>> I can add a check in the ApplicationLoaderService to avoid the NPE,
>>>> but someone should take a look at this, as the <system-applications>
>>>> element needs to be there in the domain.xml.
>>>>
>>>> Thanks,
>>>>
>>>> - Hong
>>>>
>>>> On 6/14/2010 11:15 AM, Hong Zhang wrote:
>>>>> Hi, Justin
>>>>> I have had trouble starting the server since Friday night. I don't
>>>>> think it's related to the NPE you saw in the server.log as part of
>>>>> the server shutdown. That part of the code has not changed for a
>>>>> while, but I will fix the NPE.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> - Hong
>>>>>
>>>>>
>>>>> On 6/14/2010 11:01 AM, Justin Lee wrote:
>>>>>> I'm seeing repeated failures with this:
>>>>>>
>>>>>> startDomainUnix:
>>>>>> [echo] Starting DAS, ENABLE_REPLICATION=${ENABLE_REPLICATION}
>>>>>> [exec] Waiting for the server to start ........................
>>>>>> [exec] Command start-domain failed.
>>>>>> [exec] Error starting domain: domain1. It didn't start in 600 seconds
>>>>>>
>>>>>>
>>>>>>
>>>>>> Looking in the logs, I see this exception:
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.477-0400|INFO|glassfish3.1|javax.enterprise.system.tools.admin.com.sun.enterprise.v3.admin|_ThreadID=92;_ThreadName=Thread-1;|Server shutdown initiated|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.509-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;|java.lang.NullPointerException|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.508-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at com.sun.enterprise.v3.server.ApplicationLoaderService.preDestroy(ApplicationLoaderService.java:404)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.508-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at com.sun.hk2.component.AbstractWombInhabitantImpl.dispose(AbstractWombInhabitantImpl.java:74)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.508-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at com.sun.hk2.component.SingletonInhabitant.release(SingletonInhabitant.java:66)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.508-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at com.sun.hk2.component.LazyInhabitant.release(LazyInhabitant.java:112)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.508-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at com.sun.enterprise.v3.server.AppServerStartup.stop(AppServerStartup.java:393)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.509-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at org.jvnet.hk2.osgiadapter.HK2Main$StartupContextService.updated(HK2Main.java:91)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.509-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at org.apache.felix.cm.impl.ConfigurationManager$DeleteConfiguration.run(ConfigurationManager.java:1582)|#]
>>>>>>
>>>>>> [#|2010-06-12T16:28:58.510-0400|SEVERE|glassfish3.1|null|_ThreadID=11;_ThreadName=Thread-1;| at org.apache.felix.cm.impl.UpdateThread.run(UpdateThread.java:88)|#]
>>>>>>
>>>>>>
>>>>>> At least 3 different jobs are failing right after the
>>>>>> jsp-caching-instance-level test. Anyone have any ideas about
>>>>>> that? The failing jobs are Grizzly_Integration,
>>>>>> webtier-dev-tests-v3, and webtier-dev-tests-v3-source. Others
>>>>>> might be as well, but I know for sure those are.
>>>>>>
>>>
>>