dev@glassfish.java.net

Re: Another weird admin (?) problem

From: Scott Fordin <scott.fordin_at_oracle.com>
Date: Tue, 25 Jan 2011 11:49:49 -0500

Please be sure to let me know if there's a workaround you want me to add to the Release Notes.

Thanks,

Scott

"Tom Mueller" <tom.mueller_at_oracle.com> wrote:

>Ken,
>
>I'm not quite sure what code is executing where in your description
>below. However, since handleSignal is for a GMS event, I assume that
>it is being called in one instance (say in1) when another instance is
>started (say testInstance1). And the problem is that the code running
>in in1 is not able to get the port value that testInstance1 is using.
>
>The reason for this is that there is a bug in the code that replicates
>the system property information for instances. As long as no
>--systemproperties option is provided on the create-instance or
>create-local-instance command, the replication works fine. However,
>if there is a --systemproperties option, then only that system property
>is replicated to the other instances. A restart of the instance that
>does a resync will bring over the rest of the properties.
>
>I created issue 15683 for this:
>http://java.net/jira/browse/GLASSFISH-15683
>
>As with the other problem with system properties, the work-around is
>again to not use the --systemproperties option on the create-instance
>command.
>
>Is 15683 a stopper for 3.1?
>
>Tom
>
>
>On 1/24/2011 5:29 PM, Ken wrote:
>> This is the same setup as in issue 15665.
>> I have a 5 instance cluster (instances in0-in4, IIOP listener ports
>> 9037-13037 at intervals of 1000).
>> I have shut down the cluster, then restarted in1, in2, and in4.
>> Then I am creating testInstance0 (IIOP listener should be 20037) and
>> testInstance1 (IIOP listener 21037).
>>
>> I'm using create-instance to create a new instance in a running cluster:
>>
>> Command: create-instance --node apolloNA --systemproperties
>> instance_name=testInstance1 --cluster c1 --portbase 21000
>> --checkports=true testInstance1
>>
>> which results in:
>>
>> Using DAS host minas and port 4848 from existing das.properties for node
>> apolloNA. To use a different DAS, create a new node using create-node-ssh or
>> create-node-config. Create the instance with the new node and correct
>> host and port:
>> asadmin --host das_host --port das_port create-local-instance
>> --node node_name instance_name.
>> Command _create-instance-filesystem executed successfully.
>> Port Assignments for server instance testInstance1:
>> JMX_SYSTEM_CONNECTOR_PORT=21086
>> JMS_PROVIDER_PORT=21076
>> HTTP_LISTENER_PORT=21080
>> ASADMIN_LISTENER_PORT=21048
>> JAVA_DEBUGGER_PORT=21009
>> IIOP_SSL_LISTENER_PORT=21038
>> IIOP_LISTENER_PORT=21037
>> OSGI_SHELL_TELNET_PORT=21066
>> HTTP_SSL_LISTENER_PORT=21081
>> IIOP_SSL_MUTUALAUTH_PORT=21039
>> The instance, testInstance1, was created on host apollo
>> WARNING: Instance in0 seems to be offline; command
>> _register-instance-at-instance was not replicated to that instance
>> WARNING: Instance in3 seems to be offline; command
>> _register-instance-at-instance was not replicated to that instance
>> Command create-instance executed successfully.
>>
>> The domain.xml contents after the create-instance command completes
>> look fine:
>>
>> <servers>
>> <server name="server" config-ref="server-config">
>> <resource-ref ref="jdbc/__TimerPool"></resource-ref>
>> <resource-ref ref="jdbc/__default"></resource-ref>
>> </server>
>>
>> (instances similar to testInstance0 omitted here)
>>
>> <server name="testInstance0" node-ref="apolloNA" config-ref="c1-config">
>> <system-property name="instance_name"
>> value="testInstance0"></system-property>
>> <system-property name="ASADMIN_LISTENER_PORT"
>> value="20048"></system-property>
>> <system-property name="HTTP_LISTENER_PORT"
>> value="20080"></system-property>
>> <system-property name="HTTP_SSL_LISTENER_PORT"
>> value="20081"></system-property>
>> <system-property name="IIOP_LISTENER_PORT"
>> value="20037"></system-property>
>> <system-property name="IIOP_SSL_MUTUALAUTH_PORT"
>> value="20039"></system-property>
>> <system-property name="IIOP_SSL_LISTENER_PORT"
>> value="20038"></system-property>
>> <system-property name="JMS_PROVIDER_PORT"
>> value="20076"></system-property>
>> <system-property name="JMX_SYSTEM_CONNECTOR_PORT"
>> value="20086"></system-property>
>> <system-property name="OSGI_SHELL_TELNET_PORT"
>> value="20066"></system-property>
>> <system-property name="JAVA_DEBUGGER_PORT"
>> value="20009"></system-property>
>> <application-ref ref="TestEJB"
>> virtual-servers="server"></application-ref>
>> </server>
>>
>> which is very similar to all of the other server entries for the
>> existing in0-in4 instances.
>> The other configs contain related elements (I only care about
>> orb-listener-1 for FOLB):
>>
>> server-config:
>>
>> <iiop-service>
>> <orb use-thread-pool-ids="thread-pool-1"></orb>
>> <iiop-listener port="3700" id="orb-listener-1" address="0.0.0.0"
>> lazy-init="true"></iiop-listener>
>> <iiop-listener port="3820" id="SSL" address="0.0.0.0"
>> security-enabled="true">
>> <ssl classname="com.sun.enterprise.security.ssl.GlassfishSSLImpl"
>> cert-nickname="s1as"></ssl>
>> </iiop-listener>
>> <iiop-listener port="3920" id="SSL_MUTUALAUTH" address="0.0.0.0"
>> security-enabled="true">
>> <ssl classname="com.sun.enterprise.security.ssl.GlassfishSSLImpl"
>> cert-nickname="s1as" client-auth-enabled="true"></ssl>
>> </iiop-listener>
>> </iiop-service>
>>
>> default-config:
>>
>> <iiop-service>
>> <orb use-thread-pool-ids="thread-pool-1"></orb>
>> <iiop-listener port="${IIOP_LISTENER_PORT}" id="orb-listener-1"
>> address="0.0.0.0"></iiop-listener>
>> <iiop-listener port="${IIOP_SSL_LISTENER_PORT}" id="SSL"
>> address="0.0.0.0" security-enabled="true">
>> <ssl classname="com.sun.enterprise.security.ssl.GlassfishSSLImpl"
>> cert-nickname="s1as"></ssl>
>> </iiop-listener>
>> <iiop-listener port="${IIOP_SSL_MUTUALAUTH_PORT}" id="SSL_MUTUALAUTH"
>> address="0.0.0.0" security-enabled="true">
>> <ssl classname="com.sun.enterprise.security.ssl.GlassfishSSLImpl"
>> cert-nickname="s1as" client-auth-enabled="true"></ssl>
>> </iiop-listener>
>> </iiop-service>
>>
>> <system-property name="IIOP_LISTENER_PORT"
>> value="23700"></system-property>
>>
>> c1-config:
>>
>> <iiop-service>
>> <orb use-thread-pool-ids="thread-pool-1"></orb>
>> <iiop-listener id="orb-listener-1" port="${IIOP_LISTENER_PORT}"
>> address="0.0.0.0"></iiop-listener>
>> <iiop-listener id="SSL" port="${IIOP_SSL_LISTENER_PORT}"
>> address="0.0.0.0" security-enabled="true">
>> <ssl classname="com.sun.enterprise.security.ssl.GlassfishSSLImpl"
>> cert-nickname="s1as"></ssl>
>> </iiop-listener>
>> <iiop-listener id="SSL_MUTUALAUTH" port="${IIOP_SSL_MUTUALAUTH_PORT}"
>> address="0.0.0.0" security-enabled="true">
>> <ssl classname="com.sun.enterprise.security.ssl.GlassfishSSLImpl"
>> cert-nickname="s1as" client-auth-enabled="true"></ssl>
>> </iiop-listener>
>> </iiop-service>
>>
>>
>> <system-property name="IIOP_LISTENER_PORT"
>> value="23700"></system-property>
>>
>> My problem is in PropertyResolver.getPropertyValue.
>> It checks the server config (I think that's the one that should have
>> the correct value), and seems to find the IIOP_LISTENER_PORT property
>> in the props with a null value.
>> (By the way, we really need the config beans to have a generated
>> useful toString() method. I can't easily log or debug this code,
>> because I can't tell WHICH bean I have at hand until I start pulling
>> things out of it).
>>
>> The cluster props are empty.
>>
>> The config props return the 23700 value. This happens for both
>> testInstance0 and testInstance1.
>> I can tell from the request distribution that proper load balancing is
>> happening, but all calls to testInstance0 and testInstance1 fail over
>> to in2, because the port value is wrong.
>>
>> This apparently works fine if an existing instance is restarted. The
>> failure only occurs the first time I try to read the config after the
>> instance has been created.
>> I read the config from 3 other instances (in2, in1, and in4).
>>
>> The code path on my side is in
>> orb/orb-iiop/src/main/java/org/glassfish/enterprise/iiop/impl/IiopFolbGmsClient.
>> The GMS event is handled in handleSignal, which calls addMember.
>> addMember calls getClusterInstanceInfo(String),
>> which proceeds -> getClusterInstanceInfo(Server, Config, boolean) ->
>> resolvePort -> PropertyResolver.getPropertyValue.
>> While there certainly could be an error in my code, I'm wondering if
>> there is an admin problem here, because the same code works fine in
>> all other cases of nodes stopping and starting.
>>
>> I've also attached the complete domain.xml from my test setup. The
>> DAS is minas, and all instances run on apollo.
>>
>> Thanks,
>>
>> Ken.
>>
>>
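
For anyone following the thread, the lookup order Ken describes (server-level properties first, then cluster, then config) and the effect of the replication bug Tom describes can be sketched roughly as below. This is only an illustration under stated assumptions, not the actual PropertyResolver code: the class name, the resolve method, and the use of plain maps in place of config beans are all hypothetical. The property names and port values are taken from the messages above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the lookup described in the thread; the real
// PropertyResolver works on config beans, not on plain maps.
class PropertyLookupSketch {

    // Returns the first non-null value, checking server, then cluster, then config.
    static String resolve(String name,
                          Map<String, String> serverProps,
                          Map<String, String> clusterProps,
                          Map<String, String> configProps) {
        String value = serverProps.get(name);
        if (value == null) {
            value = clusterProps.get(name);
        }
        if (value == null) {
            value = configProps.get(name);
        }
        return value;
    }

    public static void main(String[] args) {
        // c1-config carries the default reported in the thread.
        Map<String, String> configProps = new HashMap<>();
        configProps.put("IIOP_LISTENER_PORT", "23700");

        // The cluster props were reported as empty.
        Map<String, String> clusterProps = new HashMap<>();

        // View from in1 right after create-instance with --systemproperties:
        // only the explicitly passed property was replicated.
        Map<String, String> serverProps = new HashMap<>();
        serverProps.put("instance_name", "testInstance1");
        System.out.println("before resync: "
                + resolve("IIOP_LISTENER_PORT", serverProps, clusterProps, configProps));

        // After a restart with resync, the port assignment is present as well.
        serverProps.put("IIOP_LISTENER_PORT", "21037");
        System.out.println("after resync: "
                + resolve("IIOP_LISTENER_PORT", serverProps, clusterProps, configProps));
    }
}
```

With the server-level value missing, the lookup falls through to the config default of 23700, which matches the wrong port Ken sees for testInstance0 and testInstance1 until the instance reading the config is restarted.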


--
Scott Fordin | Principal Technical Writer
Oracle GlassFish & Fusion Middleware
8 Van De Graaff | Burlington, MA 01803
Phone: +1.781.442.2021 | Mobile: +1.603.459.3836
Oracle | http://www.oracle.com