admin@glassfish.java.net

asadmin vs instance1 (Re: Schrödinger's instance)

From: Bobby Bissett <bobby.bissett_at_oracle.com>
Date: Wed, 23 Jun 2010 11:29:26 -0400

On Jun 21, 2010, at 8:40 PM, Byron Nevins wrote:
> In that case, a Quantum Mechanical analysis is in order after all...

Ok, let me try to simplify this as much as possible, but no further
(yeah, I know I'm mixing up my physicists). The port-in-use issue may
have been caused by instance1 still running, since asadmin couldn't
shut it down and I didn't notice. Let's forget about that for now. So
here goes:

After creating the cluster and two instances, I have the DAS running.
Everyone is happy:

--- begin ---
hostname% $GF_HOME/bin/asadmin list-instances
instance1 not running
instance2 not running
Command list-instances executed successfully.
--- end ---

Then I start instance1:

--- begin ---
hostname% $GF_HOME/bin/asadmin start-instance instance1
Command start-instance executed successfully.
--- end ---

At this point I see instance1 running in 'ps' and this in the DAS log:

[#|2010-06-23T11:01:28.220-0400|INFO|glassfish3.1|null|
_ThreadID=34;_ThreadName=Thread-1;|Successfully started the instance:
instance1|#]

I also see the join notifications in the GMS logging in the DAS. By my
observations instance1 is running. But list-instances doesn't see it:

--- begin ---
hostname% $GF_HOME/bin/asadmin list-instances
instance1 not running
instance2 not running
Command list-instances executed successfully.
--- end ---
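
For anyone who wants to check for this mismatch automatically, here is
a rough sketch in Python. It only parses the list-instances text in the
format shown above and compares it against a set of instance names that
were observed running by some other means (for example 'ps'). The
function names parse_list_instances and find_mismatches are made up for
illustration, and the set of observed instances is supplied by hand
here; none of this is part of asadmin.

```python
def parse_list_instances(output):
    """Map instance name -> True/False from list-instances text.

    Assumes the "<name> running" / "<name> not running" line format
    shown in the transcripts in this email; other lines are ignored.
    """
    states = {}
    for line in output.splitlines():
        name, _, status = line.strip().partition(" ")
        if status == "running":
            states[name] = True
        elif status == "not running":
            states[name] = False
    return states


def find_mismatches(states, observed_running):
    """Return names reported as not running that were observed running."""
    return sorted(
        name
        for name, running in states.items()
        if not running and name in observed_running
    )


# Output copied from the transcript above.
sample = """instance1 not running
instance2 not running
Command list-instances executed successfully.
"""

# Hand-supplied assumption: instance1 was seen in 'ps'.
print(find_mismatches(parse_list_instances(sample), {"instance1"}))
```

With the transcript above and instance1 seen in 'ps', this flags
instance1 as the one that list-instances gets wrong.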

Now I start instance2 and list again:

--- begin ---
hostname% $GF_HOME/bin/asadmin start-instance instance2
Command start-instance executed successfully.

hostname% $GF_HOME/bin/asadmin list-instances
instance1 not running
instance2 running
Command list-instances executed successfully.
--- end ---

So what's up with the "instance1 uncertainty principle" (continuing
the theme)? Note that the setup commands are all the same as what's in
the script I attached a couple of emails ago. There is a
"java.net.BindException: Address already in use" stack trace in the
instance1 log (attached), but the same thing is in the instance2 log
as well. I don't see any other errors and I see that instance1 is
happily up and participating in the group management service.

As you can imagine, I can't stop instance1 with asadmin in this state
but instead have to kill the process.

Thanks,
Bobby