admin@glassfish.java.net

Re: Should UnknownHostException in synchronization cause instance to not come up?

From: Tom Mueller <tom.mueller_at_oracle.com>
Date: Tue, 12 Oct 2010 09:32:41 -0500

I'd like to suggest that we try to get the instance up in the face of
any failure. But we need to make it clear that something went wrong.

For example, if an instance fails to synchronize, comes up, and then
later is accessible by the DAS (say the network cable was plugged back
in), will the "list-instances" command report that the instance needs to
be restarted? Will it not replicate any commands to that instance until
a sync does occur?
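
A minimal Java sketch of the idea, for illustration only: the names
StartupSync, SyncStatus, Synchronizer, trySynchronize and needsRestart
are hypothetical and are not taken from the GlassFish source. It assumes
the "ConnectionException" mentioned in this thread corresponds to
java.net.ConnectException. Every synchronization failure is turned into
a recorded status instead of aborting startup, so that a command such as
"list-instances" could later report that the instance still needs a
sync or restart.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.UnknownHostException;

// Hypothetical sketch; none of these types exist in GlassFish under these names.
final class StartupSync {

    /** Outcome of the startup synchronization attempt. */
    enum SyncStatus { SYNCED, FAILED_UNRESOLVED, FAILED_UNREACHABLE, FAILED_OTHER }

    /** Stand-in for whatever component actually synchronizes with the DAS. */
    interface Synchronizer {
        void synchronize() throws IOException;
    }

    /**
     * Attempts the sync but never propagates an I/O failure to the caller,
     * so the instance can continue starting. The failure is classified and
     * returned so it can be logged and reported later.
     */
    static SyncStatus trySynchronize(Synchronizer sync) {
        try {
            sync.synchronize();
            return SyncStatus.SYNCED;
        } catch (UnknownHostException e) {
            // DAS hostname could not be resolved (name service outage,
            // changed name service data, or a corrupted das.properties).
            return SyncStatus.FAILED_UNRESOLVED;
        } catch (ConnectException e) {
            // Hostname resolved, but the DAS refused the connection or is down.
            return SyncStatus.FAILED_UNREACHABLE;
        } catch (IOException e) {
            // Any other I/O failure during synchronization.
            return SyncStatus.FAILED_OTHER;
        }
    }

    /** An instance that started without a successful sync should be flagged. */
    static boolean needsRestart(SyncStatus status) {
        return status != SyncStatus.SYNCED;
    }
}
```

With this shape, the startup code would call trySynchronize once, log
any non-SYNCED status prominently, and keep the status around so the
DAS side can decide whether to replicate commands to the instance.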

Tom


On 10/11/2010 7:34 PM, Joe Di Pol wrote:
> Bill Shannon wrote:
>> The real answer is that I didn't evaluate all possible exceptions to
>> determine which ones should be considered transient failures.
>>
>> Do you think UnknownHostException is more likely a transient error?
>
> I'm not sure.
>
> In theory the hostname in das.properties was valid when the instance
> was created -- since we verify it (create-local-instance must be
> able to talk to the DAS to create the instance and
> _create-instance-filesystem does a check too for the SSH case).
>
> So either the file has gotten corrupted, or there is a name service
> outage, or the name service data has changed. I guess only one of
> those is truly transient.
>
> I'm not sure what the right answer is, but it did surprise me. I figured
> if the instance can't talk to the DAS -- no matter the reason -- it
> would still come up.
>
> Joe
>
> P.S. I'm hitting this on the Mac, where for some reason a restarted
> instance can't resolve the DAS hostname -- even though the originally
> started instance can!?!
>
>
>>
>>
>> Joe Di Pol wrote on 10/11/10 03:52 PM:
>>>
>>> While looking into a different problem I noticed that
>>> if an instance gets an UnknownHostException while attempting
>>> to synchronize with the DAS it does not come up. If it
>>> gets a ConnectionException (like when the DAS is down) it
>>> does come up.
>>>
>>> Is there a reason why the former case does not behave like the latter?
>>> Is it because we think UnknownHostException is more likely
>>> to be a configuration error than a transient name service problem?
>>>
>>> Joe
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: admin-unsubscribe_at_glassfish.dev.java.net
>>> For additional commands, e-mail: admin-help_at_glassfish.dev.java.net
>>>