admin@glassfish.java.net

Re: Command Replication in 3.1 - details

From: Tim Quinn <tim.quinn_at_oracle.com>
Date: Fri, 30 Apr 2010 17:02:28 -0500

(also a reply to Bill's response)

I understand that a re-sync will be needed if a remote deployment has
failed.

I know that, currently, the only way to resync is to restart the
server. Has any thought been given to supporting resync while the
instance stays up? I know that a restart is a much more controlled
and ordered situation - things happen only in the order we want -
whereas if the server is up at sync time then the apps being synced
could potentially be running.

But this begins to sound a little like rolling upgrade of an app
without requiring a server restart, which I think is (or was at one
time) a goal for clustering in 3.x. So to support both use cases can
we think about designing resync at a granularity so we can resync a
given app on a given target? It seems like that would support rolling
upgrade without restart and the normal instance restart - as well as
the potential use case of an administrator resyncing just one app on
one instance (having fixed some error on an instance that caused a
problem during an earlier, failed remote deployment of an app to that
instance).
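
To make the granularity idea concrete, here is a rough sketch in which
the unit of resync is one app on one instance, and the larger use
cases are just different sets of those units. All names here
(ResyncPlanner, ResyncUnit, plan) are made up for illustration; none
of them are existing GlassFish classes, and the actual sync step is
left out entirely.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch only: these types are not part of GlassFish.
final class ResyncPlanner {

    /** Smallest unit of resync work: a single app on a single instance. */
    record ResyncUnit(String app, String instance) {}

    /**
     * Expands a request into per-(app, instance) units, grouped by instance
     * so that one instance can be finished before the next one is touched.
     */
    static List<ResyncUnit> plan(Collection<String> apps,
                                 Collection<String> instances) {
        List<ResyncUnit> units = new ArrayList<>();
        for (String instance : instances) {
            for (String app : apps) {
                units.add(new ResyncUnit(app, instance));
            }
        }
        return units;
    }
}
```

With that shape, an instance restart would be plan(all apps on the
instance, that one instance), a rolling upgrade would be plan(one app,
each instance in turn), and the admin's manual fix would be a single
unit.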

- Tim

On Apr 30, 2010, at 4:51 PM, Vijay Ramachandran wrote:

> On 4/30/10 1:24 PM, Tim Quinn wrote:
>> Hi, Vijay.
>>
>> On Apr 30, 2010, at 2:11 PM, Vijay Ramachandran wrote:
>>
>>> We are not assuming that the server is going to be up and that the
>>> communication is reliable. We will try to send the command to the
>>> instance and get back results. If sending the command itself
>>> failed, we will flag it as one type of error that indicates
>>> something wrong with the network or the server. If the command went
>>> through to the server but the command execution fails, we will get
>>> a detailed error (just as it happens now between CLI-DAS), and we
>>> will display this error. In either of the failure cases, we
>>> indicate to the caller (the CLI or GUI), that the server(s) where
>>> the replication failed need to be restarted.
>>
>> If the server was up but the remote deploy command failed, why do
>> we expect restarting the server to fix the problem? We need to ask
>> the administrator to look at the problem and, perhaps, fix
>> something with the instance or the app. That's different from a
>> remote deployment failing because the DAS could not talk to the
>> instance.
>
> Whenever a remote command on some instance fails (because of network
> error or command execution failure or command getting timed out
> or ...), some level of manual involvement will be required where the
> admin has to see what went wrong with that instance. (As I had
> indicated in my earlier mail, we will be able to distinguish a
> connection/network related error from a command execution failure -
> hopefully it will be a good clue to the admin). Once that problem is
> solved, for the "newly repaired instance" to get the failed change,
> it has to synchronize itself with the DAS, which happens only on
> instance restart. Hence I mentioned that we will indicate that
> the server will require a restart.
>
>> Reporting the error back to the admin client in such a case will
>> indeed be very helpful. Maybe this is in the "nice to have"
>> category, but it would also be really nice if the administrator
>> could go to the admin console and view the current status of each
>> app on each instance it is targeted for. If the remote deployment
>> returned an error then that error could be recorded as well as
>> returned to the client.
>>
>> This would be relatively easy to collect from the remote deployment
>> results; less easy for the DAS to get this after an instance
>> restart. That's why I expect it's a "nice to have" because I guess
>> the DAS would have to refresh that status for each app on an
>> instance that restarts.
>>
>> Just a thought...
>
> Thanks - I will take note of it as a P3 task in the task list.
>
> Vijay
>
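
For reference, the failure handling discussed in the quoted thread
(a connection-level failure versus a command execution failure, both
leading to a "restart required" indication, plus the "nice to have"
per-app, per-instance status) could be modeled roughly as below. This
is only a sketch under assumptions: ReplicationReport, Outcome, Result
and the method names are invented for illustration and do not reflect
the actual 3.1 design or any GlassFish API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch only: these types are not part of GlassFish.
final class ReplicationReport {

    /** The two failure kinds the thread distinguishes, plus success. */
    enum Outcome { SUCCESS, CONNECTION_FAILURE, EXECUTION_FAILURE }

    /** Result of replicating one command for one app to one instance. */
    record Result(String instance, String app, Outcome outcome, String detail) {}

    // Latest known result per instance/app pair, in insertion order.
    private final Map<String, Result> latest = new LinkedHashMap<>();

    /** Stores a result, replacing any earlier one for the same pair. */
    void store(Result result) {
        latest.put(key(result.instance(), result.app()), result);
    }

    /** Instances with any failed replication; these would be flagged for restart. */
    Set<String> instancesNeedingRestart() {
        Set<String> names = new TreeSet<>();
        for (Result r : latest.values()) {
            if (r.outcome() != Outcome.SUCCESS) {
                names.add(r.instance());
            }
        }
        return names;
    }

    /** Last recorded status of an app on an instance, if any was recorded. */
    Optional<Result> status(String instance, String app) {
        return Optional.ofNullable(latest.get(key(instance, app)));
    }

    private static String key(String instance, String app) {
        return instance + "/" + app;
    }
}
```

Keeping the outcome kind separate from the detail text is what would
let the console show the admin whether to look at the network or at
the app itself. Refreshing these entries after an instance restart is
not covered here.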