dev@shoal.java.net

About failure recovery

From: Bongjae Chang <carryel_at_korea.com>
Date: Fri, 30 Jan 2009 21:16:24 +0900

Hi.

I have a problem with failure recovery.

When a member fails, another member that has a FailureRecoveryActionFactory can recover the failed member.

But I found a limitation that prevents the failed member from being recovered correctly.

The restriction is that all members must have a FailureRecoveryActionFactory.

Assume that "A", "B", "C" and "D" are members of the same group.

"A" is the failed member, both "B" and "C" have a FailureRecoveryActionFactory, and "D" doesn't have one.

When "A" fails, only "B" and "C" can recover "A".

But if "D" is selected as the recoverer by the recovery-selection algorithm on "B" and "C", nobody can recover "A".

So I think that only members that have a FailureRecoveryActionFactory should qualify as recoverers.

In other words, I think that "D" should be excluded from the recoverer candidates.

Unfortunately, the current algorithm qualifies all members as recoverers if they are alive and are CORE members.
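
To make the scenario concrete, here is a minimal sketch of the two candidate rules. It is not Shoal code: the Member class, the hasRecoveryFactory flag, and the "pick the last name in alphabetical order" rule are all hypothetical stand-ins for whatever deterministic selection the group uses. It also assumes each member can somehow learn which other members have a FailureRecoveryActionFactory.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class RecovererSelectionSketch {

    // Hypothetical view of a group member; this is not a Shoal type.
    static final class Member {
        final String name;
        final boolean alive;
        final boolean core;
        final boolean hasRecoveryFactory; // assumed to be known to every member

        Member(String name, boolean alive, boolean core, boolean hasRecoveryFactory) {
            this.name = name;
            this.alive = alive;
            this.core = core;
            this.hasRecoveryFactory = hasRecoveryFactory;
        }
    }

    // Rule as described in this mail: every alive CORE member is a candidate.
    // The max() call is an arbitrary stand-in for the real selection step.
    static Optional<String> selectCurrent(List<Member> members) {
        return members.stream()
                .filter(m -> m.alive && m.core)
                .map(m -> m.name)
                .max(Comparator.naturalOrder());
    }

    // Proposed rule: additionally require a recovery factory.
    static Optional<String> selectProposed(List<Member> members) {
        return members.stream()
                .filter(m -> m.alive && m.core && m.hasRecoveryFactory)
                .map(m -> m.name)
                .max(Comparator.naturalOrder());
    }

    // The group from the example: "A" has failed, "D" has no factory.
    static List<Member> exampleGroup() {
        List<Member> group = new ArrayList<>();
        group.add(new Member("A", false, true, true));
        group.add(new Member("B", true, true, true));
        group.add(new Member("C", true, true, true));
        group.add(new Member("D", true, true, false));
        return group;
    }

    public static void main(String[] args) {
        List<Member> group = exampleGroup();
        // "D" is chosen but cannot recover "A".
        System.out.println("current:  " + selectCurrent(group).orElse("none"));
        // "C" is chosen and can recover "A".
        System.out.println("proposed: " + selectProposed(group).orElse("none"));
    }
}
```

With the stand-in rule, the current filter picks "D" and the recovery never happens, while the proposed filter picks "C".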

Could this case be supported?

--
Bongjae Chang