Bongjae Chang wrote:
> Hi,
> When I tested FailureRecovery with GroupLeadershipNotification, I
> found sometimes the signal was not notified.
<deleted>
> As you see above, sometimes the view(previous or current view) could
> be shared and cleared unexpectedly by other components.
I agree with your assessment of the issue. This is definitely a nasty
concurrency issue that is not easily diagnosed.
> So I attached proposed patches which copy and return the list safely.
I agree that the lists returned by getCurrentView() and getPreviousView()
are shared and need to be protected from modification to prevent
this same issue from occurring in the future.
I have attached an alternative proposed change to ViewWindowImpl.java
that makes each view snapshot read-only once it has been constructed.
The change enforces that every later use of a snapshot is read-only.
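As a minimal sketch of the read-only snapshot idea (this is not the
attached patch; the class name ViewSnapshotSketch, the method
newViewObserved, and the use of String for members are hypothetical
stand-ins), the snapshot can be copied once and then wrapped so that
callers cannot modify it:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a view window; not the real ViewWindowImpl.
public class ViewSnapshotSketch {
    private final List<List<String>> views = new ArrayList<>();

    // Copy the members once, then wrap the copy so no caller can mutate it.
    public synchronized void newViewObserved(List<String> members) {
        views.add(Collections.unmodifiableList(new ArrayList<>(members)));
    }

    // Returns the latest snapshot; safe to share because it is read-only.
    public synchronized List<String> getCurrentView() {
        return views.isEmpty()
                ? Collections.<String>emptyList()
                : views.get(views.size() - 1);
    }

    // Returns the snapshot before the latest one, or an empty list.
    public synchronized List<String> getPreviousView() {
        return views.size() < 2
                ? Collections.<String>emptyList()
                : views.get(views.size() - 2);
    }

    public static void main(String[] args) {
        ViewSnapshotSketch window = new ViewSnapshotSketch();
        List<String> members = new ArrayList<>();
        members.add("server");
        members.add("instance1");
        window.newViewObserved(members);

        // Clearing the caller's own list does not touch the stored snapshot.
        members.clear();

        List<String> current = window.getCurrentView();
        try {
            current.clear(); // a stray clear() now fails fast
        } catch (UnsupportedOperationException e) {
            System.out.println("clear() rejected; view still has "
                    + current.size() + " members");
        }
    }
}
```

With this approach a component that still calls clear() on a returned
view gets an UnsupportedOperationException instead of silently
emptying a list that other components are reading.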
> And I removed "List.clear()" code in
> GroupLeadershipNotificationSignalImpl.java because I think the logic
> is unnecessary.
+1
> When I applied the patch, I found it worked well,
> Please review the attached patches.
Thanks for finding and fixing this issue.
If you agree with the attached proposal to prevent future modifications
to view snapshots, please commit your change to
GroupLeadershipNotificationSignalImpl and the attached changes
(inspired by your findings) to both the branch and trunk.
To test the changes, I ran the runsimulatecluster.sh developer test with
the above changes applied, killed the master, and did not observe any
failures.
-Joe
> Thanks.
>
> Regards,
> Bongjae Chang