users@tyrus.java.net

Re: I have a problem with a server that has multiple addresses (HA issue)

From: Pavel Bucek <pavel.bucek_at_oracle.com>
Date: Tue, 22 Nov 2016 15:41:17 +0100

Hi Bill,

then this seems more like fixing a bug in the DNS server (or working
around the existence of the cache).

Anyway, I can see that having similar functionality in Tyrus itself can
be beneficial - feel free to file an enhancement request [1] and
optionally, you can even contribute this (if you sign the Oracle
Contributor Agreement [2]). Implementing this can be relatively simple:
you can extend an existing client container (see the existing ones, JDK
and Grizzly based).
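
The address iteration proposed in the quoted message below
(InetAddress#getAllByName, then trying each address in order) could look
roughly like this. It is only a sketch using plain java.net sockets, not
Tyrus code: the class and method names (FailoverConnect, resolveAll,
connectToFirstReachable) are made up for illustration, and a client
container would have to apply the same loop to its own channel handling.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tyrus.
final class FailoverConnect {

    /** Resolves every address registered for the host, in resolver order. */
    static List<InetSocketAddress> resolveAll(String host, int port) throws IOException {
        List<InetSocketAddress> result = new ArrayList<>();
        for (InetAddress address : InetAddress.getAllByName(host)) {
            result.add(new InetSocketAddress(address, port));
        }
        return result;
    }

    /** Tries each candidate in order and returns the first socket that connects. */
    static Socket connectToFirstReachable(List<InetSocketAddress> candidates,
                                          int timeoutMillis) throws IOException {
        IOException failure = null;
        for (InetSocketAddress candidate : candidates) {
            Socket socket = new Socket();
            try {
                socket.connect(candidate, timeoutMillis);
                return socket;
            } catch (IOException e) {
                socket.close();
                // Keep earlier failures attached so no attempt is lost.
                if (failure != null) {
                    e.addSuppressed(failure);
                }
                failure = e;
            }
        }
        if (failure == null) {
            throw new IOException("no addresses to try");
        }
        throw failure;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: FailoverConnect <host> <port>");
            return;
        }
        List<InetSocketAddress> candidates = resolveAll(args[0], Integer.parseInt(args[1]));
        try (Socket socket = connectToFirstReachable(candidates, 3000)) {
            System.out.println("connected to " + socket.getRemoteSocketAddress());
        }
    }
}
```

The per-address connect timeout matters here: without it, an address
that silently drops packets can stall the loop for a long time before
the next one is tried.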

Since you have this setup already somewhere - if you can run more tests
there, can you test how HttpURLConnection behaves in this situation?
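
Something as small as the following would do for that test. Again, this
is just a sketch: HttpProbe and probe are made-up names, and the URL to
pass in is whatever HTTP endpoint answers on the multi-address host name
in your setup.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical test helper, not part of Tyrus.
final class HttpProbe {

    /** Opens one HTTP connection and reports the status code or the failure. */
    static String probe(String url, int timeoutMillis) {
        HttpURLConnection connection = null;
        try {
            connection = (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setConnectTimeout(timeoutMillis);
            connection.setReadTimeout(timeoutMillis);
            return "HTTP " + connection.getResponseCode();
        } catch (IOException e) {
            return "failed: " + e;
        } finally {
            if (connection != null) {
                connection.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: HttpProbe <url>");
            return;
        }
        System.out.println(probe(args[0], 3000));
    }
}
```

Running it once with the first address up and once with it down should
show whether HttpURLConnection falls back to the second address or
fails the same way.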

Regards,
Pavel

[1] https://java.net/jira/browse/TYRUS
[2] http://www.oracle.com/technetwork/community/oca-486395.html


On 20/11/2016 19:50, Bill Mair wrote:
> Hi Pavel,
>
> my problem is in the JDK Client Connector
> (https://java.net/projects/tyrus/sources/code/content/containers/jdk-client/src/main/java/org/glassfish/tyrus/container/jdk/client/JdkClientContainer.java?rev=d86f5679c524ae65ba9254306c413acce1e51c85)
>
> The getSocketAddress method only delivers a single SocketAddress and if
> the first IP address returned is the server that is unavailable then the
> client is completely incapable of connecting to the alternate address.
>
> Windows appears to cache the query for the TTL duration and always
> delivers the addresses in the same order, even if the DNS is set up to
> round robin.
>
> So connectSynchronously always fails when the first address in the
> possible addresses for a server can't be reached.
>
> In my opinion, the client should be using
> InetAddress#getAllByName(String host) to get multiple addresses (if
> defined) and attempt to connect to them all in the returned order.
>
> One could argue that the SocketChannel in TransportFilter#handleConnect
> should try all the available addresses but that is obviously not the case.
>
> Maybe the completion handler could use an array of the available addresses.
>
> --
>
> Bill
>
>
> On 20/11/16 19:23, Pavel Bucek wrote:
>> Hi Bill,
>>
>> what exactly is the issue about and how is it related to Tyrus (client)?
>>
>> Tyrus cannot influence name resolution, and you can keep trying to
>> connect until connected, either by implementing it yourself or by
>> using a Tyrus feature (see [1]).
>>
>> If you'd ask me how to do the scenario you specified, I'd recommend
>> putting some kind of LB in front of both servers. Then the client
>> (websocket or anything else) would connect to the LB, which will then
>> redirect to the live instance. This has obvious benefits (defined load
>> balancing, monitoring, ...), but requires an additional machine.
>>
>> Regards,
>> Pavel
>>
>> [1]
>> https://tyrus.java.net/apidocs/1.13/org/glassfish/tyrus/client/ClientManager.ReconnectHandler.html
>>
>>
>> On 20/11/2016 14:51, Bill Mair wrote:
>>> Hi,
>>>
>>> in a feasibility test I set up a server name with 2 IP addresses
>>> (machines) associated with it.
>>>
>>> This is to support high availability and allow one of the servers to
>>> be offline for maintenance or failure and make sure that the service
>>> is still available through the second instance.
>>>
>>> The connection repeatedly fails if the first SocketAddress being
>>> returned by Windows DNS client is not available and no attempt to
>>> connect to the other server is made.
>>>
>>> In a HA production environment I think that it is essential that
>>> Tyrus should attempt to connect to all IP addresses associated with a
>>> server's name.
>>>
>>> Any advice would be appreciated.
>>>
>>> Bill Mair
>>>
>