users@glassfish.java.net

mod_jk and glassfish cluster

From: <forums_at_java.net>
Date: Tue, 26 Apr 2011 10:16:38 -0500 (CDT)

Hi all,

I set up a GlassFish cluster successfully on RHEL5, version 3.1 b43, with
two nodes, each node running one instance. I tested mod_jk versions
1.2.26/28/31, both prebuilt and manually compiled. We plan to deploy an
application that will receive small updates every hour from 300,000
clients. We deployed a sample HTTP application that shows the instance name
to make sure the load balancer works. I followed this post, but changed
worker.properties because failover was not working:

http://tiainen.sertik.net/2011/03/load-balancing-with-glassfish-31-and.html
[1]

 

*jk.conf*

LoadModule jk_module modules/mod_jk.so
JkWorkersFile /etc/httpd/worker.properties
JkShmFile /var/log/httpd/mod_jk.shm
JkLogFile /var/log/httpd/mod_jk.log
JkLogLevel info
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
JkRequestLogFormat "%w %V %T"
# redirect traffic to loadbalancer
JkMount /* loadbalancer
 

*worker.properties*

worker.list=loadbalancer

# default properties for workers
worker.template.type=ajp13
worker.template.port=28009
worker.template.lbfactor=50
worker.template.connection_pool_timeout=600
worker.template.socket_keepalive=1
worker.template.socket_timeout=120

# properties for worker1
worker.worker1.reference=worker.template
worker.worker1.host=glassfishin01
#worker.worker1.lbfactor=50

# properties for worker2
worker.worker2.reference=worker.template
worker.worker2.host=glassfishin02
#worker.worker2.lbfactor=50

# properties for loadbalancer
worker.loadbalancer.type=lb
worker.loadbalancer.sticky_session=False
worker.loadbalancer.reply_timeout=30000
worker.loadbalancer.balance_workers=worker1,worker2

 

What works:

1. The DAS and the GlassFish instances work as expected

2. Load balancing works just fine

3. Failover works ONLY if I stop or restart the instance from the DAS

The problem:

If I restart the OS of the node hosting an instance, failover breaks. The
rebooted instance is detected as down, and as long as it stays down things
seem fine. Once it boots back up, neither of them works. Sometimes I have
to restart the cluster and httpd to get things going again. Somehow mod_jk
treats the two kinds of failover differently. It wrongly detects one or
both instances as down.
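
One way to check, independently of mod_jk, whether each instance's AJP
listener is reachable from the httpd host is a plain TCP connect. This is
only a minimal sketch, assuming Python 3 is available on that host. The
helper name ajp_port_reachable is hypothetical; the default port 28009 is
the one from worker.properties above.

```python
import socket


def ajp_port_reachable(host: str, port: int = 28009, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling ajp_port_reachable("glassfishin01") and
ajp_port_reachable("glassfishin02") right after a node reboot would show
whether the listener accepts connections at all. A successful connect only
proves that something is listening on the port, not that the instance
answers AJP requests.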

 

[Tue Apr 26 09:07:18 2011] [29639:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:18 2011] [29639:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker2) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:18 2011] [29639:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker2) sending request to tomcat
failed (recoverable),  (attempt=1)
[Tue Apr 26 09:07:18 2011] [29809:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:18 2011] [29809:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker2) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:18 2011] [29809:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker2) sending request to tomcat
failed (recoverable),  (attempt=1)
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker1) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.203:28009) is down (errno=104)
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker1) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker1) sending request to tomcat
failed (recoverable),  (attempt=2)
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [error]
ajp_service::jk_ajp_common.c (2204): (worker1) Connecting to tomcat failed.
Tomcat is probably not started or is listening on the wrong port
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [info] service::jk_lb_worker.c
(1168): service failed, worker worker1 is in error state
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [info] service::jk_lb_worker.c
(1245): All tomcat instances are busy or in error state
[Tue Apr 26 09:07:19 2011] loadbalancer 192.168.3.207 823.821416
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [info] jk_handler::mod_jk.c
(2364): Service error=0 for worker=loadbalancer
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker1) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.203:28009) is down (errno=104)
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker1) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker1) sending request to tomcat
failed (recoverable),  (attempt=2)
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [error]
ajp_service::jk_ajp_common.c (2204): (worker1) Connecting to tomcat failed.
Tomcat is probably not started or is listening on the wrong port
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [info] service::jk_lb_worker.c
(1168): service failed, worker worker1 is in error state
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [info] service::jk_lb_worker.c
(1245): All tomcat instances are busy or in error state
[Tue Apr 26 09:07:19 2011] loadbalancer 192.168.3.207 823.824435
[Tue Apr 26 09:07:19 2011] [29622:3085998688] [info] jk_handler::mod_jk.c
(2364): Service error=0 for worker=loadbalancer
[Tue Apr 26 09:07:30 2011] [29481:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:30 2011] [29481:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker2) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:30 2011] [29481:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker2) sending request to tomcat
failed (recoverable),  (attempt=1)
[Tue Apr 26 09:07:30 2011] [29541:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:30 2011] [29541:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker2) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:30 2011] [29541:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker2) sending request to tomcat
failed (recoverable),  (attempt=1)
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker1) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.203:28009) is down (errno=104)
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker1) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker1) sending request to tomcat
failed (recoverable),  (attempt=2)
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [error]
ajp_service::jk_ajp_common.c (2204): (worker1) Connecting to tomcat failed.
Tomcat is probably not started or is listening on the wrong port
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [info] service::jk_lb_worker.c
(1168): service failed, worker worker1 is in error state
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [info] service::jk_lb_worker.c
(1245): All tomcat instances are busy or in error state
[Tue Apr 26 09:07:31 2011] loadbalancer 192.168.3.207 835.813530
[Tue Apr 26 09:07:31 2011] [29593:3085998688] [info] jk_handler::mod_jk.c
(2364): Service error=0 for worker=loadbalancer
[Tue Apr 26 09:07:38 2011] [29712:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker1) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.203:28009) is down (errno=104)
[Tue Apr 26 09:07:38 2011] [29712:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker1) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:38 2011] [29712:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker1) sending request to tomcat
failed (recoverable),  (attempt=2)
[Tue Apr 26 09:07:38 2011] [29712:3085998688] [error]
ajp_service::jk_ajp_common.c (2204): (worker1) Connecting to tomcat failed.
Tomcat is probably not started or is listening on the wrong port
[Tue Apr 26 09:07:38 2011] [29712:3085998688] [info] service::jk_lb_worker.c
(1168): service failed, worker worker1 is in error state
[Tue Apr 26 09:07:40 2011] [29813:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:40 2011] [29813:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker2) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:40 2011] [29813:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker2) sending request to tomcat
failed (recoverable),  (attempt=1)
[Tue Apr 26 09:07:40 2011] [29814:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:40 2011] [29814:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker2) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:40 2011] [29814:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker2) sending request to tomcat
failed (recoverable),  (attempt=1)
[Tue Apr 26 09:07:41 2011] [29620:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker1) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.203:28009) is down (errno=104)
[Tue Apr 26 09:07:41 2011] [29620:3085998688] [error]
ajp_get_reply::jk_ajp_common.c (1766): (worker1) Tomcat is down or refused
connection. No response has been sent to the client (yet)
[Tue Apr 26 09:07:41 2011] [29620:3085998688] [info]
ajp_service::jk_ajp_common.c (2186): (worker1) sending request to tomcat
failed (recoverable),  (attempt=2)
[Tue Apr 26 09:07:41 2011] [29620:3085998688] [error]
ajp_service::jk_ajp_common.c (2204): (worker1) Connecting to tomcat failed.
Tomcat is probably not started or is listening on the wrong port
[Tue Apr 26 09:07:41 2011] [29620:3085998688] [info] service::jk_lb_worker.c
(1168): service failed, worker worker1 is in error state
[Tue Apr 26 09:07:41 2011] loadbalancer 192.168.3.207 672.666233
[Tue Apr 26 09:08:05 2011] loadbalancer 192.168.3.207 0.005445
[Tue Apr 26 09:08:05 2011] loadbalancer 192.168.3.207 0.005088
[Tue Apr 26 09:08:05 2011] loadbalancer 192.168.3.207 0.005266
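
To see at a glance which worker the errors belong to, the log above can be
summarized with a short script. This is a rough sketch, assuming Python 3
and the log layout shown here (entries start with a bracketed timestamp and
may be wrapped over several lines). The names unwrap, errors_per_worker and
SAMPLE are hypothetical, and SAMPLE is just a few entries copied from the
log above.

```python
import re
from collections import Counter

# Matches "[timestamp] [pid:tid] [level] message" once an entry is on one line.
ENTRY = re.compile(r"^\[(?P<ts>[^\]]+)\] \[\d+:\d+\] \[(?P<level>\w+)\] (?P<msg>.*)$")
# mod_jk puts the worker name in parentheses, e.g. "(worker2)".
WORKER = re.compile(r"\((worker\w+)\)")


def unwrap(lines):
    """Rejoin log entries that were wrapped across several lines."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("[") or not entries:
            entries.append(line)          # a new entry starts with a timestamp
        else:
            entries[-1] += " " + line     # continuation of the previous entry
    return entries


def errors_per_worker(lines):
    """Count [error] entries per worker name."""
    counts = Counter()
    for entry in unwrap(lines):
        match = ENTRY.match(entry)
        if not match or match.group("level") != "error":
            continue                      # skips [info] and request-log lines
        worker = WORKER.search(match.group("msg"))
        if worker:
            counts[worker.group(1)] += 1
    return counts


SAMPLE = """\
[Tue Apr 26 09:07:18 2011] [29639:3085998688] [error]
ajp_connection_tcp_get_message::jk_ajp_common.c (1011): (worker2) can't
receive the response message from tomcat, network problems or tomcat
(192.168.3.204:28009) is down (errno=104)
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [error]
ajp_service::jk_ajp_common.c (2204): (worker1) Connecting to tomcat failed.
Tomcat is probably not started or is listening on the wrong port
[Tue Apr 26 09:07:19 2011] [29624:3085998688] [info] service::jk_lb_worker.c
(1168): service failed, worker worker1 is in error state
[Tue Apr 26 09:07:19 2011] loadbalancer 192.168.3.207 823.821416
"""
```

Running errors_per_worker(SAMPLE.splitlines()) counts one error each for
worker1 and worker2. Feeding it the lines of /var/log/httpd/mod_jk.log
instead would give the totals for the whole file.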
 

 

Best regards,

Todor

 

 


[1]
http://tiainen.sertik.net/2011/03/load-balancing-with-glassfish-31-and.html

--
[Message sent by forum member 'windwalker78']
View Post: http://forums.java.net/node/795589