users@glassfish.java.net

Re: glassfish 3.1.1, VPS (max sockets 1024 + 1024 local sockets), config suggestions to avoid Too many open files problem

From: Mladen Adamovic <mladen.adamovic_at_gmail.com>
Date: Tue, 24 Jul 2012 00:08:09 +0200

>
>
> When you disable keep-alive connections, does it change anything w.r.t. the
> number of CLOSE_WAIT connections?
>
>
Well, I cannot always reproduce the problem; the last time it ran for 12
hours before crashing. Note that this is a production server, so it doesn't
make sense for me to run "experiments against the real user base".

Currently the web server is running with a modified configuration:
keep-alive max requests set to 20 and the connection timeout set to 15
seconds (a sketch of the corresponding asadmin commands follows the netstat
output below). This is what the number of connections looked like recently:
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
    105 ESTABLISHED
      1 FIN_WAIT1
     42 FIN_WAIT2
      2 LAST_ACK
     10 LISTEN
      4 SYN_RECV
    112 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
    102 ESTABLISHED
      1 FIN_WAIT1
     33 FIN_WAIT2
      2 LAST_ACK
     10 LISTEN
      4 SYN_RECV
    123 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     88 ESTABLISHED
      1 FIN_WAIT1
     20 FIN_WAIT2
      2 LAST_ACK
     10 LISTEN
      8 SYN_RECV
    137 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -a | grep "CLOSE_WAIT" | wc -l
0
root_at_lvps176-28-13-94:~# netstat -a | grep "CLOSE_WAIT" | wc -l
0
root_at_lvps176-28-13-94:~# netstat -a | grep "CLOSE_WAIT" | wc -l
2
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
      2 CLOSE_WAIT
     72 ESTABLISHED
      2 FIN_WAIT1
     39 FIN_WAIT2
     10 LISTEN
      2 SYN_RECV
    110 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     35 ESTABLISHED
      7 FIN_WAIT2
     10 LISTEN
     31 SYN_RECV
     88 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
      1 CLOSE_WAIT
     70 ESTABLISHED
     26 FIN_WAIT2
      1 LAST_ACK
     10 LISTEN
     18 SYN_RECV
     88 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     71 ESTABLISHED
      2 FIN_WAIT1
     28 FIN_WAIT2
     10 LISTEN
      6 SYN_RECV
    128 TIME_WAIT
root_at_lvps176-28-13-94:~#
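
In case it is useful, this is a sketch of the asadmin commands for those two
settings. It assumes the default http-listener-1 listener name under
server-config, and that "connection time out" maps to the keep-alive
timeout-seconds attribute; the dotted names may need adjusting to your own
configuration:

# show the current keep-alive related settings of the listener
asadmin get "server-config.network-config.protocols.protocol.http-listener-1.http.*"
# maximum number of requests served over one keep-alive connection
asadmin set server-config.network-config.protocols.protocol.http-listener-1.http.max-connections=20
# keep-alive timeout in seconds
asadmin set server-config.network-config.protocols.protocol.http-listener-1.http.timeout-seconds=15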


OK, now checking with max requests per keep-alive connection set to 1:
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     28 ESTABLISHED
      1 FIN_WAIT1
      3 FIN_WAIT2
     10 LISTEN
      5 SYN_RECV
    241 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     17 ESTABLISHED
      2 FIN_WAIT1
      5 FIN_WAIT2
     10 LISTEN
      5 SYN_RECV
    250 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
      1 CLOSING
     30 ESTABLISHED
      9 FIN_WAIT2
     10 LISTEN
      5 SYN_RECV
    267 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     29 ESTABLISHED
      6 FIN_WAIT1
     15 FIN_WAIT2
     10 LISTEN
      5 SYN_RECV
    221 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     26 ESTABLISHED
      8 FIN_WAIT1
     16 FIN_WAIT2
     10 LISTEN
      6 SYN_RECV
    227 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
     27 ESTABLISHED
      3 FIN_WAIT1
     19 FIN_WAIT2
     10 LISTEN
     10 SYN_RECV
    183 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
      1 CLOSE_WAIT
     21 ESTABLISHED
      4 FIN_WAIT1
     21 FIN_WAIT2
     10 LISTEN
     10 SYN_RECV
    196 TIME_WAIT
root_at_lvps176-28-13-94:~# netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c
      1 CLOSE_WAIT
     32 ESTABLISHED
      4 FIN_WAIT1
     15 FIN_WAIT2
     10 LISTEN
      7 SYN_RECV
    239 TIME_WAIT


Looks worse: around ~200 (oscillating) connections in the TIME_WAIT state.
When keep-alive is not used, many connections are created and dropped, and a
connection can stay in TIME_WAIT for up to 240 seconds. It's easy to run out
of network sockets that way, it seems.
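
To see which limit is actually being hit when that happens, these are the
checks I would run on the VPS (a sketch; the pgrep pattern assumes a single
GlassFish java process, and /proc/user_beancounters is only present on
Virtuozzo/OpenVZ containers):

# per-process file descriptor limit for the current shell
ulimit -n
# file descriptors currently open by the GlassFish JVM
ls /proc/$(pgrep -f glassfish | head -1)/fd | wc -l
# system-wide file handle limit
cat /proc/sys/fs/file-max
# container resource counters; a non-zero failcnt on numtcpsock/numfile
# means the Virtuozzo limit was hit at some point
grep -E "numtcpsock|numothersock|numfile" /proc/user_beancounters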




Thanks.
>
> WBR,
> Alexey.
>
>
>
> Hm, I've seen one problem with the Virtuozzo server in the log:
> quotaugidlimit
> Number of user/group IDs allowed for the Container internal disk quota. If
> set to 0, UID/GID quota will not be enabled.
>
> This is set to a limit of 2000, but I don't understand what it has to do
> with open files; it should be the number of UID/GIDs, and at the moment it
> has been steady at 43 on the VPS.
> Perhaps this is a problem with the Virtuozzo setup, but it has to be
> proven somehow.


-- 
Mladen Adamovic
Numbeo
Drziceva 9, 11120 Belgrade-Zvezdara, Serbia
(Business Registration Number 62612240)
Tel. +381-66-058-595
email: mladen.adamovic_at_gmail.com
web: http://www.numbeo.com