Sun Gathering Debug Data for Sun Java System Web Server

To Gather Debug Data on a Hung or Unresponsive Web Server Process

A process hang is defined as one of the Web Server processes not responding to requests while the httpd process is still running.

Before You Begin

Make sure that you collect all the data over the same time frame in which the problem occurs. See 1.6 Configuring Solaris to Generate Core Files if a core file is not generated.

Gather the following information for process hang problems. Run the commands in the order shown while the problem is occurring. Be sure to specify the time when the process hung and list the affected processes, if possible.

  1. Gather the general system information as explained in To Gather General Debug Data for Any Web Server Problem.

  2. For Solaris, use the ptree command on the uxwdog process to list the Web Server processes that it started.


    Note –

    If you are using Web Server 6.1 or Web Server 7.0, instead of the uxwdog process, use the webservd-wdog process.


    Output


    ptree 11449
    11449 ./uxwdog -d /prods/crypto/60SP6/https-sun/config
     11450 ns-httpd -d /prods/crypto/60SP6/https-sun/config
      11451 ns-httpd -d /prods/crypto/60SP6/https-sun/config

    Note –

    Gather the data on the highest PID process, which in this example is 11451. The Web Process is either ns-httpd or webservd, depending on the Web Server version.
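The selection rule in the note above can be sketched as a small filter over saved ptree output. This is a minimal sketch, not part of the Sun tooling: the function name highest_pid is hypothetical, and it assumes a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk). Because ptree can also list ancestor processes, check that the PID it prints really belongs to an ns-httpd or webservd process.

```shell
#!/bin/sh
# highest_pid: read ptree-style output on stdin and print the largest PID.
# Each input line is expected to start with a PID, possibly indented.
highest_pid() {
    awk '
        $1 ~ /^[0-9]+$/ { if ($1 + 0 > max) max = $1 + 0 }
        END { if (max) print max }
    '
}

# Example (commented out so the file only defines the function):
#   ptree 11449 | highest_pid
```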


  3. Run the netstat command and save the output.

    UNIX (Solaris and HP-UX) and Linux

    netstat -an | grep web-port

    Windows

    netstat -an
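When the saved netstat output is long, a per-state count for the web port can make it easier to read. The following is a rough sketch under stated assumptions: the function name summarize_states is hypothetical, the connection state is assumed to be the last field of each TCP line, and the port match is as loose as the grep shown in this step (it can also match a remote port or an address octet). It assumes a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# summarize_states PORT
# Reads `netstat -an` output on stdin and prints "STATE COUNT" for every
# line that mentions PORT after a "." or ":" separator.
summarize_states() {
    awk -v port="$1" '
        $0 ~ ("[.:]" port "([^0-9]|$)") { count[$NF]++ }
        END { for (s in count) print s, count[s] }
    ' | sort
}

# Example (commented out so the file only defines the function):
#   netstat -an | summarize_states 80
```

Keep the raw netstat output as well; the summary is only a reading aid and does not replace the data requested in this step.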

  4. (For Solaris) Run the wshang script to capture the debug data.

    The wshang script is available at: http://www.sun.com/bigadmin/scripts/indexSjs.html

    Run the pkg_app script on one of the core files generated by the wshang script. For more information on how to run the wshang script, see To Run the wshang Script.

  5. Run the following commands and save the output.

    Solaris
    ps -aux | grep server-root
    vmstat 5 5
    iostat [ -t ] [ interval [ count ] ]
    top
    uptime
    
    HP-UX
    ps -aux | grep server-root
    vmstat 5 5
    iostat [ -t ] [ interval [ count ] ]
    top
    sar
    
    Linux
    ps -aux | grep server-root
    vmstat 5 5
    top
    uptime
    sar
    
    Windows

    Obtain the WEB process PID:

    C:\windbg-root>tlist.exe

    Obtain the process details of the running Web Server process by its PID:

    C:\windbg-root>tlist.exe web-pid
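For the UNIX and Linux commands in this step, a small wrapper can save each output to its own timestamped file. This is a minimal sketch, not a Sun-supplied script: the function name save_output and the /tmp/web-hang directory in the usage comments are hypothetical. Interactive tools such as top need their own batch options and are not covered here.

```shell
#!/bin/sh
# save_output DIR LABEL CMD [ARGS...]
# Runs CMD and writes its stdout and stderr to DIR/LABEL.<timestamp>.txt.
# If CMD is not installed, the file records that it was skipped.
# Prints the name of the file that was written.
save_output() {
    dir=$1
    label=$2
    shift 2
    mkdir -p "$dir" || return 1
    out="$dir/$label.$(date +%Y%m%d-%H%M%S).txt"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" >"$out" 2>&1
    else
        echo "skipped: $1 not found" >"$out"
    fi
    echo "$out"
}

# Examples (commented out so the file only defines the function):
#   save_output /tmp/web-hang vmstat vmstat 5 5
#   save_output /tmp/web-hang uptime uptime
```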

  6. Get the swap information.

    Solaris

    swap -l

    HP-UX

    swapinfo

    Linux

    free

    Windows

    Already provided in C:\report.txt as described in To Gather General Debug Data for Any Web Server Problem.

  7. If the Web Server uses a Directory Server, provide the access, errors and audit logs of the Directory Server used by the Web Server.

    • Access log

      UNIX (Solaris and HP-UX) and Linux

      server-root/slapd-identifier/logs/access

      Windows

      server-root\slapd-identifier\logs\access

    • Errors log

      UNIX (Solaris and HP-UX) and Linux

      server-root/slapd-identifier/logs/errors

      Windows

      server-root\slapd-identifier\logs\errors

    • Audit log

      UNIX (Solaris and HP-UX) and Linux

      server-root/slapd-identifier/logs/audit

      Windows

      server-root\slapd-identifier\logs\audit


    Note –

    The paths of these log files are specified by the following parameters in the dse.ldif file: nsslapd-accesslog, nsslapd-errorlog, and nsslapd-auditlog.

    The dse.ldif file is located in the config directory.

    UNIX (Solaris and HP-UX) and Linux

    server-root/slapd-identifier/config/dse.ldif

    Windows

    server-root\slapd-identifier\config\dse.ldif
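To confirm where the three logs actually live, the parameters named in the note can be read straight from dse.ldif. The sketch below is illustrative only: the function name ldif_log_paths is hypothetical, it assumes each parameter sits on a single unfolded "name: value" line, and it needs a POSIX awk (on Solaris, nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# ldif_log_paths FILE
# Prints name=path for the access, errors, and audit log parameters
# found in FILE, in the order they appear.
ldif_log_paths() {
    awk -F': ' '
        { attr = tolower($1) }
        attr == "nsslapd-accesslog" ||
        attr == "nsslapd-errorlog" ||
        attr == "nsslapd-auditlog" {
            # Everything after "name: " is the value, even if it contains ": ".
            print attr "=" substr($0, length($1) + 3)
        }
    ' "$1"
}

# Example (commented out so the file only defines the function):
#   ldif_log_paths server-root/slapd-identifier/config/dse.ldif
```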


  8. (For Solaris) If you are able to isolate the hanging process, get the following debug data for that process. Otherwise, get the following data for each of the Web Server processes.

    Using the PID obtained in Step 2, run each of the following commands five times, once every 10 seconds:

    pstack web-pid

    pmap -x web-pid

    Additionally, get the outputs of the following commands:

    prstat -L -p web-pid

    pfiles web-pid

    pmap web-pid
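The repeated sampling in this step can be scripted so that every run lands in its own numbered file. This is a minimal sketch under stated assumptions: the function name repeat_capture, the WEB_PID variable, and the /tmp/web-hang directory are hypothetical and not part of the product.

```shell
#!/bin/sh
# repeat_capture COUNT INTERVAL OUTDIR LABEL CMD [ARGS...]
# Runs CMD COUNT times, sleeping INTERVAL seconds between runs, and writes
# each run (stdout and stderr, with a header line) to OUTDIR/LABEL.<n>.txt.
repeat_capture() {
    count=$1
    interval=$2
    outdir=$3
    label=$4
    shift 4
    mkdir -p "$outdir" || return 1
    i=1
    while [ "$i" -le "$count" ]; do
        {
            echo "### $(date) run $i of $count: $*"
            "$@"
        } >"$outdir/$label.$i.txt" 2>&1
        # Sleep between runs, but not after the last one.
        [ "$i" -lt "$count" ] && sleep "$interval"
        i=$((i + 1))
    done
}

# Examples (commented out so the file only defines the function):
#   repeat_capture 5 10 /tmp/web-hang pstack pstack "$WEB_PID"
#   repeat_capture 5 10 /tmp/web-hang pmap-x pmap -x "$WEB_PID"
```

Numbered files keep the samples in order, which helps when comparing thread stacks from one run to the next.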

  9. Search for any core file that could have been dumped by one of the Web Server processes. If you find one, see To Gather Debug Data on Web Server Crashed Process.

  10. Get the output of the following command.

    Solaris

    truss -ealf -rall -wall -vall -o /tmp/WEBProc-PID -p web-pid

    HP-UX

    tusc -v -fealT -rall -wall -o /tmp/WEBProc-PID -p web-pid

    Linux

    strace -fv -o /tmp/WEBProc-PID.strace -p web-pid

    Windows

    Use DebugView tool. You can download this tool from http://www.sysinternals.com/Utilities/DebugView.html


    Note –

    Wait for a minute after launching the appropriate command (truss, strace, tusc, or DebugView) then stop it by pressing Control+C in the terminal where you launched the command.


  11. Get core files and the output of the following commands.

    If a process hangs, it is helpful to compare several core files to review the state of the threads over time. After each run, copy the core file to a new name so that it is not overwritten, wait approximately one minute, and then rerun the following commands. Do this three times to obtain three core files.


    Note –

    For HP-UX, you need PHKL_31876 and PHCO_32173 patches to use the gcore command. If you cannot install these patches, use the HP-UX /opt/langtools/bin/gdb command from version 3.2 and later, or the dumpcore command.


    Solaris

    cd server-root/bin/https/bin; gcore -o /tmp/web-core web-pid; pstack /tmp/web-core.web-pid

    HP-UX

    # cd server-root/bin/https/bin
    gcore -p web-pid
    (gdb) attach web-pid
    Attaching to process web-pid
    No executable file name was specified
    (gdb) dumpcore
    Dumping core to the core file core.web-pid
    (gdb) quit
    The program is running. Quit anyway (and detach it)? (y or n) y
    Detaching from program: , process web-pid
    

    Note –

    The core.web-pid should be generated in the web-identifier/config directory.


    Linux

    # cd server-root/bin/https/bin
    gdb
    (gdb) attach web-pid
    Attaching to process web-pid
    No executable file name was specified
    (gdb) gcore
    Saved corefile core.web-pid
    
    (gdb) backtrace
    (gdb) quit
    
    Windows

    Get the WEB process PID:

    C:\windbg-root>tlist.exe

    Generate a crash dump of the running Web Server process by its PID:

    C:\windbg-root>adplus.vbs -hang -p web-pid -o C:\crashdump_dir


    Note –

    For Windows, provide the complete generated folder under C:\crashdump_dir.
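For the UNIX and Linux variants of this step, the advice to copy each core file to a new name before the next run can be sketched as a helper that picks the first unused suffix. The function name preserve_core and the .copyN naming are hypothetical; any scheme that avoids overwriting earlier cores serves the same purpose.

```shell
#!/bin/sh
# preserve_core COREFILE
# Copies COREFILE to COREFILE.copy1, COREFILE.copy2, ... using the first
# suffix that does not exist yet, and prints the name of the copy.
preserve_core() {
    src=$1
    [ -f "$src" ] || return 1
    n=1
    while [ -e "$src.copy$n" ]; do
        n=$((n + 1))
    done
    cp -p "$src" "$src.copy$n" && echo "$src.copy$n"
}

# Example (commented out so the file only defines the function):
#   preserve_core core.web-pid
```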


  12. (For Solaris) Archive the output of the pkg_app script (at least one core file is required).

    ./pkg_app.ksh -c [pid-of-application or corefile] -p <full path to process binary of webservd>

    The Sun Support Center requires the output from the pkg_app script to properly analyze the core files. For more information on how to run the pkg_app script, see To Run the pkg_app Script.


    Note –

    Make sure that the appropriate limitations are set by using the ulimit command, and that the user is not nobody. Also check the coreadm command for additional control. See 1.6 Configuring Solaris to Generate Core Files if a core file is not generated.



    Note –

    If you are using Web Server 6.1 or Web Server 7.0, do not proceed to the next step.


  13. (For UNIX and Linux) If a JVM is used for the web applications, provide the JVM stack traces taken during the hang.

    A series of three to five stack traces is required.

    To enable thread dumps for version 6.0, perform the following steps:

    1. Edit the configuration file

      server-root/https-host/obj.conf

    2. Modify the following line

      Init fn="NSServletLateInit" LateInit=yes

      to

      Init LateInit="yes" fn="NSServletInit" CatchSignals="yes" Signals=SIGQUIT

    3. Add or modify the following line in /server-root/https-host/jvm12.conf

      jvm.printerrors=1

    4. Restart Web Server.


    Note –

    When a problem occurs during a restart, issuing a kill -3 against the process dumps the stack traces into the Web Server errors log.