CHAPTER 3

Advanced Lights Out Manager

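This chapter gives an overview of the Sun™ Advanced Lights Out Manager (ALOM) software. The chapter covers the following topics: an overview of ALOM, the ALOM ports, setting the admin password, basic ALOM functions, Automatic Server Restart, and environmental monitoring and control.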


Advanced Lights Out Manager Overview

The Netra 240 server is shipped with the Sun Advanced Lights Out Manager installed. The system console is directed to ALOM by default and is configured to show server console information on start-up.

ALOM enables you to monitor and control your server over either a serial connection (using the SERIAL MGT port) or an Ethernet connection (using the NET MGT port). For information on configuring an Ethernet connection, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (817-3174).



Note - The ALOM serial port, labeled SERIAL MGT, is for server management only. If you need a general-purpose serial port, use the serial port labeled 10101.



You can configure ALOM to send email notification of hardware failures and other events related to the server or to ALOM.

The ALOM circuitry uses standby power from the server, with the following results:

TABLE 3-1 lists the components that are monitored by ALOM and the information that the software provides for each component.


TABLE 3-1 Components Monitored by ALOM

Component                       Information Provided
------------------------------  --------------------------------------------------------------------
Hard drives                     Presence and status
System and CPU fans             Speed and status
CPUs                            Presence, temperature, and any thermal warning or failure conditions
Power supplies                  Presence and status
System temperature              Ambient temperature and any thermal warning or failure conditions
Server front panel              Rotary switch position and LED status
Voltage                         Status and thresholds
SCSI and USB circuit breakers   Status
Dry contact relay alarms        Status



ALOM Ports

The default management port is labeled SERIAL MGT. This port uses an RJ-45 connector and is for server management only; it supports only ASCII connections to an external console. Use this port the first time you operate the server.

Another serial port, labeled 10101, is available for general-purpose serial data transfer. This port uses a DB-9 connector. For information about pinouts, refer to the Netra 240 Server Installation Guide (part number 817-2698).

In addition, the server has one 10BASE-T Ethernet management domain interface, labeled NET MGT. You must configure ALOM before you can use this port. For information, see the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174).


Setting the admin Password

When you switch to the ALOM software after initial power-on, you see the sc> prompt. At this point, you can execute commands that require no user permissions. (Refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server, part number 817-3174, for a list of commands.) When you attempt to execute any command that requires user permissions, you are prompted to set a password for user admin.

If you are prompted to do so, set a password for the admin user.

The password must contain the following:

Once the password is set, the admin user has full permissions and can execute all ALOM CLI commands. The user is prompted to log in with the admin password when subsequently switching to ALOM.


Basic ALOM Functions

This section covers some basic ALOM functions. For comprehensive documentation, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174) and the Netra 240 Server Release Notes (817-3142).


To Switch to the ALOM Prompt

At a command prompt, type the #. (pound-period) keystroke sequence:


# #.



Note - When you switch to the ALOM prompt, you are logged in with the userid admin. See Setting the admin Password.




To Switch to the Server Console Prompt

Type:


sc> console

More than one ALOM user can be connected to the server console at a time, but only one user is permitted to type input characters to the console.

If another user is logged in and has write capability, you see the following message after typing the console command:


sc> Console session already in use. [view mode]


To Take Console Write Capability Away From Another User

Type:


sc> console -f
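The single-writer rule for console sessions can be illustrated with a small model. This is only a sketch of the behavior described above, not ALOM's implementation: the ConsoleSessions class and its method names are hypothetical, and the model assumes that a user who loses write capability stays connected in view mode.

```python
class ConsoleSessions:
    """Toy model: many users may view the console, only one may type."""

    def __init__(self):
        self.writer = None      # user who currently holds write capability
        self.viewers = set()    # users connected in view mode

    def connect(self, user, force=False):
        """Attach a user; return "write" or "view".

        force=True models `console -f`.
        """
        if self.writer is None or self.writer == user:
            self.writer = user
            self.viewers.discard(user)
            return "write"
        if force:
            # Assumption: the previous writer is demoted to view mode.
            self.viewers.add(self.writer)
            self.writer = user
            self.viewers.discard(user)
            return "write"
        self.viewers.add(user)
        return "view"


sessions = ConsoleSessions()
first = sessions.connect("admin")               # no writer yet
second = sessions.connect("operator")           # writer exists, so view mode
forced = sessions.connect("operator", force=True)
print(first, second, forced, sorted(sessions.viewers))
```

In this model the second user is placed in view mode until the forced connection takes over write capability.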


Automatic Server Restart



Note - Automatic System Recovery (ASR) is not the same as Automatic Server Restart, which the Netra 240 server also supports.



Automatic Server Restart is a component of ALOM. It monitors the Solaris OS while the OS is running and, by default, syncs the file systems and restarts the server if the OS fails.

ALOM uses a watchdog process to monitor the kernel only. ALOM does not restart the server if a process hangs and the kernel is still running. The ALOM watchdog parameters for the watchdog patting interval and the watchdog timeout are not user-configurable.

If the kernel hangs and the watchdog times out, ALOM reports and logs the event and performs one of three user-configurable actions:

For more information, see the sys_autorestart section of the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174).
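The general watchdog pattern described in this section (a monitored party "pats" a timer at intervals, and a timeout is declared if the pats stop) can be sketched as follows. This is a generic illustration, not ALOM code. The Watchdog class is hypothetical, and the 60-second timeout is an invented value, since the actual ALOM interval and timeout are not given here and are not user-configurable.

```python
import time
from typing import Callable


class Watchdog:
    """Timer that must be patted before `timeout` seconds elapse."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout = timeout
        self.clock = clock          # injectable so the sketch can be tested
        self.last_pat = clock()

    def pat(self) -> None:
        """Record that the monitored kernel is still alive."""
        self.last_pat = self.clock()

    def expired(self) -> bool:
        """True when no pat has arrived within the timeout."""
        return self.clock() - self.last_pat > self.timeout


# Demonstration with a fake clock instead of real waiting.
now = [0.0]
wd = Watchdog(timeout=60.0, clock=lambda: now[0])  # hypothetical timeout

now[0] = 30.0
wd.pat()                     # kernel is healthy and pats the timer
now[0] = 80.0
alive_check = wd.expired()   # 50 s since last pat: not expired
now[0] = 100.0
hung_check = wd.expired()    # 70 s since last pat: expired
print(alive_check, hung_check)
```

Because only the kernel pats the timer in this pattern, a hung user process does not cause a timeout, which matches the behavior described above.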

For instructions on using Automatic System Recovery (ASR), see Chapter 1.


Environmental Monitoring and Control

The Netra 240 server features an environmental monitoring subsystem designed to protect the server and its components against the following:

Monitoring and control capabilities are handled by the ALOM firmware, which ensures that monitoring capabilities remain operational even if the system has halted or is unable to boot. Also, monitoring the system from the ALOM firmware frees the system to dedicate CPU and memory resources to the operating system and application software.

The environmental monitoring subsystem uses an industry-standard I2C bus. The I2C bus is a simple two-wire serial bus used throughout the system to enable the monitoring and control of temperature sensors, fans, power supplies, status LEDs, and the front panel system control rotary switch.

The server contains three temperature sensors that monitor the ambient temperature of the server and the die temperature of the two CPUs. The monitoring subsystem polls each sensor and uses the sampled temperatures to report and respond to any overtemperature or undertemperature conditions. Additional I2C devices detect component presence and component faults.

The hardware and software together ensure that the temperatures within the enclosure do not exceed predetermined "safe operation" ranges. If the temperature observed by a sensor falls below a low-temperature warning threshold or rises above a high-temperature warning threshold, the monitoring subsystem software lights the system Service Required LEDs on the front and back panels. If the temperature condition persists and reaches a high or low soft shutdown temperature threshold, the system initiates a graceful system shutdown. If the temperature reaches a high or low hard shutdown temperature threshold, the system initiates a forced system shutdown.

Error and warning messages are sent to the system console and are logged in the /var/adm/messages file, and Service Required LEDs remain lit after an automatic system shutdown to aid in problem diagnosis.

The types of messages that are sent to the system console and are logged in the /var/adm/messages file depend on how you set the sc_clieventlevel and sys_eventlevel ALOM user variables. For information about setting these variables, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (817-3174).


TABLE 3-2 Netra 240 Server Enclosure Temperature Thresholds

Temperature Threshold             Temperature  Server Action
--------------------------------  -----------  ---------------------------------------------------------------
Low-temperature, hard shutdown    -11°C        Server initiates a forced system shutdown.
Low-temperature, soft shutdown    -9°C         Server performs a graceful system shutdown.
Low-temperature warning           -7°C         Server lights the system Service Required LED indicators on the
                                               front and back panels.
High-temperature warning          57°C         Server lights the system Service Required LED indicators on the
                                               front and back panels.
High-temperature, soft shutdown   60°C         Server performs a graceful system shutdown.
High-temperature, hard shutdown   63°C         Server initiates a forced system shutdown.
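
The thresholds in TABLE 3-2 can be read as a simple decision rule. The following sketch maps an enclosure temperature reading to the corresponding server action. The function name and return strings are hypothetical, and the code assumes that a threshold counts as crossed when the reading is at or beyond it, which the table does not state explicitly.

```python
# Threshold values in degrees Celsius, copied from TABLE 3-2.
LOW_HARD, LOW_SOFT, LOW_WARN = -11, -9, -7
HIGH_WARN, HIGH_SOFT, HIGH_HARD = 57, 60, 63


def enclosure_action(temp_c: float) -> str:
    """Return the server action for an enclosure temperature reading.

    Assumption: comparisons are inclusive (at or beyond the threshold).
    The most severe matching condition is checked first.
    """
    if temp_c <= LOW_HARD or temp_c >= HIGH_HARD:
        return "forced shutdown"
    if temp_c <= LOW_SOFT or temp_c >= HIGH_SOFT:
        return "graceful shutdown"
    if temp_c <= LOW_WARN or temp_c >= HIGH_WARN:
        return "Service Required LEDs lit"
    return "normal"


for reading in (25, 58, 61, -12):
    print(reading, enclosure_action(reading))
```

Checking the hard thresholds first ensures that a reading beyond several thresholds reports only the most severe action.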


The monitoring subsystem is also designed to detect failures of the four system blowers. If any blower fails, the monitoring subsystem detects the failure, sends an error message to the system console, logs the message in the /var/adm/messages file, and lights the Service Required LEDs.

The power subsystem is monitored in a similar manner. The monitoring subsystem polls the power supplies periodically and reports the presence of each supply and the status of its inputs and outputs.

If a power supply problem is detected, an error message is sent to the system console and is logged in the /var/adm/messages file. Additionally, LEDs located on each power supply light to indicate failures. The system Service Required LED lights to indicate a system fault. The ALOM console alerts record power supply failures.

Use the showenvironment ALOM command to view the warning thresholds of the power subsystem and the fan speeds. For instructions on using this command, refer to the Sun Advanced Lights Out Manager Software User's Guide for the Netra 240 Server (part number 817-3174).