ESX 3 ProLiant Server ASRs with HP Insight Agents

Issue

In the Integrated Management Log:

"Critical ASR mm/dd/yyyy hh:mm mm/dd/yyyy hh:mm 1 ASR Detected by System ROM"

In the iLO2 Log:

"BMC IPMI Watchdog Timer Timeout: Action=System Power Reset"
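If the Integrated Management Log or iLO2 log has been saved as plain text, the two signatures above can be counted with a small helper. This is only a sketch: the function name and the idea of a text export are assumptions, not part of the HP tooling.

```shell
#!/bin/sh
# Sketch: count ASR-related entries in a plain-text log export.
# Assumption: the IML / iLO2 log was saved to a text file by some other means;
# the function name count_asr_events is hypothetical.

count_asr_events() {
    # $1 = path to the exported log text
    # Prints the number of lines containing either signature quoted above.
    grep -c -e 'ASR Detected by System ROM' \
            -e 'BMC IPMI Watchdog Timer Timeout' "$1"
}
```

Example use: count_asr_events /tmp/iml-export.txt (the path is a placeholder).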

The 7.7.0 agents are supposed to resolve this issue, which was seen with 7.6.0b:

"Fixed an issue wherein the storage agents consumed excessive CPU time, potentially resulting in server reboots (ASRs). The CCISS device nodes are now kept open by default on all servers to workaround this issue"

This has previously been an issue with the Insight Agents:

"Advisory: (Revision) Automatic Server Recovery (ASR) Reset May Occur After Installing HP Insight Management Agents Version 7.5.1a on VMware ESX Server 3.0 Running on ProLiant Server that Supports IPMI"

Resolution

1) Disable HP Storage Agents

This removes the storage agent options from the Insight Agents, as shown below:

Before being disabled
http://theether.net/images/StorageAgentEnabled.png

After being disabled
http://theether.net/images/StorageAgentDisabled.png

From the ESX Service Console run the following command:

service hpasm reconfigure


Press <Enter> at every prompt except the following, which should be answered n:

Storage Agent Startup Policy
Do you require Storage Agent support (y/n) ? (Blank is y): n
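After the reconfigure, a quick check that no storage agent processes are still running may be useful. The sketch below filters a process listing for storage agent daemons. The process names in the pattern (cmaidad, cmafcad, cmaided, cmascsid, cmasasd) are an assumption about what the storage agents are called and should be checked against the installed agent version.

```shell
#!/bin/sh
# Sketch: list storage agent daemons found in a process listing.
# Assumption: the storage agents run under the names in the pattern below;
# verify these against your own installation before relying on the result.

storage_agents_running() {
    # Reads ps-style output on stdin and prints each matching
    # agent name once; prints nothing if none are found.
    grep -E -o 'cma(idad|fcad|ided|scsid|sasd)' | sort -u
}
```

Example use from the service console: ps -e | storage_agents_running. No output suggests the storage agents are not running.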

2) Scan for new storage devices and new VMFS volumes separately

From the VirtualCenter Client:

- Select the ESX host
- Select the Configuration tab
- Select Storage Adapters
- Click Rescan in the top right
- Uncheck Scan for New VMFS Volumes and click OK
- Once the scan has completed click Rescan again
- Uncheck Scan for New Storage Devices and click OK

NB: This is unconfirmed by VMware

"If you are using VI client 2.0.1 the rescan san will scan for both new storage adapters and new vmfs volumes at the same time which causes LUN thrashing and hangs the server. In version 2.0 it scans them one at a time."
http://virtrix.blogspot.com/2006/12/vmware-esx-freeze-on-san-rescan.html

NB: This is now confirmed by VMware (24-12-2007)

ESX Server Host and Virtual Machines Not Responding after Clicking the Rescan Button to Scan for New Storage Devices
http://kb.vmware.com/kb/10229

ESX Server 3.0.1, Patch ESX-1000039: vmkernel Update
http://kb.vmware.com/kb/1000039
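The two-pass rescan in step 2 can also be sketched from the ESX Service Console. This assumes that esxcfg-rescan only scans the named adapter for storage devices and that vmkfstools -V only refreshes VMFS volumes; neither is confirmed by the sources above. The adapter names and the DRY_RUN switch are illustrative.

```shell
#!/bin/sh
# Sketch: rescan storage devices first, then VMFS volumes, as separate passes.
# Assumptions: esxcfg-rescan <adapter> scans for new devices only and
# vmkfstools -V refreshes VMFS volumes only. rescan_separately and DRY_RUN
# are hypothetical names introduced for this example.

rescan_separately() {
    # $@ = adapter names, for example vmhba1 vmhba2
    # With DRY_RUN=1 the commands are printed instead of executed.
    run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    # Pass 1: scan each adapter for new storage devices.
    for hba in "$@"; do
        $run esxcfg-rescan "$hba" || return 1
    done
    # Pass 2: only after all device scans finish, refresh VMFS volumes.
    $run vmkfstools -V
}
```

Running it with DRY_RUN=1 first shows the exact commands that would be issued, which is a safer starting point on a host that has already hung during a rescan.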

3) Disable USB Devices in BIOS

There have been reports that disabling USB in the server BIOS has resolved this issue:

"We found that disabling USB in BIOS caused the ASR's to stop."
"We've been running for almost 3 weeks without issue since disabling USB. Prior, we consistently saw ASRs between 1-5 days uptime."
http://www.vmware.com/community/thread.jspa?threadID=56837&start=50&tstart=0

References

Managing ProLiant servers with Linux
http://theether.net/download/HP/Insight%20Agents/c00223285.pdf

HP Management Agents for VMware ESX Server 3.x version 7.7.0 (21 Dec 06)
http://h18023.www1.hp.com/support/files/server/us/download/26407.html

Advisory: (Revision) Automatic Server Recovery (ASR) Reset May Occur After Installing HP Insight Management Agents Version 7.5.1a on VMware ESX Server 3.0 Running on ProLiant Server that Supports IPMI
http://h20000.www2.hp.com/bizsupport/TechSupport/Document.jsp?objectID=c00748635

Products

VMware ESX 3.0.1
VMware ESX 3
HP Insight Agents 7.7.0
HP Insight Agents 7.6.0b
HP Insight Agents 7.5.1a

Created: 8th May 2007
Updated: 11th February 2008

© 2005-2024 Jamie Morrison