12-07-05 12:52 PM
On Wed, 07 Dec 2005 02:25:17 -0800, Ian East <ian.east@gmail> wrote:
>
>Hi,
>
>I have 10 machines in a cluster. All are exactly the same hardware
>and running debian-sarge. For 9 of them, the baseline stats are
>within about 5-10% of each other which is fairly normal. However, one
>of them has a CPU utilization and load average 10 times higher than
>the others. Upon some investigation with vmstat, I discovered this
>machine has an interrupt rate about 4 times as high as the others.
>
>My question is, how can I troubleshoot the device that's causing this
>problem? I checked all of the parameters with sysctl and nothing is
>too out of the ordinary. The vmstat parameters were also all
>reasonably close aside from CPU utilization and interrupt rate. Even
>when the machine is relatively idle the CPU still hovers around 35%
>use by system processes. The other machines would be less than 1%
>utilized.
>
>This is really driving me crazy and I need to know if it's a hardware
>problem so I can return it before the warranty expires.
>
>Thanks for any help.
I have since discovered sysstat and have a little more info. Both
machines were practically idle when these samples were taken. I take it
the second machine is toast.
This is a normal machine:
# sar -u -I XALL 30 1
Linux 2.4.26-1-686-smp (cow25) 12/07/05
Average: CPU %user %nice %system %iowait %idle
Average: all 0.50 0.00 0.50 0.00 99.00
Average: INTR intr/s
Average: 14 5.20
Average: 54 593.00
Average: 55 347.00
This is the funky machine:
Average: CPU %user %nice %system %iowait %idle
Average: all 1.46 0.00 38.67 0.00 59.87
Average: INTR intr/s
Average: 14 4.60
Average: 16 49268.90
Average: 18 49816.20
Average: 19 49961.90
Average: 54 341781.00
Average: 55 342663.20
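To make the difference between the two samples explicit, here is a minimal Python sketch that parses the "Average:" interrupt rows from both sar outputs above and lists the IRQs whose rate on the funky machine is far above the normal one. The helper names (parse_intr_rates, flag_irqs) and the 10x threshold are my own assumptions, and an IRQ that does not appear in the normal output is treated as a rate of zero, which may not hold if rows were simply trimmed.

```python
import re

# Interrupt rows copied from the two sar samples in this post.
NORMAL = """\
Average: CPU %user %nice %system %iowait %idle
Average: all 0.50 0.00 0.50 0.00 99.00
Average: INTR intr/s
Average: 14 5.20
Average: 54 593.00
Average: 55 347.00
"""

FUNKY = """\
Average: CPU %user %nice %system %iowait %idle
Average: all 1.46 0.00 38.67 0.00 59.87
Average: INTR intr/s
Average: 14 4.60
Average: 16 49268.90
Average: 18 49816.20
Average: 19 49961.90
Average: 54 341781.00
Average: 55 342663.20
"""

# Matches only rows of the form "Average: <irq> <rate>", so the CPU rows
# (which carry more columns) and the header rows are skipped.
ROW = re.compile(r"^\s*Average:\s+(\d+)\s+(\d+(?:\.\d+)?)\s*$")


def parse_intr_rates(sar_text):
    """Return {irq_number: interrupts_per_second} from sar -I output."""
    rates = {}
    for line in sar_text.splitlines():
        match = ROW.match(line)
        if match:
            rates[int(match.group(1))] = float(match.group(2))
    return rates


def flag_irqs(normal, suspect, factor=10.0):
    """List (irq, normal_rate, suspect_rate) where the suspect rate is at
    least `factor` times the normal one. IRQs missing from `normal` are
    assumed to be idle there (rate 0.0) and are flagged if active."""
    flagged = []
    for irq, rate in sorted(suspect.items()):
        base = normal.get(irq, 0.0)
        if (base == 0.0 and rate > 0.0) or (base > 0.0 and rate / base >= factor):
            flagged.append((irq, base, rate))
    return flagged


flagged = flag_irqs(parse_intr_rates(NORMAL), parse_intr_rates(FUNKY))
for irq, base, rate in flagged:
    print(f"IRQ {irq}: {base:.2f} -> {rate:.2f} intr/s")
```

With the numbers above, this flags IRQs 16, 18, 19, 54 and 55 and leaves IRQ 14 alone, since its rate is about the same on both machines.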
Here are the devices... The machines are identical:
# lspci -v
0000:00:00.0 Host bridge: Intel Corp. Server Memory Controller Hub
(rev 0c)
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, fast devsel, latency 0
Capabilities: [40] #09 [4105]
0000:00:00.1 ff00: Intel Corp. Memory Controller Hub Error Reporting
Register (rev 0c)
Subsystem: Intel Corp.: Unknown device 1079
Flags: fast devsel
0000:00:01.0 System peripheral: Intel Corp. Memory Controller Hub DMA
Controller (rev 0c)
Subsystem: Intel Corp.: Unknown device 1079
Flags: fast devsel, IRQ 16
Memory at fcdff000 (32-bit, non-prefetchable) [disabled] [size=4K]
Capabilities: [b0] Message Signalled Interrupts: 64bit- Queue=0/1
Enable-
0000:00:02.0 PCI bridge: Intel Corp. Memory Controller Hub PCI Express
Port A0 (rev 0c) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=01, subordinate=03, sec-latency=0
I/O behind bridge: 0000d000-0000dfff
Memory behind bridge: fce00000-fcffffff
Capabilities: [50] Power Management version 2
Capabilities: [58] Message Signalled Interrupts: 64bit- Queue=0/1
Enable-
Capabilities: [64] #10 [0041]
0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB
UHCI #1 (rev 02) (prog-if 00 [UHCI])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, medium devsel, latency 0, IRQ 16
I/O ports at c800 [size=32]
0000:00:1d.1 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB
UHCI #2 (rev 02) (prog-if 00 [UHCI])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, medium devsel, latency 0, IRQ 19
I/O ports at c880 [size=32]
0000:00:1d.2 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB
UHCI #3 (rev 02) (prog-if 00 [UHCI])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, medium devsel, latency 0, IRQ 18
I/O ports at cc00 [size=32]
0000:00:1d.7 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2
EHCI Controller (rev 02) (prog-if 20 [EHCI])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, medium devsel, latency 0, IRQ 23
Memory at fcdfec00 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] #0a [20a0]
0000:00:1e.0 PCI bridge: Intel Corp. 82801 PCI Bridge (rev c2)
(prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=04, subordinate=04, sec-latency=32
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: fd000000-febfffff
0000:00:1f.0 ISA bridge: Intel Corp. 82801EB/ER (ICH5/ICH5R) LPC
Bridge (rev 02)
Flags: bus master, medium devsel, latency 0
0000:00:1f.2 IDE interface: Intel Corp. 82801EB (ICH5) Serial ATA 150
Storage Controller (rev 02) (prog-if 8a [Master SecP PriP])
Subsystem: Intel Corp.: Unknown device 3437
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 18
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at <unassigned>
I/O ports at fc00 [size=16]
0000:00:1f.3 SMBus: Intel Corp. 82801EB/ER (ICH5/ICH5R) SMBus
Controller (rev 02)
Subsystem: Intel Corp.: Unknown device 1079
Flags: medium devsel, IRQ 17
I/O ports at 0540 [size=32]
0000:01:00.0 PCI bridge: Intel Corp. PCI Bridge Hub A (rev 09)
(prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=01, secondary=02, subordinate=02, sec-latency=64
Capabilities: [44] #10 [0071]
Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0
Enable-
Capabilities: [6c] Power Management version 2
Capabilities: [d8] PCI-X bridge device.
0000:01:00.1 PIC: Intel Corp. PCI Bridge Hub I/OxAPIC Interrupt
Controller A (rev 09) (prog-if 20 [IO(X)-APIC])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, fast devsel, latency 0
Memory at fcefe000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] #10 [0001]
Capabilities: [6c] Power Management version 2
0000:01:00.2 PCI bridge: Intel Corp. PCI Bridge Hub B (rev 09)
(prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=01, secondary=03, subordinate=03, sec-latency=64
I/O behind bridge: 0000d000-0000dfff
Memory behind bridge: fcf00000-fcffffff
Capabilities: [44] #10 [0071]
Capabilities: [5c] Message Signalled Interrupts: 64bit+ Queue=0/0
Enable-
Capabilities: [6c] Power Management version 2
Capabilities: [d8] PCI-X bridge device.
0000:01:00.3 PIC: Intel Corp. PCI Bridge Hub I/OxAPIC Interrupt
Controller B (rev 09) (prog-if 20 [IO(X)-APIC])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, fast devsel, latency 0
Memory at fceff000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [44] #10 [0001]
Capabilities: [6c] Power Management version 2
0000:03:04.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet
Controller (rev 03)
Subsystem: Intel Corp. PRO/1000 MT Dual Port Network Connection
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 54
Memory at fcfa0000 (64-bit, non-prefetchable) [size=128K]
I/O ports at d880 [size=64]
Capabilities: [dc] Power Management version 2
Capabilities: [e4] PCI-X non-bridge device.
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0
Enable-
0000:03:04.1 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet
Controller (rev 03)
Subsystem: Intel Corp. PRO/1000 MT Dual Port Network Connection
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 55
Memory at fcfe0000 (64-bit, non-prefetchable) [size=128K]
I/O ports at dc00 [size=64]
Capabilities: [dc] Power Management version 2
Capabilities: [e4] PCI-X non-bridge device.
Capabilities: [f0] Message Signalled Interrupts: 64bit+ Queue=0/0
Enable-
0000:04:0c.0 VGA compatible controller: ATI Technologies Inc Rage XL
(rev 27) (prog-if 00 [VGA])
Subsystem: Intel Corp.: Unknown device 1079
Flags: bus master, stepping, medium devsel, latency 64, IRQ 17
Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
I/O ports at e800 [size=256]
Memory at febff000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at febc0000 [disabled] [size=128K]
Capabilities: [5c] Power Management version 2
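
To tie the busy IRQ numbers back to hardware, the "IRQ n" field on each Flags line of the lspci listing can be collected per device. Below is a rough Python sketch of that lookup. It assumes the listing looks like the one above (a slot address at the start of each device header, possibly wrapped onto a second line, followed by a Flags line); the irq_map name and the three-device excerpt are just for illustration.

```python
import re
from collections import defaultdict

# A short excerpt of the lspci -v listing above, line wrapping included.
SAMPLE = """\
0000:00:1d.0 USB Controller: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB
UHCI #1 (rev 02) (prog-if 00 [UHCI])
Flags: bus master, medium devsel, latency 0, IRQ 16
0000:00:1f.2 IDE interface: Intel Corp. 82801EB (ICH5) Serial ATA 150
Storage Controller (rev 02) (prog-if 8a [Master SecP PriP])
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 18
0000:03:04.0 Ethernet controller: Intel Corp. 82546GB Gigabit Ethernet
Controller (rev 03)
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 54
"""

# Device header: optional PCI domain, then bus:device.function, then text.
SLOT = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+(.*)$")
IRQ = re.compile(r"\bIRQ (\d+)\b")


def irq_map(lspci_text):
    """Return {irq: ["<slot> <device class>", ...]} from lspci -v text."""
    devices = defaultdict(list)
    current = None
    for raw in lspci_text.splitlines():
        line = raw.strip()
        header = SLOT.match(line)
        if header:
            # Keep the slot and the device class (the text before the
            # first colon); the rest of the name may be wrapped.
            device_class = header.group(2).split(":")[0]
            current = f"{header.group(1)} {device_class}"
            continue
        if current and line.startswith("Flags:"):
            irq = IRQ.search(line)
            if irq:
                devices[int(irq.group(1))].append(current)
    return dict(devices)


mapping = irq_map(SAMPLE)
for irq in sorted(mapping):
    print(irq, mapping[irq])
```

Going by the Flags lines in the full listing, IRQ 16 is shared by the MCH DMA controller and UHCI #1, IRQ 18 by UHCI #3 and the SATA controller, IRQ 19 belongs to UHCI #2, and IRQs 54 and 55 are the two ports of the 82546GB gigabit NIC.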