Albert Chu
2014-09-26 18:11:38 UTC
Hello,
Once in a while I get a bug report to FreeIPMI saying, "ipmitool outputs
that a sensor says 'ok', but FreeIPMI outputs that something is wrong
with a sensor."
After investigation, it appears that ipmitool's output with 'sdr list'
lists 'ok' for many (all?) discrete sensors regardless of the contents
of that reading. As long as the reading is available and valid, it
outputs "ok". For example, here's a power supply sensor I got on a node
here.
PSU 1 Status | 0x0b | ok
0x0b is not good for this sensor; you usually want to see 0x00 or 0x01.
Yet it still says 'ok'.
FreeIPMI reports the same sensor as:
54 | PSU 1 Status | Power Supply | N/A | N/A | 'Presence detected' 'Power Supply Failure detected' 'Power Supply input lost (AC/DC)'
So there's something wrong with this power supply, or it's at least worth
investigating for the staff.
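
To illustrate why 0x0b maps to those three strings, here is a minimal
sketch of decoding the value, assuming it is the low byte of the sensor's
discrete state bitmask and that the sensor uses the sensor-specific offsets
for the Power Supply sensor type (08h) from the IPMI specification. The
table and function names are hypothetical and are not taken from the
ipmitool or FreeIPMI source; the offset names are paraphrased.

```python
# Sensor-specific offsets for the Power Supply sensor type (08h).
# Names are paraphrased from the IPMI specification's sensor type table.
POWER_SUPPLY_OFFSETS = {
    0: "Presence detected",
    1: "Power Supply Failure detected",
    2: "Predictive Failure",
    3: "Power Supply input lost (AC/DC)",
    4: "Power Supply input lost or out-of-range",
    5: "Power Supply input out-of-range, but present",
    6: "Configuration error",
    7: "Power Supply Inactive",
}


def decode_power_supply(state):
    """Return the names of every offset whose bit is set in `state`."""
    return [
        name
        for bit, name in sorted(POWER_SUPPLY_OFFSETS.items())
        if state & (1 << bit)
    ]


# 0x0b is binary 00001011, so bits 0, 1 and 3 are asserted.
for name in decode_power_supply(0x0B):
    print(name)
```

Under these assumptions, 0x01 decodes to 'Presence detected' alone and 0x00
to nothing asserted, which is why those two values are the ones you usually
want to see.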
At minimum, the "ok" output appears to confuse some users. At worst,
some users may think the sensor readings are good when they are in fact
not. Perhaps an output of "N/A" would be more appropriate for discrete
sensors in this case?
Obviously there's a lot of history in this output with ipmitool, but I
thought I'd mention it for discussion.
Al
--
Albert Chu
***@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory