[Ipmitool-devel] Reading GPU temperature via IPMI
Vahid
2010-12-29 07:09:00 UTC
Hello,
I can read the CPU_TEMP, CPU_FAN, ... sensors. Is it possible to add GPU
temperature or GPU fan speed to IPMI?
Thanks
__
Vahid
Carsten Aulbert
2010-12-29 09:18:27 UTC
Hi
Post by Vahid
I can read the CPU_TEMP, CPU_FAN, ... sensors. Is it possible to add GPU
temperature or GPU fan speed to IPMI?
I'm not a developer, only a user, but I doubt it. I'm not aware of any
standardized interface for GPU temperatures. We, for example, need to use
NVIDIA's nvidia-smi to gather those:

gpu010:~# nvidia-smi --query-gpu-info --gpu=0

==============NVSMI LOG==============


Timestamp : Wed Dec 29 10:17:20 2010

Driver Version : 260.19.21


GPU 0:
    Product Name : Tesla C2050
    PCI Device/Vendor ID : 6d110de
    PCI Location ID : 0:A:0
    Board Serial : 0321610134404
    Display : Not connected
    Temperature : 68 C
    Fan Speed : 30%
    Utilization
        GPU : 0%
        Memory : 0%
    Volatile ECC errors :
        Single bit :
            FB : 0
            RF : 0
            L1 : 0
            L2 : 0
            Total : 0
        Double bit :
            FB : 0
            RF : 0
            L1 : 0
            L2 : 0
            Total : 0
    Aggregate ECC errors :
        Single bit :
            Total : 0
        Double bit :
            Total : 0

Maybe the developers know of a way to "link" these, but I fear this
needs a standard which does not yet exist...

Cheers

Carsten
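
For anyone who wants to collect these values in a script, here is a minimal Python sketch that pulls the temperature and fan speed out of an NVSMI log like the one above. It assumes the "Key : value" layout shown in this thread (driver 260.x); other driver versions may print something different. The function name parse_nvsmi_log and the dictionary keys are made up for illustration and are not part of nvidia-smi.

```python
import re

# Patterns for the "Key : value" layout shown in the log above.
# This layout is an assumption taken from this thread's output only.
GPU_HEADER = re.compile(r"^GPU\s+(\d+)\s*:\s*$")
TEMPERATURE = re.compile(r"^Temperature\s*:\s*(\d+)\s*C$")
FAN_SPEED = re.compile(r"^Fan Speed\s*:\s*(\d+)\s*%$")


def parse_nvsmi_log(text):
    """Return {gpu_index: {"temperature_c": int|None, "fan_percent": int|None}}.

    A value stays None when the log has no matching line for that GPU.
    """
    gpus = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = GPU_HEADER.match(line)
        if header:
            # Start of a new "GPU n:" section.
            current = gpus.setdefault(
                int(header.group(1)),
                {"temperature_c": None, "fan_percent": None},
            )
            continue
        if current is None:
            continue  # still in the preamble (timestamp, driver version)
        temp = TEMPERATURE.match(line)
        if temp:
            current["temperature_c"] = int(temp.group(1))
            continue
        fan = FAN_SPEED.match(line)
        if fan:
            current["fan_percent"] = int(fan.group(1))
    return gpus
```

Fed the log above, this should give temperature_c 68 and fan_percent 30 for GPU 0. If the log has no Temperature line, the value is left as None rather than guessed.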
Ralf Utermann
2011-01-10 14:16:01 UTC
Carsten Aulbert wrote:
[...]
Post by Carsten Aulbert
Driver Version : 260.19.21
Product Name : Tesla C2050
PCI Device/Vendor ID : 6d110de
PCI Location ID : 0:A:0
Board Serial : 0321610134404
Display : Not connected
Temperature : 68 C
Fan Speed : 30%
[...]

I don't see the temperature in the nvidia-smi output on our Supermicro
6016GT-TF-FM205 systems with two Tesla M2050 cards:
# nvidia-smi --query-gpu-info --gpu=0

Driver Version : 260.19.12

GPU 0:
Product Name : Tesla M2050
PCI Device/Vendor ID : 6de10de
PCI Location ID : 0:2:0
Board Serial : 0322310084062
Display : Not connected
Utilization
GPU : 99%
...

Could this be a driver version problem?

ipmitool does show GPU temperatures:
# ipmitool sdr type Temperature|grep GPU
GPU1 Temp | 18h | ok | 7.1 | 71 degrees C
GPU2 Temp | 19h | ok | 7.1 | 52 degrees C

but does not show CPU temperatures:

# ipmitool sdr type Temperature|grep CPU
# ipmitool sensor |grep CPU
CPU1 Temp | 0x0 | discrete | 0x0000| na | na | na | na | na | na
CPU2 Temp | 0x0 | discrete | 0x0000| na | na | na | na | na | na

This is on openSUSE 11.2.
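
In case it helps with scripting, here is a rough Python sketch that turns "ipmitool sdr type Temperature" lines like the GPU ones above into numbers. It assumes the five-column "name | id | status | entity | reading" layout shown here. The function name parse_sdr_temperatures and its name_filter parameter are hypothetical helpers, not anything ipmitool provides.

```python
import re

# Matches readings such as "71 degrees C" in the last column.
READING = re.compile(r"^(-?\d+(?:\.\d+)?)\s+degrees C$")


def parse_sdr_temperatures(text, name_filter=""):
    """Map sensor name -> temperature in degrees C (None if not numeric).

    Only lines with exactly five '|'-separated columns are considered,
    which is the layout assumed from the output in this thread.
    """
    readings = {}
    for line in text.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 5:
            continue  # e.g. the wider "ipmitool sensor" table
        name, _sensor_id, _status, _entity, reading = fields
        if name_filter not in name:
            continue
        match = READING.match(reading)
        readings[name] = float(match.group(1)) if match else None
    return readings
```

With the two GPU lines above and name_filter="GPU", this should return 71.0 for "GPU1 Temp" and 52.0 for "GPU2 Temp". The ten-column "ipmitool sensor" lines are skipped on purpose, since their layout differs.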

@Ryan: Do you still see CPU temperatures on your Dell PowerEdge?
Which OS and ipmitool version?

Best regards, Ralf
--
Ralf Utermann
_____________________________________________________________________
Universität Augsburg, Institut für Physik -- EDV-Betreuer
Universitätsstr.1
D-86135 Augsburg Phone: +49-821-598-3231
SMTP: ***@Physik.Uni-Augsburg.DE Fax: -3411
Al Chu
2011-01-03 17:48:32 UTC
I don't believe there is anything in the IPMI spec that would stop a
hardware manufacturer from supporting GPU temperatures via IPMI.
However, the manufacturer would have to support it. If they don't,
there's likely not much you can do about it.

Al
Post by Vahid
Hello,
I can read the CPU_TEMP, CPU_FAN, ... sensors. Is it possible to add GPU
temperature or GPU fan speed to IPMI?
Thanks
__
Vahid
--
Albert Chu
***@llnl.gov
Computer Scientist
High Performance Systems Division
Lawrence Livermore National Laboratory
Ryan Cox
2011-01-03 15:44:11 UTC
On a Dell PowerEdge M610x blade you can do it with "ipmitool sdr type
Temperature" or "ipmitool sdr entity 26.6". It shows up as "GPU
Temp1". I haven't checked it for accuracy though.
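
If you want to call those two commands from a script, a small wrapper along these lines might do. This is only a sketch: the function name ipmitool_sdr and its runner parameter are invented here for illustration, and it assumes ipmitool is on the PATH and that you have permission to talk to the BMC.

```python
import subprocess


def ipmitool_sdr(*selector, runner=subprocess.run):
    """Run "ipmitool sdr <selector...>" and return its standard output.

    The runner argument (hypothetical, for testing) defaults to
    subprocess.run; check=True raises CalledProcessError on failure.
    """
    argv = ["ipmitool", "sdr", *selector]
    result = runner(argv, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, ipmitool_sdr("type", "Temperature") and ipmitool_sdr("entity", "26.6") correspond to the two commands above. The entity ID 26.6 is what this particular blade reports and will likely differ on other hardware.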
Post by Vahid
Hello,
I can read the CPU_TEMP, CPU_FAN, ... sensors. Is it possible to add GPU
temperature or GPU fan speed to IPMI?
Thanks
__
Vahid
--
Ryan Cox
Systems Administrator
Fulton Supercomputing Lab
Brigham Young University