Similar to, but not the same as, T125205 (which is only concerned with the BMC's sensors): we should monitor the BMC's event log (whether it's the IPMI SEL, HP's IML, or something similar) for certain critical events. Consider this, from the T130702 investigation:
root@es2017:~# ipmitool sel list
   1 | 02/08/2016 | 16:06:18 | Event Logging Disabled #0x72 | Log area reset/cleared | Asserted
   2 | 05/26/2016 | 12:22:06 | Processor #0x61 | IERR | Asserted
   3 | 05/26/2016 | 12:24:04 | Unknown #0x2e | | Asserted
   4 | 05/26/2016 | 12:24:04 | Memory #0x02 | Uncorrectable ECC (UnCorrectable ECC | DIMMA2) | Asserted
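
As a rough illustration of what such a check could look like, here is a minimal sketch that parses `ipmitool sel list` output and flags entries like the IERR and uncorrectable ECC events above. The list of "critical" patterns, the function names, and the Nagios-style exit codes are all assumptions made for the example, not an agreed design; HP's IML would need a different reader.

```python
import subprocess

# Assumption: substrings that mark an event as critical. This list is only
# illustrative; the real set of events to alert on still has to be decided.
CRITICAL_PATTERNS = ("IERR", "Uncorrectable ECC")


def read_sel():
    """Return the raw output of `ipmitool sel list` (needs root and a BMC)."""
    result = subprocess.run(
        ["ipmitool", "sel", "list"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_sel(text):
    """Parse `ipmitool sel list` output into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 6:
            continue  # shell prompt, blank line, or other non-entry text
        try:
            record_id = int(fields[0], 16)  # ipmitool prints record IDs in hex
        except ValueError:
            continue
        entries.append({
            "id": record_id,
            "date": fields[1],
            "time": fields[2],
            "sensor": fields[3],
            # The description may itself contain "|", so rejoin the middle.
            "event": " | ".join(fields[4:-1]),
            "state": fields[-1],
        })
    return entries


def critical_events(entries):
    """Return the asserted entries whose description matches a pattern."""
    return [
        e for e in entries
        if e["state"] == "Asserted"
        and any(p.lower() in e["event"].lower() for p in CRITICAL_PATTERNS)
    ]


def check(text):
    """Return a Nagios-style (exit_code, message) pair for the given SEL dump."""
    crit = critical_events(parse_sel(text))
    if crit:
        details = "; ".join(f"{e['sensor']}: {e['event']}" for e in crit)
        return 2, f"CRITICAL: {len(crit)} critical SEL event(s): {details}"
    return 0, "OK: no critical SEL events"
```

A monitoring plugin would then call `check(read_sel())`, print the message, and exit with the returned code. Matching on substrings is the simplest possible approach; a real check would probably also want to track which record IDs have already been acknowledged, so that an old event does not alert forever.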