Problem with Overheating then reboot since Ubuntu 11.10

From ThinkWiki
Revision as of 11:22, 13 December 2012 by Kartoch (Talk | contribs)

Symptoms

Since Ubuntu 11.10, my T500 has reduced battery life (around 3 hours with low screen brightness, 2 hours in normal usage) and reboots under high CPU load because of overheating (more than 100 °C). This is clearly a software bug, as I didn't see this behavior in Ubuntu 11.04. It has appeared since 11.10 (the first Ubuntu release with Linux kernel 3.x).

Trying to find a solution...

Fan control

To set your fan to max:

# sudo rmmod thinkpad_acpi
# sudo modprobe thinkpad_acpi fan_control=1
# echo "level 127" > /proc/acpi/ibm/fan

But this is not a problem with fan control. Whatever the fan speed (disengaged, or set manually to full speed with level 127), my ThinkPad T500 still reboots after less than one minute of high CPU load (I didn't have this problem before Ubuntu 11.10).
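To confirm that a fan setting was actually applied, the driver's status file can be read back. The following is only a minimal sketch: the helper names fan_field and fan_control_enabled are made up for illustration, and it assumes the usual "key: value" layout of /proc/acpi/ibm/fan, where a "commands:" line is shown only when thinkpad_acpi was loaded with fan_control=1.

```shell
# Print the value of one field ("status", "speed" or "level") from a
# thinkpad_acpi fan status file such as /proc/acpi/ibm/fan.
# Both function names are hypothetical helpers, not part of any tool.
fan_field() {
    awk -v key="$2:" '$1 == key { print $2 }' "$1"
}

# Succeed only if the status file lists write commands. Assumption: the
# driver prints "commands:" lines only when fan control is enabled.
fan_control_enabled() {
    grep -q '^commands:' "$1"
}
```

For example, fan_field /proc/acpi/ibm/fan level should print the current level, and fan_field /proc/acpi/ibm/fan speed the speed in RPM.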

Temporary fix

When my laptop is on battery, the temperature does not exceed 75 °C, so it does not overheat and reboot.
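To compare temperatures on AC power and on battery, something like the following could be used to log readings. This is a sketch under assumptions: the function names are invented for illustration, and the default paths (thermal_zone0 and a power supply named AC) vary between machines and may need adjusting.

```shell
# Convert a sysfs thermal reading (millidegrees Celsius) to whole degrees.
# The file path is passed in, since the thermal zone number varies.
read_temp_c() {
    millideg=$(cat "$1") || return 1
    echo $((millideg / 1000))
}

# Report "ac" or "battery" from a power_supply "online" file (1 = plugged in).
power_source() {
    if [ "$(cat "$1")" = "1" ]; then
        echo ac
    else
        echo battery
    fi
}

# Print one log line: time, power source, temperature.
# Both default paths are assumptions; adjust them to your machine.
log_temp() {
    temp_file=${1:-/sys/class/thermal/thermal_zone0/temp}
    ac_file=${2:-/sys/class/power_supply/AC/online}
    echo "$(date +%T) $(power_source "$ac_file") $(read_temp_c "$temp_file")C"
}
```

Running "while sleep 5; do log_temp; done" in a terminal while loading the CPU, once plugged in and once on battery, should show whether the difference is reproducible.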

Turbo mode

  • Looking with powertop, I see:
      * in frequency stats: both CPUs usually spend about 33% of the time in turbo mode (with no high CPU usage from my applications)
      * in device stats: many of my PCI devices are active 100% of the time, which seems high

So I think we are dealing with several bugs here: one with the fan, but possibly also one with ASPM, which seems to be disabled on my computer:

$ dmesg | grep ASPM
[ 0.160380] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it
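Besides dmesg, the policy the kernel is currently using can be read from sysfs, where the active choice is marked with brackets. The helper below is only a sketch (the name active_aspm_policy is hypothetical), and it assumes the kernel was built with ASPM support so that the policy file exists.

```shell
# Print the active policy from a pcie_aspm policy file. The kernel marks
# the current choice with brackets, e.g. "[default] performance powersave".
# The function name is a made-up helper for illustration.
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "$1"
}
```

Typical use would be active_aspm_policy /sys/module/pcie_aspm/parameters/policy. Booting with the kernel parameter pcie_aspm=force is one way to test whether ASPM matters here, but forcing it on a machine whose firmware declares it unsupported may cause hangs, so treat it as an experiment only.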


Basically this is what I'm seeing on my i7 X220: even though the CPU reaches 97 °C with the fan at full speed, it stays in turbo mode no matter what, as verified with powertop. Maybe somebody knows more about ThinkPad throttling in turbo mode?
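One way to test whether turbo mode is involved might be to cap the maximum CPU frequency just below the turbo step. The sketch below rests on an assumption about the acpi-cpufreq driver: that it lists frequencies in descending order and advertises turbo as an entry exactly 1000 kHz above the nominal maximum. The function name max_non_turbo_khz is invented for illustration.

```shell
# Print the highest non-turbo frequency (kHz) from a cpufreq
# scaling_available_frequencies file. Assumption: turbo shows up as a
# first entry exactly 1000 kHz above the second one; if no such pair is
# found, the first entry is printed unchanged.
max_non_turbo_khz() {
    awk 'NR == 1 {
        if (NF >= 2 && $1 - $2 == 1000) print $2
        else print $1
    }' "$1"
}
```

The result of max_non_turbo_khz /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies could then be written, as root, to the scaling_max_freq file of each CPU. If the reboots stop, that would point towards turbo mode rather than the fan.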

Clear the dust

Finally, I cannot say whether opening the laptop and removing the dust would help, as the computer belongs to my employer and is under warranty.


ACPI

This might be related to this bug report: https://bugzilla.kernel.org/show_bug.cgi?id=42858

@Matthias: the patch in comment 5 of the bug report is already present in the latest 12.04 kernel, so this is not the solution.

Strangely, though, the symptoms seem very similar. I've added a comment to the bug report:

https://bugzilla.kernel.org/show_bug.cgi?id=42858#8


Data

List of machines with the same problem
Model
T500 - type 2082

References

http://marc.info/?l=linux-acpi&m=132854533918079&w=2