How to make use of Dynamic Frequency Scaling

From ThinkWiki
Revision as of 11:00, 26 June 2007 by Benjamin Pineau (Talk | contribs) (A note about CPU throttling: link for the intel paper)

General

Linux supports dynamic frequency scaling for systems with the following processors:

Configuring the Kernel

2.4 Kernels

There were various frequency scaling implementations in the 2.4 series of kernels. They were all preliminary, and a standard interface only emerged with the introduction of the sysfs filesystem in 2.6 kernels. It is recommended to use a 2.6 kernel, if possible.

2.6 Kernels

You need to enable CPU frequency scaling in your kernel (usually your distro's kernel will have this enabled):

<*> (CONFIG_CPU_FREQ)

You also need to enable the governors, if this is not already done in your distro's default kernel:

<*> (CONFIG_CPU_FREQ_GOV_PERFORMANCE)
<*> (CONFIG_CPU_FREQ_GOV_POWERSAVE)
<*> (CONFIG_CPU_FREQ_GOV_USERSPACE)

Since 2.6.10 there is the ondemand governor, which does CPU frequency scaling in the kernel and can be used as an alternative to powernowd etc. It can be enabled with:

<*> (CONFIG_CPU_FREQ_GOV_ONDEMAND)

Since 2.6.12 there is the conservative governor, which works similarly to the ondemand governor.

<*> (CONFIG_CPU_FREQ_GOV_CONSERVATIVE)

ondemand and conservative differ in the way they scale up and down. The ondemand governor switches to the highest frequency immediately when there is load, while the conservative governor increases frequency step by step. Likewise they behave the other way round for stepping down frequency when the CPU is idle.
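The difference in stepping up can be illustrated with a toy model. This is only a sketch of the behaviour described above, not the kernel's actual algorithm; the function name next_index_under_load and the idea of indexing into a frequency table are assumptions made for the example.

```shell
# Toy model: which entry of a frequency table does a governor move to
# when the CPU comes under load? Indices run from 0 (lowest) to
# "highest". This is an illustration, not kernel code.
next_index_under_load() {
    governor=$1 current=$2 highest=$3
    case $governor in
        ondemand)
            # Jump straight to the highest frequency.
            echo "$highest" ;;
        conservative)
            # Move up by a single step, never past the top.
            if [ "$current" -lt "$highest" ]; then
                echo $((current + 1))
            else
                echo "$highest"
            fi ;;
        *)
            # Other governors are not modelled here.
            echo "$current" ;;
    esac
}
```

With a four-entry table (indices 0 to 3) and the CPU at index 0, "next_index_under_load ondemand 0 3" prints 3, while "next_index_under_load conservative 0 3" prints 1.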

If you own a Dothan processor, you need to enable Enhanced SpeedStep functionalities.

<*> (CONFIG_X86_SPEEDSTEP_CENTRINO)

Alternatively, there seems to be some advocacy of a switch over to

<*> (CONFIG_X86_ACPI_CPUFREQ)

for controlling SpeedStep technology. Apparently this newer kernel option is more aware of the ACPI power saving that is done in the BIOS. In any case, both <*> (CONFIG_X86_SPEEDSTEP_CENTRINO) and <*> (CONFIG_X86_ACPI_CPUFREQ) were found to work on a Merom (Core 2).
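To find out how the options above are set in an existing kernel, you can search its config file. The following is a minimal sketch, assuming the config is installed as /boot/config-$(uname -r) (common on Debian, but not guaranteed elsewhere). The function name cpufreq_config_status is made up for this example.

```shell
# Report how each cpufreq-related option is set in a kernel config file.
# Prints "OPTION=y", "OPTION=m" or "OPTION=not set", one per line.
# cpufreq_config_status is a hypothetical helper, not an existing tool.
cpufreq_config_status() {
    cfg_file=${1:-/boot/config-$(uname -r)}   # assumed default location
    for opt in CONFIG_CPU_FREQ \
               CONFIG_CPU_FREQ_GOV_PERFORMANCE \
               CONFIG_CPU_FREQ_GOV_POWERSAVE \
               CONFIG_CPU_FREQ_GOV_USERSPACE \
               CONFIG_CPU_FREQ_GOV_ONDEMAND \
               CONFIG_CPU_FREQ_GOV_CONSERVATIVE \
               CONFIG_X86_SPEEDSTEP_CENTRINO \
               CONFIG_X86_ACPI_CPUFREQ; do
        # Anchor on "=" so CONFIG_CPU_FREQ does not also match
        # CONFIG_CPU_FREQ_GOV_ONDEMAND and friends.
        val=$(sed -n "s/^$opt=\([ym]\)\$/\1/p" "$cfg_file")
        echo "$opt=${val:-not set}"
    done
}
```

Called without arguments it reads the running kernel's config; pass a path to inspect another file. "y" means built in, "m" means built as a module.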

2.6 doing it with modules

With the Debian flavour of 2.6.21 and possibly earlier versions as well as other distros, all of the above kernel options are compiled as modules out of the box. You'll have to load them yourself to get speedstep functionality. This can be done simply enough in an /etc/modules file with the lines

acpi-cpufreq
cpufreq_ondemand
cpufreq_userspace
cpufreq_conservative
cpufreq_powersave

Of course, this is a bit excessive if you're only going to use one governor (see below); you only need to load the modules for the governor(s) you are going to use. In my case "performance" does not seem to be a loadable module, probably because it is built into the kernel as the default. Finally, if you're a fan of useless statistics, you can load cpufreq_stats to see how long your processor spends in each state and how many times it transitions, with

cat /sys/devices/system/cpu/cpu0/cpufreq/stats/total_trans
cat /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state
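The raw time_in_state counters are easier to read as percentages. The file holds one line per frequency, with the frequency in kHz followed by the time spent at it. The sketch below converts that into a share per frequency; the function name time_in_state_percent is hypothetical, and the default path assumes cpufreq_stats is loaded for cpu0.

```shell
# Print the share of time spent at each frequency, given a cpufreq
# stats time_in_state file (each line: "<frequency in kHz> <time>").
# time_in_state_percent is a hypothetical helper name.
time_in_state_percent() {
    stats_file=${1:-/sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state}
    awk '
        # Remember every line and keep a running total of the time column.
        { freq[NR] = $1; ticks[NR] = $2; total += $2 }
        END {
            for (i = 1; i <= NR; i++) {
                pct = (total > 0) ? 100 * ticks[i] / total : 0
                printf "%d MHz: %.1f%%\n", freq[i] / 1000, pct
            }
        }
    ' "$stats_file"
}
```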

Using the Sys Interface

The files in /sys/devices/system/cpu/cpu0/cpufreq/ provide information about the frequency scaling subsystem and a means of controlling it. Speed values are given in kHz. You need to be root to write to files in the /sys filesystem.

Your max speed is at /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq.

# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
700000

Your min speed is at /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq.

# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
500000
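Since the values are in kHz, a small helper can print the range in the more familiar MHz. This is just a convenience sketch; cpufreq_range_mhz is a made-up name, and the optional argument (a cpufreq directory) exists only so the function can be pointed at another CPU.

```shell
# Print the hardware frequency range of one CPU in MHz.
# The sysfs files report kHz, so divide by 1000.
# cpufreq_range_mhz is a hypothetical helper name.
cpufreq_range_mhz() {
    cpufreq_dir=${1:-/sys/devices/system/cpu/cpu0/cpufreq}
    min_khz=$(cat "$cpufreq_dir/cpuinfo_min_freq") || return 1
    max_khz=$(cat "$cpufreq_dir/cpuinfo_max_freq") || return 1
    echo "$((min_khz / 1000))-$((max_khz / 1000)) MHz"
}
```

For the values shown above (500000 and 700000) this prints "500-700 MHz".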

If you are using the userspace governor, you can write to /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed to change the current speed.

# echo 700000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
# cat /proc/cpuinfo
cpu MHz  : 697.252
# echo 900000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
# cat /proc/cpuinfo
cpu MHz  : 976.152
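Writing to scaling_setspeed only makes sense while the userspace governor is active, so a wrapper can check that first and reject values outside the range the hardware reports. The following is a sketch under those assumptions, not a required procedure; set_cpu_speed_khz is a hypothetical name.

```shell
# Set the CPU speed through scaling_setspeed, but only if the userspace
# governor is active and the value lies within the reported range.
# Usage: set_cpu_speed_khz <kHz> [cpufreq directory]   (run as root)
# set_cpu_speed_khz is a hypothetical helper name.
set_cpu_speed_khz() {
    want_khz=$1
    cpufreq_dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    gov=$(cat "$cpufreq_dir/scaling_governor") || return 1
    if [ "$gov" != userspace ]; then
        echo "governor is '$gov', not userspace" >&2
        return 1
    fi
    min_khz=$(cat "$cpufreq_dir/cpuinfo_min_freq") || return 1
    max_khz=$(cat "$cpufreq_dir/cpuinfo_max_freq") || return 1
    if [ "$want_khz" -lt "$min_khz" ] || [ "$want_khz" -gt "$max_khz" ]; then
        echo "$want_khz kHz is outside $min_khz-$max_khz kHz" >&2
        return 1
    fi
    echo "$want_khz" > "$cpufreq_dir/scaling_setspeed"
}
```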

Using Frequency Scaling Governors

You can compile the scaling governors into your kernel or compile them as modules. You'll find the governors with 'make menuconfig' here:

Power management options (ACPI, APM) → CPU Frequency scaling →

After booting the new kernel you can get a list of available governors with (as root):

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
conservative ondemand powersave userspace performance

Note: If the governors are compiled as modules, load them first:

# modprobe -a cpufreq_performance cpufreq_ondemand cpufreq_conservative cpufreq_powersave cpufreq_userspace

A short overview of the available governors:

ondemand
A dynamic cpufreq policy governor; it changes frequency based on the processor load. It may not work on older laptops without Enhanced SpeedStep due to latency reasons. For recent enough Intel CPUs, however, it is the one recommended for power efficiency (over userspace, and even over "powersave") by Intel kernel developer Arjan van de Ven (see [1], [2], [3]).
conservative
New since 2.6.12. Similar to ondemand but has a much slower 'slew rate', remaining at high frequency for many seconds after recent processor demand. Good for battery powered environments and AMD64. Again, this governor may not work on older ThinkPads like the T21.
powersave
Sets the frequency to the lowest available, to save power. Not the best choice for battery lifetime on Intel CPUs.
userspace
Allows you to set the frequency manually, unlike the others. Some frequency scaling daemons require this governor to operate correctly. This is typically the recommended option with older processors like A30p's pIIIm-1200.
performance
Always sets the frequency to the highest available.

Now we set our governor. First, check which governor is currently active:

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
userspace

Set a new governor and check that it has changed:

# echo conservative > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
conservative

Congrats! Your governor is active.
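On machines with more than one CPU or core, each one has its own cpufreq directory under /sys/devices/system/cpu/. The sketch below applies a governor to all of them and first checks the name against scaling_available_governors. The function name set_governor is made up for this example, and whether your system needs every CPU set separately is an assumption.

```shell
# Set the same governor on every CPU that has a cpufreq directory,
# after checking that the kernel actually offers it.
# Usage: set_governor <name> [cpu root directory]   (run as root)
# set_governor is a hypothetical helper name.
set_governor() {
    new_gov=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    [ -n "$new_gov" ] || return 1
    for gov_dir in "$cpu_root"/cpu[0-9]*/cpufreq; do
        [ -d "$gov_dir" ] || continue
        # The available governors are listed space-separated on one line;
        # split them into lines and look for an exact match.
        if ! tr ' ' '\n' < "$gov_dir/scaling_available_governors" |
                grep -qxF -- "$new_gov"; then
            echo "$new_gov is not available in $gov_dir" >&2
            return 1
        fi
        echo "$new_gov" > "$gov_dir/scaling_governor" || return 1
    done
}
```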

To apply the governor on every boot, you can set it in your rc.local.

Using Frequency Scaling Daemons

Frequency scaling daemons adapt the frequency policy to different situations. A typical configuration would be to use the ondemand governor when running off batteries and performance otherwise, or to combine powersave with conservative on laptops with heat problems. More sophisticated setups adapt to battery level, CPU temperature or even running programs. Some daemons are able to control other power management features like hard disks or graphics cards.
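The simplest policy mentioned above (ondemand on battery, performance on AC) can be sketched in a few lines of shell. This is an illustration of the idea, not how any particular daemon works. Both function names are hypothetical, and the AC adapter path is an assumption that varies between machines and kernel versions.

```shell
# Pick a governor from the AC adapter state: "1" (on AC) -> performance,
# anything else -> ondemand. governor_for_ac is a hypothetical name.
governor_for_ac() {
    case $1 in
        1) echo performance ;;
        *) echo ondemand ;;
    esac
}

# One polling step of a toy daemon: read the AC state and write the
# matching governor for cpu0 if it differs from the current one.
# Both default paths are assumptions; adjust them for your machine.
apply_power_policy() {
    ac_file=${1:-/sys/class/power_supply/AC/online}
    gov_file=${2:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor}
    # If the AC state cannot be read, behave as if on battery.
    ac_state=$(cat "$ac_file" 2>/dev/null) || ac_state=0
    wanted=$(governor_for_ac "$ac_state")
    [ "$(cat "$gov_file")" = "$wanted" ] || echo "$wanted" > "$gov_file"
}
```

A real daemon would call something like apply_power_policy in a loop or in response to ACPI events; that part is left out here.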

NOTE!
Daemons are optional. If you don't plan to change policies depending on the situation, you don't need one and you can stick to the ondemand or conservative frequency scaling governors, available since kernel 2.6.10 and 2.6.12 respectively (see above). They require less configuration and have generally been found to adapt well to the situation at hand.

Some daemons use the kernel governors (see above), others implement the functionality on their own. In the latter case you have to enable the userspace governor. If it is built as a module, load it as cpufreq-userspace.

There are plenty of userspace frequency scaling daemons available:

Debian notes

Instead of compiling your own kernel, you can use the Debian "stock" kernel. In unstable (sid), use the 2.6.12 kernel image with an /etc/modules file that includes:

battery
ac
thermal
processor
acpi-cpufreq
cpufreq-userspace

Then install the powernowd package and you should be set up.

Debian has no rc.local, so read this and this.

A better alternative for Debian than modifying boot scripts is to install the sysfsutils package. Then edit /etc/sysfs.conf (as root), where you can set values for sysfs entries that you want to be modified automatically on boot.
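Entries in /etc/sysfs.conf take the form of a path relative to /sys, followed by "= value". As a sketch, the helper below prints such lines for the scaling governor of several CPUs, which you can then paste into the file. The function name sysfs_conf_governor_lines is made up for this example, and the CPU count is something you have to supply yourself.

```shell
# Print /etc/sysfs.conf lines that set the same governor on CPUs
# 0 .. ncpus-1. Nothing is written; the output is meant to be reviewed
# and copied into /etc/sysfs.conf by hand.
# sysfs_conf_governor_lines is a hypothetical helper name.
sysfs_conf_governor_lines() {
    gov=$1
    ncpus=${2:-1}
    i=0
    while [ "$i" -lt "$ncpus" ]; do
        echo "devices/system/cpu/cpu$i/cpufreq/scaling_governor = $gov"
        i=$((i + 1))
    done
}
```

For example, "sysfs_conf_governor_lines ondemand 2" prints one line for cpu0 and one for cpu1.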

Troubleshooting

  • If you have a Coppermine-piix-smi based ThinkPad, such as those in the A2x, X2x and T2x series, you need to enable the speedstep-smi driver in the kernel and load it if it's built as a module. You might want to look at this page.
  • If you have a P4-class Celeron based ThinkPad like the R40e, you might want to look at this page.
  • You may need to set your BIOS to "maximum performance" if you are using Linux to set the CPU speed. This is necessary to prevent odd behaviour (cpufreq 'freezing' at certain frequencies) with the T4x series.

Finetuning voltages and available frequencies

See Pentium M undervolting and underclocking.

A note about CPU throttling

Throttling the CPU through ACPI "T" states is generally useless for power consumption reduction nowadays. It is an artifact of the past, when there was no clock frequency scaling and ACPI "C" states were mostly not implemented or didn't exist.

Throttling does not decrease clock frequency at all, and it can even increase power consumption in a modern CPU capable of ACPI "C" states, as it can interfere with the CPU reaching the higher C states (such as C2).

On a T43, setting a CPU to an ACPI throttle state other than T0 (no throttling) can cause it to draw more than 100mW of extra power, as it will reach C2 less often.

If your BIOS offers "cpu power management" and "pci bus power management" options that are disabled by default (as is the case on the X40 with the 2.08 BIOS), you should turn them on (or choose "automatic"). Despite what the BIOS online documentation says ("rarely needed"), this is quite useful, since it makes the deepest ACPI C-states (C3 and C4) available. On a kernel with dynticks (2.6.21 and later), this should save about 2W or more.

External links

  • The Ondemand Governor, Intel Open Source Technology Center (Venkatesh Pallipadi, Alexey Starikovskiy, Len Brown), presentation at Ottawa Linux Symposium, July 19 2006