Script for Undervolt Stress Testing
This script helps in calibrating voltages when undervolting a Pentium M processor.
People differ in how far they are willing to undervolt their systems. Some are eager to run their Pentium Ms at 700mV and abandon safety: they lower the voltages as far as they can without crashing the system, and perhaps raise them again by a small margin above the failure point. However, this provides only weak assurance, because some failures do not surface immediately. In the worst case, the system fails months later, and the blame is assigned to, say, a kernel upgrade or patch when the real cause was an intermittent lack of power.
Many would like to guard themselves against such a failure and consequently run a prime number stress test such as MPrime in "torture test" mode while they ramp down their voltages to find a comfortable margin from the failure point. However, following recommendations from a thread on the Linux-Thinkpad mailing list, perhaps even more can be done. This script not only runs MPrime, but also toggles many of the laptop's power-demanding features on and off throughout the test. The idea is to expose more rapidly the corner cases in which the system might act up.
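The toggling approach can be illustrated with a minimal sketch of the registry pattern that the script's header describes: each "toggle_" function is added to $AGGRESSIVE_TOGGLES only if the system appears to support the feature, options named on the command line are dropped, and toggle_aggression calls whatever remains. Only AGGRESSIVE_TOGGLES, toggle_aggression and the "toggle_" prefix come from the script; the helpers register_toggle and disable_toggles, the three example toggles, and the probe paths are assumptions made for illustration, not the script's actual code.

```shell
#!/bin/bash
# Illustrative sketch only; see the assumptions named above.

AGGRESSIVE_TOGGLES=""

# Hypothetical helper: register toggle_<name> if its probe path exists.
register_toggle() {
    local name=$1 probe=$2
    if [ -e "$probe" ]; then
        AGGRESSIVE_TOGGLES="$AGGRESSIVE_TOGGLES toggle_$name"
    fi
}

# Stand-in toggles. Real ones would switch hardware on or off;
# these only report what they were asked to do ($1 is "on" or "off").
toggle_light()     { echo "light $1"; }
toggle_cdrom()     { echo "cdrom $1"; }
toggle_bluetooth() { echo "bluetooth $1"; }

# /dev/null stands in for a feature that is present,
# /nonexistent/probe for one that is missing.
register_toggle light     /dev/null
register_toggle cdrom     /dev/null
register_toggle bluetooth /nonexistent/probe

# Hypothetical helper: remove every toggle named in the arguments,
# mirroring how script arguments disable "toggle_$OPTION".
disable_toggles() {
    local kept="" t opt skip
    for t in $AGGRESSIVE_TOGGLES; do
        skip=no
        for opt in "$@"; do
            if [ "$t" = "toggle_$opt" ]; then
                skip=yes
            fi
        done
        if [ "$skip" = no ]; then
            kept="$kept $t"
        fi
    done
    AGGRESSIVE_TOGGLES=$kept
}

# Call every registered toggle with the requested state.
toggle_aggression() {
    local t
    for t in $AGGRESSIVE_TOGGLES; do
        "$t" "$1"
    done
}
```

With this sketch, `toggle_aggression on` calls toggle_light and toggle_cdrom but not toggle_bluetooth, whose probe path is missing. After `disable_toggles cdrom`, only toggle_light is called. Extending the stress set then amounts to defining one more "toggle_" function and registering it.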
#!/bin/bash
#
# DESCRIPTION AND MOTIVATION
# --------------------------
# Designed for undervolted laptops with frequency stepping, this script
# swings the system between aggressive and low power use, and also swings
# among the available frequencies.
#
# The idea is that such extreme use of the system will likely explore corner
# cases where the system might fail.  Hopefully, such testing can curtail the
# time necessary to establish confidence in undervolted systems.
#
# In the background the MPrime program, a prime number search engine, runs in a
# "torture test" mode, in which it tests computations against known results and
# errs out if there's a discrepancy.  Unless it errs out, this script runs
# forever.
#
# IMPLEMENTATION
# --------------
# The design of this script attempts to address laptops beyond the Thinkpad T42
# for which it was designed.  Many of the function definitions are prepended
# with conditionals that check the system for functionality and either bail out
# or disable features accordingly.
#
# In particular, the nature of what "aggressive" constitutes is defined by a
# number of "toggle_" functions.  The pre-pended conditional to these functions
# appends the function name to $AGGRESSIVE_TOGGLES if the system appears to
# support the feature.  The toggle_aggression function then calls all the
# functions in $AGGRESSIVE_TOGGLES.  Look at these "toggle_" functions for
# examples of how to extend this script for other possible stressing.
#
# EXTERNAL PROGRAMS EMPLOYED
# --------------------------
# Test system integrity (required): MPrime - http://www.mersenne.org/prime.htm
# Download files:                   curl - http://curl.haxx.se
# Read random sectors from CD:      spew (for gorge) - http://spew.berlios.de
# Keep hard disk active:            stress - http://weather.ou.edu/~apw/projects/stress/
#
# EXECUTION
# ---------
# Read this script including all the warnings below, and then make sure all the
# variables in the "Script Globals" section are appropriately set.
#
# This script uses the mprime binary with the "-t" switch for the MPrime
# "torture test."  This test by default uses all the memory available on the
# system.  However, if you run this script for many hours, your kernel may run
# out of memory, and kill mprime and this script.  To spare yourself this
# problem, use the "NightMemory=" and "DayMemory=" parameters in MPrime's
# local.ini file, a file typically in the same directory as the mprime
# executable (read the MPrime documentation for specifics).  The torture test
# by default uses the greater of these two settings, so just set them both a
# reasonable margin away from the total amount of memory available on your
# system.  On a system with 512MB of RAM, I set these parameters both to 448,
# and had enough memory left over to run my normal set of background processes.
#
# The arguments of this script are "aggression" toggles to disable.  Any
# function below that begins with "toggle_$OPTION" can be disabled by using
# $OPTION as one of the arguments of this script.  Otherwise, all the stressing
# that a system supports is enabled by default.
#
# Because of Warning 3 below, I recommend you run this script as
#
#     stress_test 2>