Supermicro X9 PID fan control script

a guest
Mar 31st, 2019
#!/usr/local/bin/perl

# This script is based on the hybrid fan controller script created by @Stux, and posted at:
# https://forums.freenas.org/index.php?threads/script-hybrid-cpu-hd-fan-zone-controller.46159/

# The significant changes from @Stux's script are:
# 1. Replace HD fan control of several discrete duty cycles as a function of hottest HD temperature with a PID controller
#    which controls duty cycle in 1% steps as a function of average HD temperature.  As a protection, if any HD temperature
#    exceeds a specified value, the HD fans are commanded to 100% duty cycle.  This covers cases where one HD may be running
#    hot, even if the average HD temperature is acceptable, or the PID loop control has gone awry.
# 2. Add optional setting to command CPU fans to 100% duty cycle if needed to assist with HD cooling, to cover scenarios
#    where the CPU fan zone also controls chassis exit fans.
# 3. Add optional log of HD temperatures, PID loop values and commanded fan duty cycles.  The log may optionally contain
#    a record of each HD temperature, or only the coolest and warmest HD temperatures.

# This script can be downloaded from:
# https://forums.freenas.org/index.php?threads/pid-fan-controller-perl-script.50908/

###############################################################################################
# This script is designed to control both the CPU and HD fans in a Supermicro X10 based system according to both
# the CPU and HD temperatures in order to minimize noise while providing sufficient cooling to deal with scrubs
# and CPU torture tests. It may work on X9 based systems, but this has not been tested.

# It relies on you having two fan zones.

# To use this correctly, you should connect all your PWM HD fans, by splitters if necessary, to the FANA header.
# CPU, case and exhaust fans should then be connected to the numbered (i.e. CPU based) headers.  This script will then control the
# HD fans in response to the HD temp, and the other fans in response to CPU temperature. When CPU temperature is high the HD fans
# will be used to provide additional cooling, if you specify cpu/hd shared cooling.

# If the fans should be high, and they are stuck low, or vice-versa, the BMC will be rebooted, thus it is critical to set the
# cpu/hd_max_fan_speed variables correctly.

# NOTE: It is highly likely the "get_hd_temp" function will not work as-is with your HDs. Until a better solution is provided
# you will need to modify this function to properly acquire the temperature. Setting debug=2 will help.

# Tested with a SuperMicro X10SRH-cF, Xeon E5-1650v4, Noctua 120, 90 and 80mm fans in a Norco RPC-4224 4U chassis, with 16 x 4 TB WD Red drives.

# More information on CPU/Peripheral Zone can be found in this post:
# https://forums.freenas.org/index.php?threads/thermal-and-accoustical-design-validation.28364/

# stux

###############################################################################################
# VERSION HISTORY
#####################
# 2016-09-19 Initial Version
# 2016-09-19 Added cpu_hd_override_temp, to prevent HD fans cycling when CPU fans are sufficient for cooling CPU
# 2016-09-26 hd_list is now refreshed before checking HD temps so that we start/stop monitoring devices that
#            have been hot inserted/removed.
#            "Drives are warm, going to 75%" log message was missing an unless clause causing it to print
#            every time
# 2016-10-07 Replaced get_cpu_temp() function with get_cpu_temp_sysctl() which queries the kernel, instead of
#            IPMI. This is faster, more accurate and more compatible, hopefully allowing this to work on X9
#            systems. The original function is still present and is now called get_cpu_temp_ipmi().
#            Because this is a much faster method of reading the temps, and because it's actually the max core
#            temp, I found that the previous cpu_hd_override_temp of 60 was too sensitive and caused the override
#            too often. I've bumped it up to 62, which on my system seems good. This means that if a core gets to
#            62C the HD fans will kick in, and this will generally bring temps back down to around 60C... depending
#            on the actual load. Your results will vary, and for best results you should tune the controller with
#            mprime testing at various thread levels. Updated the cpu thresholds to 35/45/55 because of the improved
#            responsiveness of the get_cpu_temp function
#
# The following changes are by Kevin Horton
# 2017-01-14 Reworked get_hd_list() to exclude SSDs
#            Added function to calculate maximum and average HD temperatures.
#            Replaced original HD fan control scheme with a PID controller, controlling the average HD temp.
#            Added safety override if any HD reaches a specified max temperature.  If so, the PID loop is overridden,
#            and HD fans are set to 100%
#            Retain float value of fan duty cycle between loop cycles, so that small duty cycle corrections
#            accumulate and eventually push the duty cycle to the next integer value.
# 2017-01-18 Added log file
# 2017-01-21 Refactored code to bump up CPU fan to help cool HD.  Drop the variable CPU duty cycle, and just set to High.
#            Added log file option without temps for every HD.
# 2017-01-29 Add header to log file every X hours

# TO DO
#           Add option for no CPU fan control.
###############################################################################################
## CONFIGURATION
################

## DEBUG LEVEL
## 0 means no debugging. 1,2,3,4 provide more verbosity
## You should run this script in at least level 1 to verify it's working correctly on your system
$debug = 0;

## LOG
$log = '/var/log/ipmi_fan_control/IPMIFanControl.log'; # Ubuntu mod - I personally added my home directory to make it easy to spot check temps, so /home/<username>/IPMIFanControl.log instead
$log_temp_summary_only = 1;      # 1 if not logging individual HD temperatures.  0 if logging temp of each HD
$log_header_hourly_interval = 2; # number of hours between log headers.  Valid options are 1, 2, 3, 4, 6 & 12.
                                 # log headers will always appear at the start of a log, at midnight and any
                                 # time the list of HDs changes (if individual HD temperatures are logged)

## CPU THRESHOLD TEMPS
## A modern CPU can heat up from 35C to 60C in a second or two. The fan duty cycle is set based on this
$high_cpu_temp = 65;       # will go HIGH when we hit this
$med_cpu_temp = 55;        # will go MEDIUM when we hit this, or drop below it again
$low_cpu_temp = 45;        # will go LOW when we fall below this again

## HD THRESHOLD TEMPS
## HDs change temperature slowly.
## This is the temperature that we regard as being uncomfortable. The higher this is the
## more silent your system.
## Note, it is possible for your HDs to go above this... but if your cooling is good, they shouldn't.
$hd_ave_target = 41;    # PID control loop will target this average temperature
$hd_max_allowed_temp = 45; # celsius. PID control aborts and fans set to 100% duty cycle when a HD hits this temp.
                           # This ensures that no matter how poorly chosen the PID gains are, or how much of a spread
                           # there is between the average HD temperature and the maximum HD temperature, the HD fans
                           # will be set to 100% if any drive reaches this temperature.

## CPU TEMP TO OVERRIDE HD FANS
## When the CPU climbs above this temperature, the HD fans will be overridden.
## This prevents the HD fans from spinning up when the CPU fans are capable of providing
## sufficient cooling.
$cpu_hd_override_temp = 70;

## CPU/HD SHARED COOLING
## If your HD fans contribute to the cooling of your CPU you should set this value.
## It will mean when your CPU heats up your HD fans will be turned up to help cool the
## case/cpu. This would only not apply if your HDs and fans are in a separate thermal compartment.
$hd_fans_cool_cpu = 1;      # 1 if the hd fans should spin up to cool the cpu, 0 otherwise

## HD FAN DUTY CYCLE TO OVERRIDE CPU FANS
$cpu_fans_cool_hd            = 1;  # 1 if the CPU fans should spin up to cool the HDs, when needed.  0 otherwise.  This may be useful if
                                   #   the CPU fan zone also contains chassis exit fans, as an increase in chassis exit fan speed may
                                   #   increase the HD cooling air flow.
$hd_cpu_override_duty_cycle = 242;  # when the HD duty cycle equals or exceeds this value, the CPU fans may be overridden to help cool HDs

## CPU TEMP CONTROL
$cpu_temp_control = 1;  # 1 if the script will control a CPU fan to control CPU temperatures.  0 if the script only controls HD fans.

## PID CONTROL GAINS
## If you were using the spinpid.sh PID control script published by @Glorious1 at the link below, you will need to adjust the value of $Kp
## that you were using, as that script defined Kp in terms of the gain per one cycle around the loop, but this script defines it in terms
## of the gain per minute.  Divide the Kp value from the spinpid.sh script by the time in minutes for checking hard drive temperatures.
## For example, if you used a gain of Kp = 8, and a T = 3 (3 minute interval), the new value is $Kp = 8/3.
## Kd values from the spinpid.sh script can be used directly here.
## https://forums.freenas.org/index.php?threads/script-to-control-fan-speed-in-response-to-hard-drive-temperatures.41294/page-4#post-285668
$Kp = 10/3; # 3.333
$Ki = 0;
$Kd = 120;


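## ILLUSTRATIVE SKETCH: GAIN CONVERSION (not part of the original script, and never called by it)
## The Kp conversion described above can be sketched as a one-line helper. This is only a sketch:
## the name example_kp_per_minute and its two parameters are hypothetical, introduced here for
## illustration, and the arithmetic simply restates the rule "divide the spinpid.sh Kp by the
## polling interval in minutes".
sub example_kp_per_minute
{
    my ( $kp_per_cycle, $interval_minutes ) = @_;
    die "interval must be positive\n" unless $interval_minutes > 0;
    return $kp_per_cycle / $interval_minutes;   # gain per minute
}
## Usage: example_kp_per_minute( 8, 3 ) returns 8/3, matching the worked example above.
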
#######################
## FAN CONFIGURATION
####################

## FAN SPEEDS
## You need to determine the actual max fan speeds that are achieved by the fans
## connected to the cpu_fan_header and the hd_fan_header.
## These values are used to verify high/low fan speeds and trigger a BMC reset if necessary.
$cpu_max_fan_speed    = 2500;
$hd_max_fan_speed     = 2500;


## CPU FAN DUTY LEVELS
## These levels are used to control the CPU fans
## The values appear to be on a 0-255 scale (255 is full speed); the percentages are the equivalent duty cycles.
$fan_duty_high    = 255;        # 100%, full speed
$fan_duty_med     = 153;        # 60%
$fan_duty_low     = 102;        # 40%

## HD FAN DUTY LEVELS
## These levels are used to control the HD fans
$hd_fan_duty_high      = 255;    # 100%, full speed
$hd_fan_duty_med_high  = 204;    # 80%
$hd_fan_duty_med_low   = 127;    # 50%
$hd_fan_duty_low       = 102;    # 40% (102/255); some 120mm fans stall below 30%.
$hd_fan_duty_start     = 165;    # 65%, HD fan duty cycle when script starts.


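## ILLUSTRATIVE SKETCH: PERCENT TO DUTY VALUE (not part of the original script, and never called by it)
## The duty levels configured above are consistent with percentages mapped onto a 0-255 scale and
## truncated to an integer (60% -> 153, 80% -> 204, 50% -> 127, 65% -> 165; 102 corresponds to 40%).
## That mapping is an assumption inferred from those numbers, not something the script states, and
## the helper name example_percent_to_duty is hypothetical.
sub example_percent_to_duty
{
    my ( $percent ) = @_;
    $percent = 0   if $percent < 0;      # clamp to the valid range
    $percent = 100 if $percent > 100;
    return int( $percent * 255 / 100 );  # truncate, as the configured values suggest
}
## Usage: example_percent_to_duty( 60 ) gives 153 and example_percent_to_duty( 80 ) gives 204.
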
## FAN ZONES
# Your CPU/case fans should probably be connected to the main fan sockets, which are in fan zone zero
# Your HD fans should be connected to FANA which is in Zone 1
# You could switch the CPU/HD fans around, as long as you change the zones and fan header configurations.
#
# 0 = FAN1..5
# 1 = FANA
#
# NOTE: the values set below (17 and 16) do not follow the 0/1 numbering described above, and the
# header assignments further down put the CPU fans on FANA and the HD fans on FAN1. Verify the zone
# numbers and headers against your own board before relying on them.
$cpu_fan_zone = 17;
$hd_fan_zone  = 16;


## FAN HEADERS
## These are the fan headers which are used to verify the fan zone is high.
## cpu_fan_header should be in the cpu_fan_zone
## hd_fan_header should be in the hd_fan_zone
$cpu_fan_header = "FANA";                 # used for printing to standard output for debugging
$hd_fan_header  = "FAN1";                 # used for printing to standard output for debugging
@hd_fan_list = ("FANA", "FANB", "FANC");  # used for logging to file

################
## MISC
#######

## IPMITOOL PATH
## The script needs to know where ipmitool is
$ipmitool = "/usr/bin/ipmitool";

## HD POLLING INTERVAL
## The controller will only poll the hard drives periodically. Since hard drives change temperature slowly
## this is a good thing. 180 seconds is a good value.
$hd_polling_interval = 180;    # seconds

## FAN SPEED CHANGE DELAY TIME
## It takes the fans a few seconds to change speeds, so we allow a grace period before verifying. If we fail the verify
## we'll reset the BMC
$fan_speed_change_delay = 10; # seconds

## BMC REBOOT TIME
## It takes the BMC a number of seconds to reset and start providing sensible output. We'll only
## reset the BMC if it's still providing rubbish after this time.
$bmc_reboot_grace_time = 120; # seconds

## BMC RETRIES BEFORE REBOOTING
## We verify high/low of fans, and if they're not where they should be we reboot the BMC after so many failures
$bmc_fail_threshold    = 1;     # will retry n times before rebooting

# edit nothing below this line
########################################################################################################################


# GLOBALS
@hd_list = ("sda", "sdb", "sdc", "sdd", "sde", "sdf");

# massage fan speeds
$cpu_max_fan_speed *= 0.99;
$hd_max_fan_speed *= 0.99;

$hd_duty = $hd_fan_duty_start;

#fan/bmc verification globals/timers
$last_fan_level_change_time = 0;        # the time when we changed a fan level last
$fan_unreadable_time        = 0;        # the time when a fan read failure started, 0 if there is none.
$bmc_fail_count             = 0;        # how many times the fans failed verification in the last period.

#this is the last cpu temp that was read
$last_cpu_temp = 0;

use POSIX qw(strftime);
use Time::Local;

# start the controller
main();

################################################ MAIN

sub main
{
    open LOG, ">>", $log or die $!;

    # Print Log Header
    print_log_header(@hd_list);
    # current time
    ($sec,$min,$hour,$day,$month,$year,$wday,$yday,$isdst) = localtime(time);
    $next_log_hour = ( int( $hour/$log_header_hourly_interval ) + 1 ) * $log_header_hourly_interval;

    if ( $next_log_hour >= 24 )
    {
        # next log time is after midnight.  Roll back to previous log time, calculate Unix epoch seconds, and add required seconds to get next log time
        $next_log_hour -= $log_header_hourly_interval;
        $next_log_time = timelocal(0,0,$next_log_hour,$day,$month,$year) + 3600 * $log_header_hourly_interval;
    }
    else
    {
        # next log time in seconds past Unix epoch
        $next_log_time = timelocal(0,0,$next_log_hour,$day,$month,$year);
    }

    # need to go to Full mode so we have unfettered control of Fans
    set_fan_mode("full");

    my $cpu_fan_level = "";
    my $old_cpu_fan_level = "";
    my $override_hd_fan_level = 0;
    my $last_hd_check_time = 0;
    $temp_error = 0;
    my $integral = 0;
    $cpu_fan_override = 0;
    $hd_fan_duty = $hd_fan_duty_start;

    ($hd_min_temp, $hd_max_temp, $hd_ave_temp_old, @hd_temps) = get_hd_temps();

    while()
    {
        if ($cpu_temp_control)
        {
            $old_cpu_fan_level = $cpu_fan_level;
            $cpu_fan_level = control_cpu_fan( $old_cpu_fan_level );

            if( $old_cpu_fan_level ne $cpu_fan_level )
            {
                $last_fan_level_change_time = time;
            }

            if( $cpu_fan_level eq "high" )
            {

                if( $hd_fans_cool_cpu && !$override_hd_fan_level && ($last_cpu_temp >= $cpu_hd_override_temp || $last_cpu_temp == 0) )
                {
                    #override hd fan zone level, once we override we won't back off until the cpu drops to below "high"
                    $override_hd_fan_level = 1;
                    dprint( 0, "CPU Temp: $last_cpu_temp >= $cpu_hd_override_temp, Overriding HD fan zone to $hd_fan_duty_high%\n" );
                    set_fan_zone_duty_cycle( $hd_fan_zone, $hd_fan_duty_high );

                    $last_fan_level_change_time = time;
                }
            }
            elsif( $override_hd_fan_level )
            {
                #restore hd fan zone level;
                $override_hd_fan_level = 0;
                dprint( 0, "Restoring HD fan zone to $hd_fan_duty%\n" );
                set_fan_zone_duty_cycle( $hd_fan_zone, $hd_fan_duty );

                $last_fan_level_change_time = time;
            }
        }

        # periodically determine hd fan zone level

        my $check_time = time;
        if( $check_time - $last_hd_check_time >= $hd_polling_interval )
        {
            $last_hd_check_time = $check_time;
            @last_hd_list = @hd_list;

            ($hd_min_temp, $hd_max_temp, $hd_ave_temp, @hd_temps) = get_hd_temps();
            $hd_fan_duty_old = $hd_fan_duty;
            $hd_fan_duty = calculate_hd_fan_duty_cycle_PID( $hd_max_temp, $hd_ave_temp, $hd_fan_duty );

            if( !$override_hd_fan_level )
            {
                set_fan_zone_duty_cycle( $hd_fan_zone, $hd_fan_duty );

                $last_fan_level_change_time = time; # this resets every time, but it shouldn't matter since hd_polling_interval is large.
            }
            # print to log
            if (@hd_list != @last_hd_list && $log_temp_summary_only == 0)
            {
                # print new disk ID header if it has changed (e.g. hot swap insert or remove)
                @hd_list = print_log_header(@hd_list);
            }
            elsif ( $check_time >= $next_log_time )
            {
                # time to print a new log header
                @hd_list = print_log_header(@hd_list);
                $next_log_time += 3600 * $log_header_hourly_interval;
            }

            my $timestring = build_time_string();
            # ($hd_min_temp, $hd_max_temp, $hd_ave_temp, @hd_temps) = get_hd_temps();

            print LOG "$timestring";

            if ($log_temp_summary_only)
            {
                printf(LOG "    %2i", 0+@hd_list); # number of HDs, so it can be seen if a hot swap addition or removal was detected
                printf(LOG "   %2i", $hd_min_temp);
            }
            else
            {
                foreach my $item (@hd_temps)
                {
                    printf(LOG "%5s", $item);
                }
            }
            printf(LOG "  ^%2i", $hd_max_temp);
            printf(LOG "%7.2f", $hd_ave_temp);
            printf(LOG "%6.2f", $hd_ave_temp - $hd_ave_target);

            $hd_fan_mode = get_fan_mode();
            printf(LOG "%6s", $hd_fan_mode);
            $ave_fan_speed = get_fan_ave_speed(@hd_fan_list);
            printf(LOG "%6s", $ave_fan_speed);
            printf(LOG "%4i/%-3i", $hd_fan_duty_old, $hd_fan_duty);

            $cput = get_cpu_temp_sysctl();
            # a literal percent sign in a printf format must be written as %%
            printf(LOG "%4i %6.2f %6.2f  %6.2f  %6.2f%%\n", $cput, $P, $I, $D, $hd_duty);
        }

        # verify_fan_speed_levels function is fairly complicated
        if ($cpu_temp_control)
        {
            verify_fan_speed_levels(  $cpu_fan_level, $override_hd_fan_level ? $hd_fan_duty_high : $hd_fan_duty );
        }
        else
        {
            verify_fan_speed_levels2( $hd_fan_duty );
        }

#         if ($cpu_temp_control)
#         {
#             # CPU temps can go from cool to hot in 2 seconds! so we only ever sleep for 1 second.
#             sleep 1;
#         }
#         else
#         {
#             sleep $hd_polling_interval - 1;
#         }
        # CPU temps can go from cool to hot in 2 seconds! so we only ever sleep for 1 second.
        sleep 1;

    } # inf loop
}


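## ILLUSTRATIVE SKETCH: NEXT LOG HEADER HOUR (not part of the original script, and never called by it)
## The log header timing at the start of main() rounds the current hour up to the next multiple of
## $log_header_hourly_interval. The hypothetical helper below (its name and parameters are assumptions
## made for illustration) isolates that arithmetic. A result of 24 means the next header falls at
## midnight, which is the case main() handles separately before calling timelocal().
sub example_next_header_hour
{
    my ( $hour, $interval ) = @_;
    return ( int( $hour / $interval ) + 1 ) * $interval;
}
## Usage: example_next_header_hour( 13, 2 ) gives 14; example_next_header_hour( 23, 2 ) gives 24 (midnight).
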
################################################# SUBS

sub get_hd_temps
# return minimum, maximum, average HD temperatures and array of individual temps
{
    my $max_temp = 0;
    my $min_temp = 1000;
    my $temp_sum = 0;
    my $HD_count = 0;
    my @temp_list = ();

    foreach my $item (@hd_list)
    {
        my $disk_dev = "/dev/disk/by-id/$item";
        my $command = "/usr/sbin/hddtemp -n $disk_dev";

        my $temp = `$command`;
        chomp $temp;

        if( $temp )
        {
            push(@temp_list, $temp);
            $temp_sum += $temp;
            $HD_count +=1;
            $max_temp = $temp if $temp > $max_temp;
            $min_temp = $temp if $temp < $min_temp;
        }
    }

    if( $HD_count == 0 )
    {
        # No drive reported a temperature.  Avoid dividing by zero, and report the maximum allowed
        # temperature so that the safety override runs the HD fans at 100%.
        dprint( 0, "No HD temperatures could be read\n" );
        return ($hd_max_allowed_temp, $hd_max_allowed_temp, $hd_max_allowed_temp);
    }

    my $ave_temp = $temp_sum / $HD_count;

    return ($min_temp, $max_temp, $ave_temp, @temp_list);
}

sub verify_fan_speed_levels
{
    my( $cpu_fan_level, $hd_fan_duty ) = @_;
    dprint( 4, "verify_fan_speed_levels: cpu_fan_level: $cpu_fan_level, hd_fan_duty: $hd_fan_duty\n");

    my $extra_delay_before_next_check = 0;

    my $temp_time = time - $last_fan_level_change_time;
    dprint( 4, "Time since last verify : $temp_time, last change: $last_fan_level_change_time, delay: $fan_speed_change_delay\n");
    if( $temp_time > $fan_speed_change_delay )
    {
        # we've waited for the speed change to take effect.

        my $cpu_fan_speed = get_fan_speed("CPU");
        if( $cpu_fan_speed < 0 )
        {
            dprint(1,"CPU Fan speed unavailable\n" );
            $fan_unreadable_time = time if $fan_unreadable_time == 0;
        }

        my $hd_fan_speed = get_fan_speed("HD");
        if( $hd_fan_speed < 0 )
        {
            dprint(1,"HD Fan speed unavailable\n" );
            $fan_unreadable_time = time if $fan_unreadable_time == 0;
        }

        if( $hd_fan_speed < 0 || $cpu_fan_speed < 0 )
        {
            # one of the fans couldn't be reliably read

            my $temp_time = time - $fan_unreadable_time;
            if( $temp_time > $bmc_reboot_grace_time )
            {
                #we've waited, and we still can't read fan speed.
                dprint(0, "Fan speeds are unreadable after $bmc_reboot_grace_time seconds, rebooting BMC\n");
                reset_bmc();
                $fan_unreadable_time = 0;
            }
            else
            {
                dprint(2, "Fan speeds are unreadable after $temp_time seconds, will try again\n");
            }
        }
        else
        {
            # we have now been able to read the fan speeds

            my $cpu_fan_is_wrong = 0;
            my $hd_fan_is_wrong = 0;

            #verify cpu fans
            if( $cpu_fan_level eq "high" && $cpu_fan_speed < $cpu_max_fan_speed )
            {
                dprint(0, "CPU fan speed should be high, but $cpu_fan_speed < $cpu_max_fan_speed.\n");
                $cpu_fan_is_wrong=1;
            }
            elsif( $cpu_fan_level eq "low" && $cpu_fan_speed > $cpu_max_fan_speed )
            {
                dprint(0, "CPU fan speed should be low, but $cpu_fan_speed > $cpu_max_fan_speed.\n");
                $cpu_fan_is_wrong=1;
            }

            #verify hd fans
            if( $hd_fan_duty >= $hd_fan_duty_high && $hd_fan_speed < $hd_max_fan_speed )
            {
                dprint(0, "HD fan speed should be high, but $hd_fan_speed < $hd_max_fan_speed.\n");
                $hd_fan_is_wrong=1;
            }
            elsif( $hd_fan_duty <= $hd_fan_duty_low && $hd_fan_speed > $hd_max_fan_speed )
            {
                dprint(0, "HD fan speed should be low, but $hd_fan_speed > $hd_max_fan_speed.\n");
                $hd_fan_is_wrong=1;
            }

            #verify both fans are good
            if( $cpu_fan_is_wrong || $hd_fan_is_wrong )
            {
                $bmc_fail_count++;

                dprint( 3, "bmc_fail_count:  $bmc_fail_count, bmc_fail_threshold: $bmc_fail_threshold\n");
                if( $bmc_fail_count <= $bmc_fail_threshold )
                {
                    #we'll try setting the fan speeds, and giving it another attempt
                    dprint(1, "Fan speeds are not where they should be, will try again.\n");

                    set_fan_mode("full");

                    set_fan_zone_level( $cpu_fan_zone, $cpu_fan_level );
                    set_fan_zone_duty_cycle( $hd_fan_zone, $hd_fan_duty );
                }
                else
                {
                    #time to reset the bmc
                    dprint(1, "Fan speeds are still not where they should be after $bmc_fail_count attempts, will reboot BMC.\n");
                    set_fan_mode("full");
                    reset_bmc();
                    $bmc_fail_count = 0;
                }
            }
            else
            {
                #everything is good. We'll sit back for another minute.

                dprint( 2, "Verified fan levels, CPU: $cpu_fan_speed, HD: $hd_fan_speed. All good.\n" );
                $bmc_fail_count = 0; # we succeeded

                $extra_delay_before_next_check = 60 - $fan_speed_change_delay; # let's give it a minute since it was good.
            }


            #reset our unreadable timer, since we read the fan speeds.
            $fan_unreadable_time = 0;

        }

        #reset our timer, so that we'll wait before checking again.
        $last_fan_level_change_time = time + $extra_delay_before_next_check; #another delay before checking please.
    }

    return;
}

  564. sub verify_fan_speed_levels2
  565. {
  566.     my( $hd_fan_duty ) = @_;
  567.     dprint( 4, "verify_fan_speed_level: hd_fan_duty: $hd_fan_duty\n");
  568.  
  569.     my $extra_delay_before_next_check = 0;
  570.    
  571.     my $temp_time = time - $last_fan_level_change_time;
  572.     dprint( 4, "Time since last verify : $temp_time, last change: $last_fan_level_change_time, delay: $fan_speed_change_delay\n");
  573.     if( $temp_time > $fan_speed_change_delay )
  574.     {
  575.         # we've waited for the speed change to take effect.
  576.        
  577.         my $hd_fan_speed = get_fan_speed("HD");
  578.         if( $hd_fan_speed < 0 )
  579.         {
  580.             dprint(1,"HD Fan speed unavailable\n" );
  581.             $fan_unreadable_time = time if $fan_unreadable_time == 0;
  582.         }
  583.        
  584.         if( $hd_fan_speed < 0 )
  585.         {
  586.             # one of the fans couldn't be reliably read
  587.  
  588.             my $temp_time = time - $fan_unreadable_time;
  589.             if( $temp_time > $bmc_reboot_grace_time )
  590.             {
  591.                 #we've waited, and we still can't read fan speed.
  592.                 dprint(0, "Fan speeds are unreadable after $bmc_reboot_grace_time seconds, rebooting BMC\n");
  593.                 reset_bmc();
  594.                 $fan_unreadable_time = 0;
  595.             }
  596.             else
  597.             {
  598.                 dprint(2, "Fan speeds are unreadable after $temp_time seconds, will try again\n");  
  599.             }      
  600.         }
  601.         else
  602.         {
  603.             # we have no been able to read the fan speeds
  604.  
  605.             my $hd_fan_is_wrong = 0;    
  606.            
  607.             #verify hd fans
  608.             if( $hd_fan_duty >= $hd_fan_duty_high && $hd_fan_speed < $hd_max_fan_speed )
  609.             {
  610.                 dprint(0, "HD fan speed should be high, but $hd_fan_speed < $hd_max_fan_speed.\n");
  611.                 $hd_fan_is_wrong=1;
  612.             }
  613.             elsif( $hd_fan_duty <= $hd_fan_duty_low && $hd_fan_speed > $hd_max_fan_speed )
  614.             {
  615.                 dprint(0, "HD fan speed should be low, but $hd_fan_speed > $hd_max_fan_speed.\n");
  616.                 $hd_fan_is_wrong=1;
  617.             }
  618.            
  619.             #verify HD fans are good
  620.             if( $hd_fan_is_wrong )
  621.             {
  622.                 $bmc_fail_count++;
  623.                
  624.                 dprint( 3, "bmc_fail_count:  $bmc_fail_count, bmc_fail_threshold: $bmc_fail_threshold\n");
  625.                 if( $bmc_fail_count <= $bmc_fail_threshold )
  626.                 {
  627.                     #we'll try setting the fan speeds, and giving it another attempt
  628.                     dprint(1, "Fan speeds are not where they should be, will try again.\n");
  629.  
  630.                     set_fan_mode("full");
  631.  
  632.                     set_fan_zone_duty_cycle( $hd_fan_zone, $hd_fan_duty );
  633.                 }
  634.                 else
  635.                 {
  636.                     #time to reset the bmc
  637.                     dprint(1, "Fan speeds are still not where they should be after $bmc_fail_count attempts, will reboot BMC.\n");
  638.                     set_fan_mode("full");
  639.                     reset_bmc();
  640.                     $bmc_fail_count = 0;
  641.                 }
  642.             }
  643.             else
  644.             {
  645.                 #everything is good. We'll sit back for another minute.
  646.  
  647.                 dprint( 2, "Verified fan levels, HD: $hd_fan_speed. All good.\n" );
  648.                 $bmc_fail_count = 0; # we succeeded
  649.  
  650.                 $extra_delay_before_next_check = 60 - $fan_speed_change_delay; # let's give it a minute since it was good.
  651.             }  
  652.  
  653.                
  654.             #reset our unreadable timer, since we read the fan speeds.
  655.             $fan_unreadable_time = 0;
  656.                                    
  657.         }
  658.            
  659.         #reset our timer, so that we'll wait before checking again.
  660.         $last_fan_level_change_time = time + $extra_delay_before_next_check; #another delay before checking please.
  661.     }
  662.    
  663.     return;
  664. }
  665.  
  666. # need to pass in last $cpu_fan
  667. sub control_cpu_fan
  668. {
  669.     my ($old_cpu_fan_level) = @_;
  670.  
  671. #    my $cpu_temp = get_cpu_temp_ipmi();    # no longer used, because sysctl is better, and more compatible.
  672.     my $cpu_temp = get_cpu_temp_sysctl();
  673.  
  674.     my $cpu_fan_level = decide_cpu_fan_level( $cpu_temp, $old_cpu_fan_level );
  675.  
  676.     if( $old_cpu_fan_level ne $cpu_fan_level )
  677.     {
  678.         dprint( 1, "CPU Fan changing... ($cpu_fan_level)\n");
  679.         set_fan_zone_level( $cpu_fan_zone, $cpu_fan_level );    
  680.     }
  681.  
  682.     return $cpu_fan_level;
  683. }
  684.  
  685. sub calculate_hd_fan_duty_cycle_PID
  686. {
  687.     my ($hd_max_temp, $hd_ave_temp, $old_hd_duty) = @_;
  688.     # my $hd_duty;
  689.    
  690.     my $temp_error_old = $hd_ave_temp_old - $hd_ave_target;
  691.     my $temp_error = $hd_ave_temp - $hd_ave_target;
  692.  
  693.     if ($hd_max_temp >= $hd_max_allowed_temp  )
  694.     {
  695.         $hd_duty = $hd_fan_duty_high;
  696.         dprint(0, "Drives are too hot, going to $hd_fan_duty_high%\n") unless $old_hd_duty == $hd_duty;
  697.      }
  698.     elsif ($hd_max_temp >= 0 )
  699.     {
  700.         my $temp_error = $hd_ave_temp - $hd_ave_target;
  701.         $integral += $temp_error * $hd_polling_interval / 60;
  702.         my $derivative = ($temp_error - $temp_error_old) * 60 / $hd_polling_interval;
  703.         # my $P = $Kp * $temp_error * $hd_polling_interval / 60;
  704.         # my $I = $Ki * $integral;
  705.         # my $D = $Kd * $derivative;
  706.         $P = $Kp * $temp_error * $hd_polling_interval / 60;
  707.         $I = $Ki * $integral;
  708.         $D = $Kd * $derivative;
  709.         # $hd_duty = $old_hd_duty + $P + $I + $D;
  710.         $hd_duty = $hd_duty + $P + $I + $D;
  711.  
  712.         if ($hd_duty > $hd_fan_duty_high)
  713.         {
  714.             $hd_duty = $hd_fan_duty_high;
  715.         }
  716.         elsif ($hd_duty < $hd_fan_duty_low)
  717.         {
  718.             $hd_duty = $hd_fan_duty_low;
  719.         }
  720.  
  721.         dprint(0, "temperature error = $temp_error\n");
  722.         dprint(1, "PID corrections are P = $P, I = $I and D = $D\n");
  723.         dprint(0, "PID control new duty cycle is $hd_duty%\n") unless $old_hd_duty == $hd_duty;
  724.     }
  725.     else
  726.     {
  727.         $hd_duty = 255;    # 255 is the largest value set_fan_zone_duty_cycle accepts, i.e. full speed
  728.         dprint( 0, "Drive temperature ($hd_max_temp) invalid. Going to 100%\n");
  729.     }
  730.    
  731.     $hd_ave_temp_old = $hd_ave_temp;
  732.    
  733.     if ($cpu_fans_cool_hd == 1 && $hd_duty > $hd_cpu_override_duty_cycle)
  734.     {
  735.         $cpu_fan_override = 1;
  736.     }
  737.     else
  738.     {
  739.         $cpu_fan_override = 0;
  740.     }
  741.     # $hd_duty is retained as float between cycles, so any small incremental adjustments less
  742.     # than 1 will not be lost, but build up until they are large enough to cause a change
  743.     # after the value is truncated with int()
  744.  
  745.     # add 0.5 before truncating with int() to approximate the behaviour of a proper round() function
  746.     return int($hd_duty + 0.5);
  747. }
  748.  
  749. sub build_date_time_string
  750. {
  751.     my $datetimestring = strftime "%F %H:%M:%S", localtime;
  752.    
  753.     return $datetimestring;
  754. }
  755.  
  756. sub build_date_string
  757. {
  758.     my $datestring = strftime "%F", localtime;
  759.    
  760.     return $datestring;
  761. }
  762.  
  763. sub build_time_string
  764. {
  765.     my $timestring = strftime "%H:%M:%S", localtime;
  766.    
  767.     return $timestring;
  768. }
  769.  
  770. sub print_log_header
  771. {
  772.     @hd_list = @_;
  773.     my $timestring = build_time_string();
  774.     my $datestring = build_date_string();
  775.     printf(LOG "\n\nPID Fan Controller Log  ---  Target HD Temperature = %5.2f deg C  ---  PID Control Gains: Kp = %6.3f, Ki = %6.3f, Kd = %5.1f\n         ", $hd_ave_target, $Kp, $Ki, $Kd);
  776.     if ($log_temp_summary_only)
  777.     {
  778.         print LOG "   HD   Min";
  779.     }
  780.     else
  781.     {
  782.         foreach my $item (@hd_list)
  783.         {
  784.             print LOG "     ";
  785.         }
  786.     }
  787.    
  788.     print LOG "  Max   Ave  Temp   Fan   Fan  Fan %   CPU    P      I      D      Fan\n$datestring";
  789.    
  790.     if ($log_temp_summary_only)
  791.     {
  792.         print LOG " Qty  Temp ";
  793.     }
  794.     else
  795.     {
  796.         foreach my $item (@hd_list)
  797.         {
  798.             printf(LOG "%4s ", $item);
  799.         }
  800.     }
  801.    
  802.     print LOG "Temp  Temp   Err  Mode   RPM Old/New Temp  Corr   Corr   Corr    Duty\n";
  803.    
  804.     return @hd_list;
  805. }
  806.  
  807. sub get_fan_ave_speed
  808. {
  809.     my $speed_sum = 0;
  810.     my $fan_count = 0;
  811.     foreach my $fan (@_)
  812.     {
  813.         $speed_sum += get_fan_speed2($fan);
  814.         $fan_count += 1;
  815.     }
  816.    
  817.     my $ave_speed = $fan_count ? sprintf("%i", $speed_sum / $fan_count) : 0;    # avoid dividing by zero when no fans are passed in
  818.    
  819.     return $ave_speed;
  820. }
  821.  
  822. sub dprint
  823. {
  824.     my ( $level,$output) = @_;
  825.    
  826. #    print( "dprintf: debug = $debug, level = $level, output = \"$output\"\n" );
  827.    
  828.     if( $debug > $level )
  829.     {
  830.         my $datestring = build_date_time_string();
  831.         print "$datestring: $output";
  832.     }
  833.  
  834.     return;
  835. }
  836.  
  837. sub dprint_list
  838. {
  839.     my ( $level,$name,@output) = @_;
  840.        
  841.     if( $debug > $level )
  842.     {
  843.         dprint($level,"$name:\n");
  844.  
  845.         foreach my $item (@output)
  846.         {
  847.             dprint( $level, " $item\n");
  848.         }
  849.     }
  850.  
  851.     return;
  852. }
  853.  
  854. sub bail_with_fans_full
  855. {
  856.     dprint( 0, "Setting fans full before bailing!\n");
  857.     set_fan_mode("full");
  858.     die @_;
  859. }
  860.  
  861. sub get_fan_mode
  862. {
  863.     my $command = "$ipmitool raw 0x30 0x45 0";
  864.     my $fan_code = `$command`;
  865.    
  866.     if ($fan_code == 01) { $hd_fan_mode = "Full"; }
  867.     elsif ($fan_code == 00) { $hd_fan_mode = " Std"; }
  868.     elsif ($fan_code == 02) { $hd_fan_mode = " Opt"; }
  869.     elsif ($fan_code == 04) { $hd_fan_mode = " Hvy"; }
  870.    
  871.     return $hd_fan_mode;
  872. }
  873.  
  874. sub get_fan_mode_code
  875. {
  876.     my ( $fan_mode )  = @_;
  877.     my $m;
  878.  
  879.     if(     $fan_mode eq    'standard' )    { $m = 0; }
  880.     elsif(    $fan_mode eq    'full' )     { $m = 1; }
  881.     elsif(    $fan_mode eq    'optimal' )     { $m = 2; }
  882.     elsif(    $fan_mode eq    'heavyio' )    { $m = 4; }
  883.     else                     { die "illegal fan mode: $fan_mode\n" }
  884.  
  885.     dprint( 3, "fanmode: $fan_mode = $m\n");
  886.  
  887.     return $m;
  888. }
  889.  
  890. sub set_fan_mode
  891. {
  892.     my ($fan_mode) = @_;
  893.     my $mode = get_fan_mode_code( $fan_mode );
  894.  
  895.     dprint( 1, "Setting fan mode to $mode ($fan_mode)\n");
  896.     `$ipmitool raw 0x30 0x45 0x01 $mode`;
  897.  
  898.     sleep 5;    #need to give the BMC some breathing room
  899.  
  900.     return;
  901. }    
  902.  
  903. # returns the maximum core temperature from the kernel to determine CPU temperature.
  904. # in my testing I found that the max core temperature was pretty much the same as the IPMI 'CPU Temp'
  905. # value, but it's much quicker to read, and doesn't require X10 IPMI. It also works while the IPMI is rebooting.
  906. sub get_cpu_temp_sysctl
  907. {
  908.     # the sysctl query below (kept for reference) filtered to dev.cpu for efficiency; this version parses the output of `sensors` instead.
  909. #    my $core_temps = `sysctl -a dev.cpu | egrep -E \"dev.cpu\.[0-9]+\.temperature\" | awk '{print \$2}' | sed 's/.\$//'`;
  910.     my $core_temps = `sensors | awk '/Core/{ print int(\$3) }'`;
  911.     chomp($core_temps);
  912.  
  913.     dprint(3,"core_temps:\n$core_temps\n");
  914.  
  915.     my @core_temps_list = split(" ", $core_temps);
  916.    
  917.     dprint_list( 4, "core_temps_list", @core_temps_list );
  918.  
  919.     my $max_core_temp = 0;
  920.    
  921.     foreach my $core_temp (@core_temps_list)
  922.     {
  923.         if( $core_temp )
  924.         {
  925.             dprint( 2, "core_temp = $core_temp C\n");
  926.            
  927.             $max_core_temp = $core_temp if $core_temp > $max_core_temp;
  928.         }
  929.     }
  930.  
  931.     dprint(1, "CPU Temp: $max_core_temp\n");
  932.  
  933.     $last_cpu_temp = $max_core_temp; #possible that this is 0 if there was a fault reading the core temps
  934.  
  935.     return $max_core_temp;
  936. }
  937.  
  938. # reads the IPMI 'CPU1 Temp' field to determine overall CPU temperature
  939. sub get_cpu_temp_ipmi
  940. {
  941.     my $cpu_temp = `$ipmitool sensor get "CPU1 Temp" | awk '/Sensor Reading/{print \$4}'`;
  942.     chomp $cpu_temp;
  943.  
  944.     dprint( 1, "CPU Temp: $cpu_temp\n");
  945.    
  946.     $last_cpu_temp = $cpu_temp; # note, this hasn't been cleaned.
  947.     return $cpu_temp;
  948. }
  949.  
  950. sub decide_cpu_fan_level
  951. {
  952.     my ($cpu_temp, $cpu_fan) = @_;
  953.    
  954.     if ($cpu_fan_override == 1)
  955.     {
  956.         $cpu_fan = "high";
  957.         dprint( 0, "CPU fan set to high to help cool HDs.\n");
  958.     }
  959.     else
  960.     {
  961.         #if cpu_temp evaluates as "0", it's most likely the reading returned rubbish.
  962.         if ($cpu_temp <= 0)
  963.         {
  964.             if( $cpu_temp eq "No")    # "No reading"
  965.             {
  966.                 dprint( 0, "CPU Temp has no reading.\n");
  967.             }
  968.             elsif( $cpu_temp eq "Disabled" )
  969.             {
  970.                 dprint( 0, "CPU Temp reading disabled.\n");
  971.             }
  972.             else
  973.             {
  974.                 dprint( 0, "Unexpected CPU Temp ($cpu_temp).\n");
  975.             }
  976.             dprint( 0, "Assuming worst-case and going high.\n");
  977.             $cpu_fan = "high";
  978.         }
  979.         else
  980.         {
  981.             if( $cpu_temp >= $high_cpu_temp )
  982.             {
  983.                 if( $cpu_fan ne "high" )
  984.                 {
  985.                     dprint( 0, "CPU Temp: $cpu_temp >= $high_cpu_temp, CPU Fan going high.\n");
  986.                 }
  987.                 $cpu_fan = "high";
  988.             }
  989.             elsif( $cpu_temp >= $med_cpu_temp )
  990.             {
  991.                 if( $cpu_fan ne "med" )
  992.                 {
  993.                     dprint( 0, "CPU Temp: $cpu_temp >= $med_cpu_temp, CPU Fan going med.\n");
  994.                 }
  995.                 $cpu_fan = "med";
  996.             }
  997.             elsif( $cpu_temp > $low_cpu_temp && ($cpu_fan eq "high" || $cpu_fan eq "" ) )
  998.             {
  999.                 dprint( 0, "CPU Temp: $cpu_temp dropped below $med_cpu_temp, CPU Fan going med.\n");
  1000.            
  1001.                 $cpu_fan = "med";
  1002.             }
  1003.             elsif( $cpu_temp <= $low_cpu_temp )
  1004.             {
  1005.                 if( $cpu_fan ne "low" )
  1006.                 {
  1007.                     dprint( 0, "CPU Temp: $cpu_temp <= $low_cpu_temp, CPU Fan going low.\n");
  1008.                 }
  1009.                 $cpu_fan = "low";
  1010.             }
  1011.         }
  1012.     }
  1013.        
  1014.     dprint( 1, "CPU Fan: $cpu_fan\n");
  1015.  
  1016.     return $cpu_fan;
  1017. }
  1018.  
  1019. # zone,dutycycle%
  1020. sub set_fan_zone_duty_cycle
  1021. {
  1022.     my ( $zone, $duty ) = @_;
  1023.    
  1024.     if( $zone < 16 || $zone > 17 )
  1025.     {
  1026.         bail_with_fans_full( "Illegal Fan Zone" );
  1027.     }
  1028.  
  1029.     if( $duty < 0 || $duty > 255 )
  1030.     {
  1031.         dprint( 0, "illegal duty cycle, assuming 100%\n");
  1032.         $duty = 255;
  1033.     }
  1034.        
  1035.     dprint( 1, "Setting Zone $zone duty cycle to $duty%\n");
  1036.  
  1037.     `$ipmitool raw 0x30 0x91 0x5A 0x3 $zone $duty`;
  1038.    
  1039.     return;
  1040. }
  1041.  
  1042.  
  1043. sub set_fan_zone_level
  1044. {
  1045.     my ( $fan_zone, $level) = @_;
  1046.     my $duty = 0;
  1047.    
  1048.     #assumes high if not low or med, for safety.
  1049.     if( $level eq "low" )
  1050.     {
  1051.         $duty = $fan_duty_low;
  1052.     }
  1053.     elsif( $level eq "med" )
  1054.     {
  1055.         $duty = $fan_duty_med;
  1056.     }
  1057.     else
  1058.     {
  1059.         $duty = $fan_duty_high;
  1060.     }
  1061.  
  1062.     set_fan_zone_duty_cycle( $fan_zone, $duty );
  1063. }
  1064.  
  1065. sub get_fan_header_by_name
  1066. {
  1067.     my ($fan_name) = @_;
  1068.    
  1069.     if( $fan_name eq "CPU" )
  1070.     {
  1071.         return $cpu_fan_header;
  1072.     }
  1073.     elsif( $fan_name eq "HD" )
  1074.     {
  1075.         return $hd_fan_header;
  1076.     }
  1077.     else
  1078.     {
  1079.         bail_with_fans_full( "No such fan : $fan_name\n" );
  1080.     }
  1081. }
  1082.  
  1083. sub get_fan_speed
  1084. {
  1085.     my ($fan_name) = @_;
  1086.    
  1087.     my $fan = get_fan_header_by_name( $fan_name );
  1088.  
  1089.     my $command = "$ipmitool sdr | grep $fan";
  1090.     dprint( 4, "get fan speed command = $command\n");
  1091.  
  1092.     my $output = `$command`;
  1093.     my @vals = split(" ", $output);
  1094.     my $fan_speed = "$vals[2]";
  1095.  
  1096.     dprint( 3, "fan_speed = $fan_speed\n");
  1097.  
  1098.  
  1099.     if( $fan_speed eq "no" )
  1100.     {
  1101.         dprint( 0, "$fan_name Fan speed: No reading\n");
  1102.         $fan_speed = -1;
  1103.     }
  1104.     elsif( $fan_speed eq "disabled" )
  1105.     {
  1106.         dprint( 0, "$fan_name Fan speed: Disabled\n");
  1107.         $fan_speed = -1;
  1108.  
  1109.     }
  1110.     elsif( $fan_speed > 10000 || $fan_speed < 0 )
  1111.     {
  1112.         dprint( 0, "$fan_name Fan speed: $fan_speed RPM, is nonsensical\n");
  1113.         $fan_speed = -1;
  1114.     }
  1115.     else    
  1116.     {
  1117.         dprint( 1, "$fan_name Fan speed: $fan_speed RPM\n");
  1118.     }
  1119.    
  1120.     return $fan_speed;
  1121. }
  1122.  
  1123. # get fan speed for specified fan header
  1124. sub get_fan_speed2
  1125. {
  1126.     my ($fan_name) = @_;
  1127.    
  1128.     my $command = "$ipmitool sdr | grep $fan_name";
  1129.  
  1130.     my $output = `$command`;
  1131.     my @vals = split(" ", $output);
  1132.     my $fan_speed = "$vals[2]";
  1133.    
  1134.     return $fan_speed;
  1135. }
  1136.  
  1137. sub reset_bmc
  1138. {
  1139.     #when the BMC reboots, it comes back up in its last fan mode... which should be FULL.
  1140.  
  1141.     dprint( 0, "Resetting BMC\n");
  1142.     `$ipmitool bmc reset cold`;
  1143.    
  1144.     return;
  1145. }