Windows Server 2003/2008 R1/R2 DHCP Failover/Watchdog

a guest
Nov 15th, 2012
:: Purpose:         DHCP server Watchdog & Failover script. Read notes below
:: Requirements:    1. Domain administrator credentials & "Log on as a batch job" rights
::                  2. Proper firewall configuration to allow connection
::                  3. Proper permissions on the DHCP backup directory
:: Author:          vocatus on Reddit
:: Version:         1.1c + Added quotes around all variables that could contain paths
::                       + Added full path to SC.exe to prevent failure in the event %PATH% gets corrupted or mangled (this happened in testing)
::                       * Fixed a glitch where pinging an assumed-down primary server could incorrectly conclude that it was back up
::                       - Removed almost every entry of "2>&1" since it's really not needed
::                  1.1b - Changed DATE to CUR_DATE format to be consistent with all other scripts
::                  1.1  - Comments improvement
::                       / Tuned some parameters (ping count on checking)
::                       / Some logging tweaks
::                       / Renamed FAILOVER_DELAY to FAILOVER_RECHECK_DELAY for clarity
::                  1.0d * Some logging tweaks
::                  1.0c * Some logging tweaks
::                  1.0 Initial write
:: Notes:           I wrote this script after failing to find a satisfactory method of performing
::                  watchdog/failover between two Windows Server 2008 R2 DHCP servers.
::
:: Use:             This script has two modes: "Watchdog" and "Failover."
::                  - Watchdog checks the status of the remote DHCP service, logs it, and then fetches the remote DHCP db backup and imports it locally.
::                  - Failover mode is activated when the script cannot confirm that the DHCP service on the remote server is running. The script then
::                    activates the local DHCP server with the latest backup copy it successfully retrieved from the primary server.
::
:: Instructions:
::                  1. Tune the variables in this script to your desired backup location and frequency
::                  2. On the primary server: set the DHCP backup interval to your desired backup frequency. The value is in minutes; I recommend 5 minutes.
::                     You do this by modifying this registry value: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\DHCPServer\Parameters\BackupInterval
::                  3. On the backup server:  set this script to run as a scheduled task. I recommend every 10 minutes.
:: Notice:
::                 !! Make sure the task is set to run only if it isn't already running! Otherwise, during a failover,
::                    Task Scheduler could spawn a new instance of the script every n minutes and you would end up with
::                    hundreds of copies of this script running.
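::
:: Example setup commands for instruction steps 2 and 3. These are only a sketch and were not part of
:: the original script. The task name "DHCP Watchdog", the script path C:\Scripts\dhcp_watchdog.bat and
:: the account MYDOMAIN\svc_dhcp are hypothetical placeholders; adjust them for your environment.
::   On the primary server, from an elevated prompt (the DHCP Server service may need a restart
::   before the new interval takes effect):
::     reg add "HKLM\SYSTEM\CurrentControlSet\Services\DHCPServer\Parameters" /v BackupInterval /t REG_DWORD /d 5 /f
::   On the backup server (the * makes schtasks prompt for the account password):
::     schtasks /Create /TN "DHCP Watchdog" /TR "C:\Scripts\dhcp_watchdog.bat" /SC MINUTE /MO 10 /RU MYDOMAIN\svc_dhcp /RP *
::   Afterwards, open the task's properties and confirm that it will not start a second instance
::   while one is already running.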


:: Prep
:: Turn echo off first so the SETLOCAL command itself isn't printed to the console
@echo off
SETLOCAL
cls
set VERSION=1.1c
title [DHCP Server Watchdog v%VERSION%]


:::::::::::::::
:: Variables :: - Set these. Do not use trailing backslashes (\) in directory names (this is important!).
:::::::::::::::

:: Remote server is the PRIMARY DHCP server we're watching. Use a hostname or IP address.
set REMOTE_SERVER=my-dhcp-server

:: Location of the DHCP backup on the remote primary server
:: Best practice is to leave as default, unless you have a custom backup location.
:: The script builds the backup line like this: \\%REMOTE_SERVER%\c$\%REMOTE_BACKUP_PATH%
set REMOTE_BACKUP_PATH=Windows\system32\dhcp\backup

:: Location of the local backup. I normally copy directly to my backup server's DHCP directory.
:: The script builds the local backup line like this: c:\windows\system32\dhcp\[backup folders]
set LOCAL_BACKUP_PATH=%SystemRoot%\system32\dhcp

:: When a failover is triggered, how many seconds should we wait in between each attempt to contact the primary server again?
set FAILOVER_RECHECK_DELAY=15

:: Log options. Don't put an extension on the log file name. (Important!) The script appends ".log" itself.
set LOGPATH=%SystemDrive%\Logs
set LOGFILE=%COMPUTERNAME%_DHCP_watchdog

:: Max log file size allowed (in bytes) before rotation and archive. I recommend setting this to 2 MB (2097152).
:: Example: 524288 is half a megabyte (512 KB)
set LOG_MAX_SIZE=2097152

:: \/ Don't touch anything below this line. If you do, you will break something.
:: Build the current date as YYYY-MM-DD
set CUR_DATE=%DATE:~-4%-%DATE:~4,2%-%DATE:~7,2%
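:: The CUR_DATE line above assumes the US short date format "Ddd MM/DD/YYYY" (for example
:: "Thu 11/15/2012"); other regional settings will produce a scrambled date. A possible
:: locale-independent alternative, shown only as a sketch and left commented out, reads the
:: date from WMIC instead. LDT is a hypothetical helper variable that is not used anywhere
:: else in this script. To try it, remove the leading "::" from the two lines below.
:: for /f "skip=1" %%D in ('wmic os get LocalDateTime') do if not defined LDT set LDT=%%D
:: set CUR_DATE=%LDT:~0,4%-%LDT:~4,2%-%LDT:~6,2%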


:::::::::::::::::::::::
:: LOG FILE HANDLING :: - This section handles the log file
:::::::::::::::::::::::

:: Create the log directory if it doesn't exist, and start a new log if there isn't one yet
if not exist "%LOGPATH%" mkdir "%LOGPATH%"
if not exist "%LOGPATH%\%LOGFILE%.log" goto new_log

:: Check log size. If it hasn't exceeded our size limit, skip rotation and jump straight to the header for this run
for %%R in ("%LOGPATH%\%LOGFILE%.log") do if %%~zR LSS %LOG_MAX_SIZE% goto newrun

:: However, if the log was too big, go ahead and rotate it.
pushd "%LOGPATH%"
del "%LOGFILE%.ancient" 2>NUL
rename "%LOGFILE%.oldest" "%LOGFILE%.ancient" 2>NUL
rename "%LOGFILE%.older" "%LOGFILE%.oldest" 2>NUL
rename "%LOGFILE%.old" "%LOGFILE%.older" 2>NUL
rename "%LOGFILE%.log" "%LOGFILE%.old" 2>NUL
popd

:: And then create the header for the new log file
:new_log
echo ------------------------------------------------------------------------------------->> "%LOGPATH%\%LOGFILE%.log"
echo  Initializing new DHCP Server Watchdog log on %CUR_DATE% at %TIME%, max log size %LOG_MAX_SIZE% bytes>> "%LOGPATH%\%LOGFILE%.log"
echo ------------------------------------------------------------------------------------->> "%LOGPATH%\%LOGFILE%.log"
echo.>> "%LOGPATH%\%LOGFILE%.log"

:: New run section - if we just launched the script, write a header for this run
:newrun
echo ------------------------------------------------------------------------------------->> "%LOGPATH%\%LOGFILE%.log"
echo  DHCP Server Watchdog v%VERSION%, %CUR_DATE%>> "%LOGPATH%\%LOGFILE%.log"
echo   Running as %USERDOMAIN%\%USERNAME% on %COMPUTERNAME%>> "%LOGPATH%\%LOGFILE%.log"
echo.>> "%LOGPATH%\%LOGFILE%.log"
echo  Job Options>> "%LOGPATH%\%LOGFILE%.log"
echo   Log location:            %LOGPATH%\%LOGFILE%.log>> "%LOGPATH%\%LOGFILE%.log"
echo   Log max size:            %LOG_MAX_SIZE% bytes>> "%LOGPATH%\%LOGFILE%.log"
echo   Watching primary server: %REMOTE_SERVER%>> "%LOGPATH%\%LOGFILE%.log"
echo   Mirroring this DHCP db:  %REMOTE_BACKUP_PATH%>> "%LOGPATH%\%LOGFILE%.log"
echo   Local backup location:   %LOCAL_BACKUP_PATH%>> "%LOGPATH%\%LOGFILE%.log"
echo ------------------------------------------------------------------------------------->> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Starting Watchdog mode.>> "%LOGPATH%\%LOGFILE%.log"
echo.
echo  DHCP Server Watchdog v%VERSION%
echo   Running as: %USERDOMAIN%\%USERNAME% on %COMPUTERNAME%
echo   Log:        %LOGPATH%\%LOGFILE%.log


:::::::::::::::::::
:: WATCHDOG MODE ::
:::::::::::::::::::

:watchdog

:: Ping the server to see if it's up. This result is only logged; the service check below decides whether we fail over.
echo.
echo   Verifying proper operation of DHCP server on %REMOTE_SERVER%, please wait...
echo.
echo %TIME%         Pinging %REMOTE_SERVER%...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Pinging %REMOTE_SERVER%...
ping %REMOTE_SERVER% -n 2 >NUL
if %ERRORLEVEL%==1 echo %TIME% WARNING %REMOTE_SERVER% failed to respond to ping. && echo %TIME% WARNING %REMOTE_SERVER% failed to respond to ping.>> "%LOGPATH%\%LOGFILE%.log"
if not %ERRORLEVEL%==1 echo %TIME% SUCCESS %REMOTE_SERVER% responded to ping. && echo %TIME% SUCCESS %REMOTE_SERVER% responded to ping.>> "%LOGPATH%\%LOGFILE%.log"

:: Check & Log
echo %TIME%         Checking DHCP server status on %REMOTE_SERVER%...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Checking DHCP server status on %REMOTE_SERVER%...

:: Reset ERRORLEVEL back to 0
ver > NUL

:: Use "SC" to check the status of "Dhcpserver" service, find the "RUNNING" state, and act accordingly based on the return code
%WINDIR%\System32\sc.exe \\%REMOTE_SERVER% query Dhcpserver | find "RUNNING" >NUL
if %ERRORLEVEL%==0 echo %TIME% SUCCESS The DHCP service is running on %REMOTE_SERVER%.>> "%LOGPATH%\%LOGFILE%.log"
if %ERRORLEVEL%==0 echo %TIME% SUCCESS The DHCP service is running on %REMOTE_SERVER%.

:: This section only executes if the test failed.
if not %ERRORLEVEL%==0 (
    echo %TIME% FAILURE The DHCP service is not running on %REMOTE_SERVER%.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME%         Activating failover procedure. Local DHCP server will be initialized using most recent successful backup.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME% FAILURE The DHCP service is not running on %REMOTE_SERVER%.
    echo %TIME%         Activating failover procedure. Local DHCP server will be initialized using most recent successful backup.
    goto failover
    )

:: Reset ERRORLEVEL back to 0
ver > NUL

:: Fetch
echo %TIME%         Fetching DHCP database backup from %REMOTE_SERVER%...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Fetching DHCP database backup from %REMOTE_SERVER%...
xcopy "\\%REMOTE_SERVER%\c$\%REMOTE_BACKUP_PATH%\*" "%LOCAL_BACKUP_PATH%\backup_new_pending\" /E /Y /Q >NUL

:: If the copy FAILED, this executes. We log the problem and exit so that the import below really is skipped.
:: The failure test comes first because the rotation commands below overwrite the ERRORLEVEL set by xcopy.
if not %ERRORLEVEL%==0 (
    echo %TIME% WARNING There was an error copying the backup from %REMOTE_SERVER%.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME%         You may want to look into this since we were able to check the DHCPserver service status but the file copy failed.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME%         Skipping new database import due to copy failure.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME%         Job complete with errors.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME% WARNING There was an error copying the backup from %REMOTE_SERVER%.
    echo %TIME%         You may want to look into this since we were able to check the DHCPserver service status but the file copy failed.
    echo %TIME%         Skipping new database import due to copy failure.
    echo %TIME%         Job complete with errors.
    goto EOF
    )

:: If the copy SUCCEEDED, this executes
echo %TIME% SUCCESS Backup fetched from %REMOTE_SERVER%.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME% SUCCESS Backup fetched from %REMOTE_SERVER%.
echo %TIME%         Rotating database backups...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Rotating database backups...
:: Rotate backups and use newest copy
if exist "%LOCAL_BACKUP_PATH%\backup5" rmdir /S /Q "%LOCAL_BACKUP_PATH%\backup5"
if exist "%LOCAL_BACKUP_PATH%\backup4" move /Y "%LOCAL_BACKUP_PATH%\backup4" "%LOCAL_BACKUP_PATH%\backup5" >NUL
if exist "%LOCAL_BACKUP_PATH%\backup3" move /Y "%LOCAL_BACKUP_PATH%\backup3" "%LOCAL_BACKUP_PATH%\backup4" >NUL
if exist "%LOCAL_BACKUP_PATH%\backup2" move /Y "%LOCAL_BACKUP_PATH%\backup2" "%LOCAL_BACKUP_PATH%\backup3" >NUL
if exist "%LOCAL_BACKUP_PATH%\backup" move /Y "%LOCAL_BACKUP_PATH%\backup" "%LOCAL_BACKUP_PATH%\backup2" >NUL
move /Y "%LOCAL_BACKUP_PATH%\backup_new_pending" "%LOCAL_BACKUP_PATH%\backup" >NUL
echo %TIME%         Database backups rotated.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Database backups rotated.

:: Import database
echo %TIME%         Starting local DHCP server to import new database...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Starting local DHCP server to import new database...
    net start Dhcpserver
echo %TIME%         Local DHCP server running. Performing import...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Local DHCP server running. Performing import...
    netsh dhcp server restore "%LOCAL_BACKUP_PATH%\backup"
echo %TIME%         Import complete.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Import complete.
echo %TIME%         Stopping local DHCP server...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Stopping local DHCP server...
    net stop Dhcpserver
echo %TIME%         Local DHCP server stopped.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Local DHCP server stopped.
echo %TIME% SUCCESS Job complete, DHCP database backed up and ready for use. Exiting.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME% SUCCESS Job complete, DHCP database backed up and ready for use. Exiting.
goto EOF


:::::::::::::::::::
:: FAILOVER MODE ::
:::::::::::::::::::

:failover
:: Log this AND display to console
echo %TIME% WARNING Failover activated.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Starting local DHCP server using most recent successful backup...>> "%LOGPATH%\%LOGFILE%.log"
echo.
echo %TIME% WARNING Could not contact primary DHCP server "%REMOTE_SERVER%." Failover activated.
echo %TIME%         Starting local DHCP server using most recent successful backup...
echo.
    net start Dhcpserver
echo %TIME%         Local DHCP server started.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Entering monitoring loop. Checking if %REMOTE_SERVER% is back up every %FAILOVER_RECHECK_DELAY% seconds...>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME%         Local DHCP server started.
echo %TIME%         Entering monitoring loop. Checking if %REMOTE_SERVER% is back up every %FAILOVER_RECHECK_DELAY% seconds...


:failover_loop
:: First we ping the server
ping %REMOTE_SERVER% -n 3 >NUL
:: If no ping response, this section executes
IF NOT %ERRORLEVEL%==0 (
    echo %TIME% FAILURE No ping response from %REMOTE_SERVER%. Waiting %FAILOVER_RECHECK_DELAY% seconds to check again.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME% FAILURE No ping response from %REMOTE_SERVER%. Waiting %FAILOVER_RECHECK_DELAY% seconds to check again.
    ping localhost -n %FAILOVER_RECHECK_DELAY% >NUL
    goto failover_loop
    )

:: If yes ping response, this section executes
:: This declaration is required to get the nested IF ERRORLEVEL test to function correctly
SETLOCAL ENABLEDELAYEDEXPANSION
if not %ERRORLEVEL%==1 (
    echo %TIME% NOTICE %REMOTE_SERVER% is responding to pings.>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME% NOTICE %REMOTE_SERVER% is responding to pings.
    echo %TIME%        Checking DHCP server status on %REMOTE_SERVER%...>> "%LOGPATH%\%LOGFILE%.log"
    echo %TIME%        Checking DHCP server status on %REMOTE_SERVER%...

    rem This section checks to see if the Dhcpserver service is back up and acts accordingly
    %WINDIR%\System32\sc.exe \\%REMOTE_SERVER% query Dhcpserver | find "RUNNING" >NUL
        rem The exclamation points around ERRORLEVEL make it expand when this line runs, not when the
        rem enclosing block is parsed, so it reflects the result of the SC query and not the earlier ping
        if !ERRORLEVEL!==0 (
                echo %TIME% SUCCESS The DHCP service is running on %REMOTE_SERVER%.>> "%LOGPATH%\%LOGFILE%.log"
                echo %TIME% SUCCESS The DHCP service is running on %REMOTE_SERVER%.
                echo %TIME%         The primary DHCP server %REMOTE_SERVER% is back up. Stopping local DHCP service...>> "%LOGPATH%\%LOGFILE%.log"
                echo %TIME%         The primary DHCP server %REMOTE_SERVER% is back up. Stopping local DHCP service...
                net stop Dhcpserver
                echo %TIME%         Recovery complete. Exiting.>> "%LOGPATH%\%LOGFILE%.log"
                echo %TIME%         Recovery complete. Exiting.
                goto EOF
                )
    )
ENDLOCAL

:: If the host responds to pings but the DHCP service isn't running, this executes
echo %TIME% FAILURE %REMOTE_SERVER% is responding to pings, but DHCP isn't responding (yet?). Will try again in %FAILOVER_RECHECK_DELAY% seconds.>> "%LOGPATH%\%LOGFILE%.log"
echo %TIME% FAILURE %REMOTE_SERVER% is responding to pings, but DHCP isn't responding (yet?). Will try again in %FAILOVER_RECHECK_DELAY% seconds.
:: Actually wait before rechecking, as the message above promises
ping localhost -n %FAILOVER_RECHECK_DELAY% >NUL
goto failover_loop

:: Common exit point: finish this run's log entry with a blank line, then clean up
:EOF
echo.>> "%LOGPATH%\%LOGFILE%.log"
ENDLOCAL