Cogger

Anti-Spam_Find.sh

Nov 22nd, 2022
146
0
Never
Not a member of Pastebin yet? Sign Up, it unlocks many cool features!
Bash 63.54 KB | None | 0 0
  1. #!/bin/bash
  2.  
  3. # $Id: is_spammer.bash,v 1.12.2.11 2004/10/01 21:42:33 mszick Exp $
  4. # Above line is RCS info.
  5.  
  6. # The latest version of this script is available from http://www.morethan.org.
  7. #
  8. # Spammer-identification
  9. # by Michael S. Zick
  10. # Used in the ABS Guide with permission.
  11.  
  12.  
  13.  
  14. #######################################################
  15. # Documentation
  16. # See also "Quickstart" at end of script.
  17. #######################################################
  18.  
  19. :<<-'__is_spammer_Doc_'
  20.  
  21.     Copyright (c) Michael S. Zick, 2004
  22.     License: Unrestricted reuse in any form, for any purpose.
  23.     Warranty: None -{Its a script; the user is on their own.}-
  24.  
  25. Impatient?
  26.     Application code: goto "# # # Hunt the Spammer' program code # # #"
  27.     Example output: ":<<-'_is_spammer_outputs_'"
  28.     How to use: Enter script name without arguments.
  29.                 Or goto "Quickstart" at end of script.
  30.  
  31. Provides
  32.     Given a domain name or IP(v4) address as input:
  33.  
  34.     Does an exhaustive set of queries to find the associated
  35.     network resources (short of recursing into TLDs).
  36.  
  37.     Checks the IP(v4) addresses found against Blacklist
  38.     nameservers.
  39.  
  40.     If found to be a blacklisted IP(v4) address,
  41.     reports the blacklist text records.
  42.     (Usually hyper-links to the specific report.)
  43.  
  44. Requires
  45.     A working Internet connection.
  46.     (Exercise: Add check and/or abort if not on-line when running script.)
  47.     Bash with arrays (2.05b+).
  48.  
  49.     The external program 'dig' --
  50.     a utility program provided with the 'bind' set of programs.
  51.     Specifically, the version which is part of Bind series 9.x
  52.     See: http://www.isc.org
  53.  
  54.     All usages of 'dig' are limited to wrapper functions,
  55.     which may be rewritten as required.
  56.     See: dig_wrappers.bash for details.
  57.          ("Additional documentation" -- below)
  58.  
  59. Usage
  60.     Script requires a single argument, which may be:
  61.     1) A domain name;
  62.     2) An IP(v4) address;
  63.     3) A filename, with one name or address per line.
  64.  
  65.     Script accepts an optional second argument, which may be:
  66.     1) A Blacklist server name;
  67.     2) A filename, with one Blacklist server name per line.
  68.  
  69.     If the second argument is not provided, the script uses
  70.     a built-in set of (free) Blacklist servers.
  71.  
  72.     See also, the Quickstart at the end of this script (after 'exit').
  73.  
  74. Return Codes
  75.     0 - All OK
  76.     1 - Script failure
  77.     2 - Something is Blacklisted
  78.  
  79. Optional environment variables
  80.     SPAMMER_TRACE
  81.         If set to a writable file,
  82.         script will log an execution flow trace.
  83.  
  84.     SPAMMER_DATA
  85.         If set to a writable file, script will dump its
  86.         discovered data in the form of GraphViz file.
  87.         See: http://www.research.att.com/sw/tools/graphviz
  88.  
  89.     SPAMMER_LIMIT
  90.         Limits the depth of resource tracing.
  91.  
  92.         Default is 2 levels.
  93.  
  94.         A setting of 0 (zero) means 'unlimited' . . .
  95.           Caution: script might recurse the whole Internet!
  96.  
  97.         A limit of 1 or 2 is most useful when processing
  98.         a file of domain names and addresses.
  99.         A higher limit can be useful when hunting spam gangs.
  100.  
  101.  
  102. Additional documentation
  103.     Download the archived set of scripts
  104.     explaining and illustrating the function contained within this script.
  105.     http://bash.deta.in/mszick_clf.tar.bz2
  106.  
  107.  
  108. Study notes
  109.     This script uses a large number of functions.
  110.     Nearly all general functions have their own example script.
  111.     Each of the example scripts have tutorial level comments.
  112.  
  113. Scripting project
  114.     Add support for IP(v6) addresses.
  115.     IP(v6) addresses are recognized but not processed.
  116.  
  117. Advanced project
  118.     Add the reverse lookup detail to the discovered information.
  119.  
  120.     Report the delegation chain and abuse contacts.
  121.  
  122.     Modify the GraphViz file output to include the
  123.     newly discovered information.
  124.  
  125. __is_spammer_Doc_
  126.  
  127. #######################################################
  128.  
  129.  
  130.  
  131.  
  132. #### Special IFS settings used for string parsing. ####
  133.  
  134. # Whitespace == :Space:Tab:Line Feed:Carriage Return:
  135. WSP_IFS=$'\x20'$'\x09'$'\x0A'$'\x0D'
  136.  
  137. # No Whitespace == Line Feed:Carriage Return
  138. NO_WSP=$'\x0A'$'\x0D'
  139.  
  140. # Field separator for dotted decimal IP addresses
  141. ADR_IFS=${NO_WSP}'.'
  142.  
  143. # Array to dotted string conversions
  144. DOT_IFS='.'${WSP_IFS}
  145.  
  146. # # # Pending operations stack machine # # #
  147. # This set of functions described in func_stack.bash.
  148. # (See "Additional documentation" above.)
  149. # # #
  150.  
  151. # Global stack of pending operations.
  152. declare -f -a _pending_
  153. # Global sentinel for stack runners
  154. declare -i _p_ctrl_
  155. # Global holder for currently executing function
  156. declare -f _pend_current_
  157.  
  158. # # # Debug version only - remove for regular use # # #
  159. #
  160. # The function stored in _pend_hook_ is called
  161. # immediately before each pending function is
  162. # evaluated.  Stack clean, _pend_current_ set.
  163. #
  164. # This thingy demonstrated in pend_hook.bash.
  165. declare -f _pend_hook_
  166. # # #
  167.  
  168. # The do nothing function
  169. pend_dummy() { : ; }
  170.  
  171. # Clear and initialize the function stack.
  172. pend_init() {
  173.     unset _pending_[@]
  174.     pend_func pend_stop_mark
  175.     _pend_hook_='pend_dummy'  # Debug only.
  176. }
  177.  
  178. # Discard the top function on the stack.
  179. pend_pop() {
  180.     if [ ${#_pending_[@]} -gt 0 ]
  181.     then
  182.         local -i _top_
  183.         _top_=${#_pending_[@]}-1
  184.         unset _pending_[$_top_]
  185.     fi
  186. }
  187.  
  188. # pend_func function_name [$(printf '%q\n' arguments)]
  189. pend_func() {
  190.     local IFS=${NO_WSP}
  191.     set -f
  192.     _pending_[${#_pending_[@]}]=$@
  193.     set +f
  194. }
  195.  
  196. # The function which stops the release:
  197. pend_stop_mark() {
  198.     _p_ctrl_=0
  199. }
  200.  
  201. pend_mark() {
  202.     pend_func pend_stop_mark
  203. }
  204.  
  205. # Execute functions until 'pend_stop_mark' . . .
  206. pend_release() {
  207.     local -i _top_             # Declare _top_ as integer.
  208.     _p_ctrl_=${#_pending_[@]}
  209.     while [ ${_p_ctrl_} -gt 0 ]
  210.     do
  211.        _top_=${#_pending_[@]}-1
  212.        _pend_current_=${_pending_[$_top_]}
  213.        unset _pending_[$_top_]
  214.        $_pend_hook_            # Debug only.
  215.        eval $_pend_current_
  216.     done
  217. }
  218.  
  219. # Drop functions until 'pend_stop_mark' . . .
  220. pend_drop() {
  221.     local -i _top_
  222.     local _pd_ctrl_=${#_pending_[@]}
  223.     while [ ${_pd_ctrl_} -gt 0 ]
  224.     do
  225.        _top_=$_pd_ctrl_-1
  226.        if [ "${_pending_[$_top_]}" == 'pend_stop_mark' ]
  227.        then
  228.            unset _pending_[$_top_]
  229.            break
  230.        else
  231.            unset _pending_[$_top_]
  232.            _pd_ctrl_=$_top_
  233.        fi
  234.     done
  235.     if [ ${#_pending_[@]} -eq 0 ]
  236.     then
  237.         pend_func pend_stop_mark
  238.     fi
  239. }
  240.  
  241. #### Array editors ####
  242.  
  243. # This function described in edit_exact.bash.
  244. # (See "Additional documentation," above.)
  245. # edit_exact <excludes_array_name> <target_array_name>
  246. edit_exact() {
  247.     [ $# -eq 2 ] ||
  248.     [ $# -eq 3 ] || return 1
  249.     local -a _ee_Excludes
  250.     local -a _ee_Target
  251.     local _ee_x
  252.     local _ee_t
  253.     local IFS=${NO_WSP}
  254.     set -f
  255.     eval _ee_Excludes=\( \$\{$1\[@\]\} \)
  256.     eval _ee_Target=\( \$\{$2\[@\]\} \)
  257.     local _ee_len=${#_ee_Target[@]}     # Original length.
  258.     local _ee_cnt=${#_ee_Excludes[@]}   # Exclude list length.
  259.     [ ${_ee_len} -ne 0 ] || return 0    # Can't edit zero length.
  260.     [ ${_ee_cnt} -ne 0 ] || return 0    # Can't edit zero length.
  261.     for (( x = 0; x < ${_ee_cnt} ; x++ ))
  262.     do
  263.         _ee_x=${_ee_Excludes[$x]}
  264.         for (( n = 0 ; n < ${_ee_len} ; n++ ))
  265.         do
  266.             _ee_t=${_ee_Target[$n]}
  267.             if [ x"${_ee_t}" == x"${_ee_x}" ]
  268.             then
  269.                 unset _ee_Target[$n]     # Discard match.
  270.                 [ $# -eq 2 ] && break    # If 2 arguments, then done.
  271.             fi
  272.         done
  273.     done
  274.     eval $2=\( \$\{_ee_Target\[@\]\} \)
  275.     set +f
  276.     return 0
  277. }
  278.  
  279. # This function described in edit_by_glob.bash.
  280. # edit_by_glob <excludes_array_name> <target_array_name>
  281. edit_by_glob() {
  282.     [ $# -eq 2 ] ||
  283.     [ $# -eq 3 ] || return 1
  284.     local -a _ebg_Excludes
  285.     local -a _ebg_Target
  286.     local _ebg_x
  287.     local _ebg_t
  288.     local IFS=${NO_WSP}
  289.     set -f
  290.     eval _ebg_Excludes=\( \$\{$1\[@\]\} \)
  291.     eval _ebg_Target=\( \$\{$2\[@\]\} \)
  292.     local _ebg_len=${#_ebg_Target[@]}
  293.     local _ebg_cnt=${#_ebg_Excludes[@]}
  294.     [ ${_ebg_len} -ne 0 ] || return 0
  295.     [ ${_ebg_cnt} -ne 0 ] || return 0
  296.     for (( x = 0; x < ${_ebg_cnt} ; x++ ))
  297.     do
  298.         _ebg_x=${_ebg_Excludes[$x]}
  299.         for (( n = 0 ; n < ${_ebg_len} ; n++ ))
  300.         do
  301.             [ $# -eq 3 ] && _ebg_x=${_ebg_x}'*'  #  Do prefix edit
  302.             if [ ${_ebg_Target[$n]:=} ]          #+ if defined & set.
  303.             then
  304.                 _ebg_t=${_ebg_Target[$n]/#${_ebg_x}/}
  305.                 [ ${#_ebg_t} -eq 0 ] && unset _ebg_Target[$n]
  306.             fi
  307.         done
  308.     done
  309.     eval $2=\( \$\{_ebg_Target\[@\]\} \)
  310.     set +f
  311.     return 0
  312. }
  313.  
  314. # This function described in unique_lines.bash.
  315. # unique_lines <in_name> <out_name>
  316. unique_lines() {
  317.     [ $# -eq 2 ] || return 1
  318.     local -a _ul_in
  319.     local -a _ul_out
  320.     local -i _ul_cnt
  321.     local -i _ul_pos
  322.     local _ul_tmp
  323.     local IFS=${NO_WSP}
  324.     set -f
  325.     eval _ul_in=\( \$\{$1\[@\]\} \)
  326.     _ul_cnt=${#_ul_in[@]}
  327.     for (( _ul_pos = 0 ; _ul_pos < ${_ul_cnt} ; _ul_pos++ ))
  328.     do
  329.         if [ ${_ul_in[${_ul_pos}]:=} ]      # If defined & not empty
  330.         then
  331.             _ul_tmp=${_ul_in[${_ul_pos}]}
  332.             _ul_out[${#_ul_out[@]}]=${_ul_tmp}
  333.             for (( zap = _ul_pos ; zap < ${_ul_cnt} ; zap++ ))
  334.             do
  335.                 [ ${_ul_in[${zap}]:=} ] &&
  336.                 [ 'x'${_ul_in[${zap}]} == 'x'${_ul_tmp} ] &&
  337.                     unset _ul_in[${zap}]
  338.             done
  339.         fi
  340.     done
  341.     eval $2=\( \$\{_ul_out\[@\]\} \)
  342.     set +f
  343.     return 0
  344. }
  345.  
  346. # This function described in char_convert.bash.
  347. # to_lower <string>
  348. to_lower() {
  349.     [ $# -eq 1 ] || return 1
  350.     local _tl_out
  351.     _tl_out=${1//A/a}
  352.     _tl_out=${_tl_out//B/b}
  353.     _tl_out=${_tl_out//C/c}
  354.     _tl_out=${_tl_out//D/d}
  355.     _tl_out=${_tl_out//E/e}
  356.     _tl_out=${_tl_out//F/f}
  357.     _tl_out=${_tl_out//G/g}
  358.     _tl_out=${_tl_out//H/h}
  359.     _tl_out=${_tl_out//I/i}
  360.     _tl_out=${_tl_out//J/j}
  361.     _tl_out=${_tl_out//K/k}
  362.     _tl_out=${_tl_out//L/l}
  363.     _tl_out=${_tl_out//M/m}
  364.     _tl_out=${_tl_out//N/n}
  365.     _tl_out=${_tl_out//O/o}
  366.     _tl_out=${_tl_out//P/p}
  367.     _tl_out=${_tl_out//Q/q}
  368.     _tl_out=${_tl_out//R/r}
  369.     _tl_out=${_tl_out//S/s}
  370.     _tl_out=${_tl_out//T/t}
  371.     _tl_out=${_tl_out//U/u}
  372.     _tl_out=${_tl_out//V/v}
  373.     _tl_out=${_tl_out//W/w}
  374.     _tl_out=${_tl_out//X/x}
  375.     _tl_out=${_tl_out//Y/y}
  376.     _tl_out=${_tl_out//Z/z}
  377.     echo ${_tl_out}
  378.     return 0
  379. }
  380.  
  381. #### Application helper functions ####
  382.  
  383. # Not everybody uses dots as separators (APNIC, for example).
  384. # This function described in to_dot.bash
  385. # to_dot <string>
  386. to_dot() {
  387.     [ $# -eq 1 ] || return 1
  388.     echo ${1//[#|@|%]/.}
  389.     return 0
  390. }
  391.  
  392. # This function described in is_number.bash.
  393. # is_number <input>
  394. is_number() {
  395.     [ "$#" -eq 1 ]    || return 1  # is blank?
  396.     [ x"$1" == 'x0' ] && return 0  # is zero?
  397.     local -i tst
  398.     let tst=$1 2>/dev/null         # else is numeric!
  399.     return $?
  400. }
  401.  
  402. # This function described in is_address.bash.
  403. # is_address <input>
  404. is_address() {
  405.     [ $# -eq 1 ] || return 1    # Blank ==> false
  406.     local -a _ia_input
  407.     local IFS=${ADR_IFS}
  408.     _ia_input=( $1 )
  409.     if  [ ${#_ia_input[@]} -eq 4 ]  &&
  410.         is_number ${_ia_input[0]}   &&
  411.         is_number ${_ia_input[1]}   &&
  412.         is_number ${_ia_input[2]}   &&
  413.         is_number ${_ia_input[3]}   &&
  414.         [ ${_ia_input[0]} -lt 256 ] &&
  415.         [ ${_ia_input[1]} -lt 256 ] &&
  416.         [ ${_ia_input[2]} -lt 256 ] &&
  417.         [ ${_ia_input[3]} -lt 256 ]
  418.     then
  419.         return 0
  420.     else
  421.         return 1
  422.     fi
  423. }
  424.  
  425. #  This function described in split_ip.bash.
  426. #  split_ip <IP_address>
  427. #+ <array_name_norm> [<array_name_rev>]
  428. split_ip() {
  429.     [ $# -eq 3 ] ||              #  Either three
  430.     [ $# -eq 2 ] || return 1     #+ or two arguments
  431.     local -a _si_input
  432.     local IFS=${ADR_IFS}
  433.     _si_input=( $1 )
  434.     IFS=${WSP_IFS}
  435.     eval $2=\(\ \$\{_si_input\[@\]\}\ \)
  436.     if [ $# -eq 3 ]
  437.     then
  438.         # Build query order array.
  439.         local -a _dns_ip
  440.         _dns_ip[0]=${_si_input[3]}
  441.         _dns_ip[1]=${_si_input[2]}
  442.         _dns_ip[2]=${_si_input[1]}
  443.         _dns_ip[3]=${_si_input[0]}
  444.         eval $3=\(\ \$\{_dns_ip\[@\]\}\ \)
  445.     fi
  446.     return 0
  447. }
  448.  
  449. # This function described in dot_array.bash.
  450. # dot_array <array_name>
  451. dot_array() {
  452.     [ $# -eq 1 ] || return 1     # Single argument required.
  453.     local -a _da_input
  454.     eval _da_input=\(\ \$\{$1\[@\]\}\ \)
  455.     local IFS=${DOT_IFS}
  456.     local _da_output=${_da_input[@]}
  457.     IFS=${WSP_IFS}
  458.     echo ${_da_output}
  459.     return 0
  460. }
  461.  
  462. # This function described in file_to_array.bash
  463. # file_to_array <file_name> <line_array_name>
  464. file_to_array() {
  465.     [ $# -eq 2 ] || return 1  # Two arguments required.
  466.     local IFS=${NO_WSP}
  467.     local -a _fta_tmp_
  468.     _fta_tmp_=( $(cat $1) )
  469.     eval $2=\( \$\{_fta_tmp_\[@\]\} \)
  470.     return 0
  471. }
  472.  
  473. #  Columnized print of an array of multi-field strings.
  474. #  col_print <array_name> <min_space> <
  475. #+ tab_stop [tab_stops]>
  476. col_print() {
  477.     [ $# -gt 2 ] || return 0
  478.     local -a _cp_inp
  479.     local -a _cp_spc
  480.     local -a _cp_line
  481.     local _cp_min
  482.     local _cp_mcnt
  483.     local _cp_pos
  484.     local _cp_cnt
  485.     local _cp_tab
  486.     local -i _cp
  487.     local -i _cpf
  488.     local _cp_fld
  489.     # WARNING: FOLLOWING LINE NOT BLANK -- IT IS QUOTED SPACES.
  490.     local _cp_max='                                                            '
  491.     set -f
  492.     local IFS=${NO_WSP}
  493.     eval _cp_inp=\(\ \$\{$1\[@\]\}\ \)
  494.     [ ${#_cp_inp[@]} -gt 0 ] || return 0 # Empty is easy.
  495.     _cp_mcnt=$2
  496.     _cp_min=${_cp_max:1:${_cp_mcnt}}
  497.     shift
  498.     shift
  499.     _cp_cnt=$#
  500.     for (( _cp = 0 ; _cp < _cp_cnt ; _cp++ ))
  501.     do
  502.         _cp_spc[${#_cp_spc[@]}]="${_cp_max:2:$1}" #"
  503.         shift
  504.     done
  505.     _cp_cnt=${#_cp_inp[@]}
  506.     for (( _cp = 0 ; _cp < _cp_cnt ; _cp++ ))
  507.     do
  508.         _cp_pos=1
  509.         IFS=${NO_WSP}$'\x20'
  510.         _cp_line=( ${_cp_inp[${_cp}]} )
  511.         IFS=${NO_WSP}
  512.         for (( _cpf = 0 ; _cpf < ${#_cp_line[@]} ; _cpf++ ))
  513.         do
  514.             _cp_tab=${_cp_spc[${_cpf}]:${_cp_pos}}
  515.             if [ ${#_cp_tab} -lt ${_cp_mcnt} ]
  516.             then
  517.                 _cp_tab="${_cp_min}"
  518.             fi
  519.             echo -n "${_cp_tab}"
  520.             (( _cp_pos = ${_cp_pos} + ${#_cp_tab} ))
  521.             _cp_fld="${_cp_line[${_cpf}]}"
  522.             echo -n ${_cp_fld}
  523.             (( _cp_pos = ${_cp_pos} + ${#_cp_fld} ))
  524.         done
  525.         echo
  526.     done
  527.     set +f
  528.     return 0
  529. }
  530.  
  531. # # # # 'Hunt the Spammer' data flow # # # #
  532.  
  533. # Application return code
  534. declare -i _hs_RC
  535.  
  536. # Original input, from which IP addresses are removed
  537. # After which, domain names to check
  538. declare -a uc_name
  539.  
  540. # Original input IP addresses are moved here
  541. # After which, IP addresses to check
  542. declare -a uc_address
  543.  
  544. # Names against which address expansion run
  545. # Ready for name detail lookup
  546. declare -a chk_name
  547.  
  548. # Addresses against which name expansion run
  549. # Ready for address detail lookup
  550. declare -a chk_address
  551.  
  552. #  Recursion is depth-first-by-name.
  553. #  The expand_input_address maintains this list
  554. #+ to prohibit looking up addresses twice during
  555. #+ domain name recursion.
  556. declare -a been_there_addr
  557. been_there_addr=( '127.0.0.1' ) # Whitelist localhost
  558.  
  559. # Names which we have checked (or given up on)
  560. declare -a known_name
  561.  
  562. # Addresses which we have checked (or given up on)
  563. declare -a known_address
  564.  
  565. #  List of zero or more Blacklist servers to check.
  566. #  Each 'known_address' will be checked against each server,
  567. #+ with negative replies and failures suppressed.
  568. declare -a list_server
  569.  
  570. # Indirection limit - set to zero == no limit
  571. indirect=${SPAMMER_LIMIT:=2}
  572.  
  573. # # # # 'Hunt the Spammer' information output data # # # #
  574.  
  575. # Any domain name may have multiple IP addresses.
  576. # Any IP address may have multiple domain names.
  577. # Therefore, track unique address-name pairs.
  578. declare -a known_pair
  579. declare -a reverse_pair
  580.  
  581. #  In addition to the data flow variables; known_address
  582. #+ known_name and list_server, the following are output to the
  583. #+ external graphics interface file.
  584.  
  585. # Authority chain, parent -> SOA fields.
  586. declare -a auth_chain
  587.  
  588. # Reference chain, parent name -> child name
  589. declare -a ref_chain
  590.  
  591. # DNS chain - domain name -> address
  592. declare -a name_address
  593.  
  594. # Name and service pairs - domain name -> service
  595. declare -a name_srvc
  596.  
  597. # Name and resource pairs - domain name -> Resource Record
  598. declare -a name_resource
  599.  
  600. # Parent and Child pairs - parent name -> child name
  601. # This MAY NOT be the same as the ref_chain followed!
  602. declare -a parent_child
  603.  
  604. # Address and Blacklist hit pairs - address->server
  605. declare -a address_hits
  606.  
  607. # Dump interface file data
  608. declare -f _dot_dump
  609. _dot_dump=pend_dummy   # Initially a no-op
  610.  
  611. #  Data dump is enabled by setting the environment variable SPAMMER_DATA
  612. #+ to the name of a writable file.
  613. declare _dot_file
  614.  
  615. # Helper function for the dump-to-dot-file function
  616. # dump_to_dot <array_name> <prefix>
  617. dump_to_dot() {
  618.     local -a _dda_tmp
  619.     local -i _dda_cnt
  620.     local _dda_form='    '${2}'%04u %s\n'
  621.     local IFS=${NO_WSP}
  622.     eval _dda_tmp=\(\ \$\{$1\[@\]\}\ \)
  623.     _dda_cnt=${#_dda_tmp[@]}
  624.     if [ ${_dda_cnt} -gt 0 ]
  625.     then
  626.         for (( _dda = 0 ; _dda < _dda_cnt ; _dda++ ))
  627.         do
  628.             printf "${_dda_form}" \
  629.                    "${_dda}" "${_dda_tmp[${_dda}]}" >>${_dot_file}
  630.         done
  631.     fi
  632. }
  633.  
  634. # Which will also set _dot_dump to this function . . .
  635. dump_dot() {
  636.     local -i _dd_cnt
  637.     echo '# Data vintage: '$(date -R) >${_dot_file}
  638.     echo '# ABS Guide: is_spammer.bash; v2, 2004-msz' >>${_dot_file}
  639.     echo >>${_dot_file}
  640.     echo 'digraph G {' >>${_dot_file}
  641.  
  642.     if [ ${#known_name[@]} -gt 0 ]
  643.     then
  644.         echo >>${_dot_file}
  645.         echo '# Known domain name nodes' >>${_dot_file}
  646.         _dd_cnt=${#known_name[@]}
  647.         for (( _dd = 0 ; _dd < _dd_cnt ; _dd++ ))
  648.         do
  649.             printf '    N%04u [label="%s"] ;\n' \
  650.                    "${_dd}" "${known_name[${_dd}]}" >>${_dot_file}
  651.         done
  652.     fi
  653.  
  654.     if [ ${#known_address[@]} -gt 0 ]
  655.     then
  656.         echo >>${_dot_file}
  657.         echo '# Known address nodes' >>${_dot_file}
  658.         _dd_cnt=${#known_address[@]}
  659.         for (( _dd = 0 ; _dd < _dd_cnt ; _dd++ ))
  660.         do
  661.             printf '    A%04u [label="%s"] ;\n' \
  662.                    "${_dd}" "${known_address[${_dd}]}" >>${_dot_file}
  663.         done
  664.     fi
  665.  
  666.     echo                                   >>${_dot_file}
  667.     echo '/*'                              >>${_dot_file}
  668.     echo ' * Known relationships :: User conversion to'  >>${_dot_file}
  669.     echo ' * graphic form by hand or program required.'  >>${_dot_file}
  670.     echo ' *'                              >>${_dot_file}
  671.  
  672.     if [ ${#auth_chain[@]} -gt 0 ]
  673.     then
  674.       echo >>${_dot_file}
  675.       echo '# Authority ref. edges followed & field source.' >>${_dot_file}
  676.         dump_to_dot auth_chain AC
  677.     fi
  678.  
  679.     if [ ${#ref_chain[@]} -gt 0 ]
  680.     then
  681.         echo >>${_dot_file}
  682.         echo '# Name ref. edges followed and field source.' >>${_dot_file}
  683.         dump_to_dot ref_chain RC
  684.     fi
  685.  
  686.     if [ ${#name_address[@]} -gt 0 ]
  687.     then
  688.         echo >>${_dot_file}
  689.         echo '# Known name->address edges' >>${_dot_file}
  690.         dump_to_dot name_address NA
  691.     fi
  692.  
  693.     if [ ${#name_srvc[@]} -gt 0 ]
  694.     then
  695.         echo >>${_dot_file}
  696.         echo '# Known name->service edges' >>${_dot_file}
  697.         dump_to_dot name_srvc NS
  698.     fi
  699.  
  700.     if [ ${#name_resource[@]} -gt 0 ]
  701.     then
  702.         echo >>${_dot_file}
  703.         echo '# Known name->resource edges' >>${_dot_file}
  704.         dump_to_dot name_resource NR
  705.     fi
  706.  
  707.     if [ ${#parent_child[@]} -gt 0 ]
  708.     then
  709.         echo >>${_dot_file}
  710.         echo '# Known parent->child edges' >>${_dot_file}
  711.         dump_to_dot parent_child PC
  712.     fi
  713.  
  714.     if [ ${#list_server[@]} -gt 0 ]
  715.     then
  716.         echo >>${_dot_file}
  717.         echo '# Known Blacklist nodes' >>${_dot_file}
  718.         _dd_cnt=${#list_server[@]}
  719.         for (( _dd = 0 ; _dd < _dd_cnt ; _dd++ ))
  720.         do
  721.             printf '    LS%04u [label="%s"] ;\n' \
  722.                    "${_dd}" "${list_server[${_dd}]}" >>${_dot_file}
  723.         done
  724.     fi
  725.  
  726.     unique_lines address_hits address_hits
  727.     if [ ${#address_hits[@]} -gt 0 ]
  728.     then
  729.       echo >>${_dot_file}
  730.       echo '# Known address->Blacklist_hit edges' >>${_dot_file}
  731.       echo '# CAUTION: dig warnings can trigger false hits.' >>${_dot_file}
  732.        dump_to_dot address_hits AH
  733.     fi
  734.     echo          >>${_dot_file}
  735.     echo ' *'     >>${_dot_file}
  736.     echo ' * That is a lot of relationships. Happy graphing.' >>${_dot_file}
  737.     echo ' */'    >>${_dot_file}
  738.     echo '}'      >>${_dot_file}
  739.     return 0
  740. }
  741.  
  742. # # # # 'Hunt the Spammer' execution flow # # # #
  743.  
  744. #  Execution trace is enabled by setting the
  745. #+ environment variable SPAMMER_TRACE to the name of a writable file.
  746. declare -a _trace_log
  747. declare _log_file
  748.  
  749. # Function to fill the trace log
  750. trace_logger() {
  751.     _trace_log[${#_trace_log[@]}]=${_pend_current_}
  752. }
  753.  
  754. # Dump trace log to file function variable.
  755. declare -f _log_dump
  756. _log_dump=pend_dummy   # Initially a no-op.
  757.  
  758. # Dump the trace log to a file.
  759. dump_log() {
  760.     local -i _dl_cnt
  761.     _dl_cnt=${#_trace_log[@]}
  762.     for (( _dl = 0 ; _dl < _dl_cnt ; _dl++ ))
  763.     do
  764.         echo ${_trace_log[${_dl}]} >> ${_log_file}
  765.     done
  766.     _dl_cnt=${#_pending_[@]}
  767.     if [ ${_dl_cnt} -gt 0 ]
  768.     then
  769.         _dl_cnt=${_dl_cnt}-1
  770.         echo '# # # Operations stack not empty # # #' >> ${_log_file}
  771.         for (( _dl = ${_dl_cnt} ; _dl >= 0 ; _dl-- ))
  772.         do
  773.             echo ${_pending_[${_dl}]} >> ${_log_file}
  774.         done
  775.     fi
  776. }
  777.  
  778. # # # Utility program 'dig' wrappers # # #
  779. #
  780. #  These wrappers are derived from the
  781. #+ examples shown in dig_wrappers.bash.
  782. #
  783. #  The major difference is these return
  784. #+ their results as a list in an array.
  785. #
  786. #  See dig_wrappers.bash for details and
  787. #+ use that script to develop any changes.
  788. #
  789. # # #
  790.  
  791. # Short form answer: 'dig' parses answer.
  792.  
  793. # Forward lookup :: Name -> Address
  794. # short_fwd <domain_name> <array_name>
  795. short_fwd() {
  796.     local -a _sf_reply
  797.     local -i _sf_rc
  798.     local -i _sf_cnt
  799.     IFS=${NO_WSP}
  800. echo -n '.'
  801. # echo 'sfwd: '${1}
  802.   _sf_reply=( $(dig +short ${1} -c in -t a 2>/dev/null) )
  803.   _sf_rc=$?
  804.   if [ ${_sf_rc} -ne 0 ]
  805.   then
  806.     _trace_log[${#_trace_log[@]}]='## Lookup error '${_sf_rc}' on '${1}' ##'
  807. # [ ${_sf_rc} -ne 9 ] && pend_drop
  808.         return ${_sf_rc}
  809.     else
  810.         # Some versions of 'dig' return warnings on stdout.
  811.         _sf_cnt=${#_sf_reply[@]}
  812.         for (( _sf = 0 ; _sf < ${_sf_cnt} ; _sf++ ))
  813.         do
  814.             [ 'x'${_sf_reply[${_sf}]:0:2} == 'x;;' ] &&
  815.                 unset _sf_reply[${_sf}]
  816.         done
  817.         eval $2=\( \$\{_sf_reply\[@\]\} \)
  818.     fi
  819.     return 0
  820. }
  821.  
  822. # Reverse lookup :: Address -> Name
  823. # short_rev <ip_address> <array_name>
  824. short_rev() {
  825.     local -a _sr_reply
  826.     local -i _sr_rc
  827.     local -i _sr_cnt
  828.     IFS=${NO_WSP}
  829. echo -n '.'
  830. # echo 'srev: '${1}
  831.   _sr_reply=( $(dig +short -x ${1} 2>/dev/null) )
  832.   _sr_rc=$?
  833.   if [ ${_sr_rc} -ne 0 ]
  834.   then
  835.     _trace_log[${#_trace_log[@]}]='## Lookup error '${_sr_rc}' on '${1}' ##'
  836. # [ ${_sr_rc} -ne 9 ] && pend_drop
  837.         return ${_sr_rc}
  838.     else
  839.         # Some versions of 'dig' return warnings on stdout.
  840.         _sr_cnt=${#_sr_reply[@]}
  841.         for (( _sr = 0 ; _sr < ${_sr_cnt} ; _sr++ ))
  842.         do
  843.             [ 'x'${_sr_reply[${_sr}]:0:2} == 'x;;' ] &&
  844.                 unset _sr_reply[${_sr}]
  845.         done
  846.         eval $2=\( \$\{_sr_reply\[@\]\} \)
  847.     fi
  848.     return 0
  849. }
  850.  
  851. # Special format lookup used to query blacklist servers.
  852. # short_text <ip_address> <array_name>
  853. short_text() {
  854.     local -a _st_reply
  855.     local -i _st_rc
  856.     local -i _st_cnt
  857.     IFS=${NO_WSP}
  858. # echo 'stxt: '${1}
  859.   _st_reply=( $(dig +short ${1} -c in -t txt 2>/dev/null) )
  860.   _st_rc=$?
  861.   if [ ${_st_rc} -ne 0 ]
  862.   then
  863.     _trace_log[${#_trace_log[@]}]='##Text lookup error '${_st_rc}' on '${1}'##'
  864. # [ ${_st_rc} -ne 9 ] && pend_drop
  865.         return ${_st_rc}
  866.     else
  867.         # Some versions of 'dig' return warnings on stdout.
  868.         _st_cnt=${#_st_reply[@]}
  869.         for (( _st = 0 ; _st < ${#_st_cnt} ; _st++ ))
  870.         do
  871.             [ 'x'${_st_reply[${_st}]:0:2} == 'x;;' ] &&
  872.                 unset _st_reply[${_st}]
  873.         done
  874.         eval $2=\( \$\{_st_reply\[@\]\} \)
  875.     fi
  876.     return 0
  877. }
  878.  
  879. # The long forms, a.k.a., the parse it yourself versions
  880.  
  881. # RFC 2782   Service lookups
  882. # dig +noall +nofail +answer _ldap._tcp.openldap.org -t srv
  883. # _<service>._<protocol>.<domain_name>
  884. # _ldap._tcp.openldap.org. 3600   IN     SRV    0 0 389 ldap.openldap.org.
  885. # domain TTL Class SRV Priority Weight Port Target
  886.  
  887. # Forward lookup :: Name -> poor man's zone transfer
  888. # long_fwd <domain_name> <array_name>
  889. long_fwd() {
  890.     local -a _lf_reply
  891.     local -i _lf_rc
  892.     local -i _lf_cnt
  893.     IFS=${NO_WSP}
  894. echo -n ':'
  895. # echo 'lfwd: '${1}
  896.   _lf_reply=( $(
  897.      dig +noall +nofail +answer +authority +additional \
  898.          ${1} -t soa ${1} -t mx ${1} -t any 2>/dev/null) )
  899.   _lf_rc=$?
  900.   if [ ${_lf_rc} -ne 0 ]
  901.   then
  902.     _trace_log[${#_trace_log[@]}]='# Zone lookup err '${_lf_rc}' on '${1}' #'
  903. # [ ${_lf_rc} -ne 9 ] && pend_drop
  904.         return ${_lf_rc}
  905.     else
  906.         # Some versions of 'dig' return warnings on stdout.
  907.         _lf_cnt=${#_lf_reply[@]}
  908.         for (( _lf = 0 ; _lf < ${_lf_cnt} ; _lf++ ))
  909.         do
  910.             [ 'x'${_lf_reply[${_lf}]:0:2} == 'x;;' ] &&
  911.                 unset _lf_reply[${_lf}]
  912.         done
  913.         eval $2=\( \$\{_lf_reply\[@\]\} \)
  914.     fi
  915.     return 0
  916. }
  917. #  The reverse lookup domain name corresponding to the IPv6 address:
  918. #      4321:0:1:2:3:4:567:89ab
  919. #  would be (nibble, I.E: Hexdigit) reversed:
  920. #  b.a.9.8.7.6.5.0.4.0.0.0.3.0.0.0.2.0.0.0.1.0.0.0.0.0.0.0.1.2.3.4.IP6.ARPA.
  921.  
  922. # Reverse lookup :: Address -> poor man's delegation chain
  923. # long_rev <rev_ip_address> <array_name>
  924. long_rev() {
  925.     local -a _lr_reply
  926.     local -i _lr_rc
  927.     local -i _lr_cnt
  928.     local _lr_dns
  929.     _lr_dns=${1}'.in-addr.arpa.'
  930.     IFS=${NO_WSP}
  931. echo -n ':'
  932. # echo 'lrev: '${1}
  933.   _lr_reply=( $(
  934.        dig +noall +nofail +answer +authority +additional \
  935.            ${_lr_dns} -t soa ${_lr_dns} -t any 2>/dev/null) )
  936.   _lr_rc=$?
  937.   if [ ${_lr_rc} -ne 0 ]
  938.   then
  939.     _trace_log[${#_trace_log[@]}]='# Deleg lkp error '${_lr_rc}' on '${1}' #'
  940. # [ ${_lr_rc} -ne 9 ] && pend_drop
  941.         return ${_lr_rc}
  942.     else
  943.         # Some versions of 'dig' return warnings on stdout.
  944.         _lr_cnt=${#_lr_reply[@]}
  945.         for (( _lr = 0 ; _lr < ${_lr_cnt} ; _lr++ ))
  946.         do
  947.             [ 'x'${_lr_reply[${_lr}]:0:2} == 'x;;' ] &&
  948.                 unset _lr_reply[${_lr}]
  949.         done
  950.         eval $2=\( \$\{_lr_reply\[@\]\} \)
  951.     fi
  952.     return 0
  953. }
  954.  
  955. # # # Application specific functions # # #
  956.  
  957. # Mung a possible name; suppresses root and TLDs.
  958. # name_fixup <string>
  959. name_fixup(){
  960.     local -a _nf_tmp
  961.     local -i _nf_end
  962.     local _nf_str
  963.     local IFS
  964.     _nf_str=$(to_lower ${1})
  965.     _nf_str=$(to_dot ${_nf_str})
  966.     _nf_end=${#_nf_str}-1
  967.     [ ${_nf_str:${_nf_end}} != '.' ] &&
  968.         _nf_str=${_nf_str}'.'
  969.     IFS=${ADR_IFS}
  970.     _nf_tmp=( ${_nf_str} )
  971.     IFS=${WSP_IFS}
  972.     _nf_end=${#_nf_tmp[@]}
  973.     case ${_nf_end} in
  974.     0) # No dots, only dots.
  975.         echo
  976.         return 1
  977.     ;;
  978.     1) # Only a TLD.
  979.         echo
  980.         return 1
  981.     ;;
  982.     2) # Maybe okay.
  983.        echo ${_nf_str}
  984.        return 0
  985.        # Needs a lookup table?
  986.        if [ ${#_nf_tmp[1]} -eq 2 ]
  987.        then # Country coded TLD.
  988.            echo
  989.            return 1
  990.        else
  991.            echo ${_nf_str}
  992.            return 0
  993.        fi
  994.     ;;
  995.     esac
  996.     echo ${_nf_str}
  997.     return 0
  998. }
  999.  
  1000. # Grope and mung original input(s).
  1001. split_input() {
  1002.     [ ${#uc_name[@]} -gt 0 ] || return 0
  1003.     local -i _si_cnt
  1004.     local -i _si_len
  1005.     local _si_str
  1006.     unique_lines uc_name uc_name
  1007.     _si_cnt=${#uc_name[@]}
  1008.     for (( _si = 0 ; _si < _si_cnt ; _si++ ))
  1009.     do
  1010.         _si_str=${uc_name[$_si]}
  1011.         if is_address ${_si_str}
  1012.         then
  1013.             uc_address[${#uc_address[@]}]=${_si_str}
  1014.             unset uc_name[$_si]
  1015.         else
  1016.             if ! uc_name[$_si]=$(name_fixup ${_si_str})
  1017.             then
  1018.                 unset ucname[$_si]
  1019.             fi
  1020.         fi
  1021.     done
  1022.   uc_name=( ${uc_name[@]} )
  1023.   _si_cnt=${#uc_name[@]}
  1024.   _trace_log[${#_trace_log[@]}]='#Input '${_si_cnt}' unchkd name input(s).#'
  1025.   _si_cnt=${#uc_address[@]}
  1026.   _trace_log[${#_trace_log[@]}]='#Input '${_si_cnt}' unchkd addr input(s).#'
  1027.     return 0
  1028. }
  1029.  
  1030. # # # Discovery functions -- recursively interlocked by external data # # #
  1031. # # # The leading 'if list is empty; return 0' in each is required. # # #
  1032.  
  1033. # Recursion limiter
  1034. # limit_chk() <next_level>
  1035. limit_chk() {
  1036.     local -i _lc_lmt
  1037.     # Check indirection limit.
  1038.     if [ ${indirect} -eq 0 ] || [ $# -eq 0 ]
  1039.     then
  1040.         # The 'do-forever' choice
  1041.         echo 1                 # Any value will do.
  1042.         return 0               # OK to continue.
  1043.     else
  1044.         # Limiting is in effect.
  1045.         if [ ${indirect} -lt ${1} ]
  1046.         then
  1047.             echo ${1}          # Whatever.
  1048.             return 1           # Stop here.
  1049.         else
  1050.             _lc_lmt=${1}+1     # Bump the given limit.
  1051.             echo ${_lc_lmt}    # Echo it.
  1052.             return 0           # OK to continue.
  1053.         fi
  1054.     fi
  1055. }
  1056.  
  1057. # For each name in uc_name:
  1058. #     Move name to chk_name.
  1059. #     Add addresses to uc_address.
  1060. #     Pend expand_input_address.
  1061. #     Repeat until nothing new found.
  1062. # expand_input_name <indirection_limit>
  1063. expand_input_name() {
  1064.     [ ${#uc_name[@]} -gt 0 ] || return 0
  1065.     local -a _ein_addr
  1066.     local -a _ein_new
  1067.     local -i _ucn_cnt
  1068.     local -i _ein_cnt
  1069.     local _ein_tst
  1070.     _ucn_cnt=${#uc_name[@]}
  1071.  
  1072.     if  ! _ein_cnt=$(limit_chk ${1})
  1073.     then
  1074.         return 0
  1075.     fi
  1076.  
  1077.     for (( _ein = 0 ; _ein < _ucn_cnt ; _ein++ ))
  1078.     do
  1079.         if short_fwd ${uc_name[${_ein}]} _ein_new
  1080.         then
  1081.           for (( _ein_cnt = 0 ; _ein_cnt < ${#_ein_new[@]}; _ein_cnt++ ))
  1082.           do
  1083.               _ein_tst=${_ein_new[${_ein_cnt}]}
  1084.               if is_address ${_ein_tst}
  1085.               then
  1086.                   _ein_addr[${#_ein_addr[@]}]=${_ein_tst}
  1087.               fi
  1088.     done
  1089.         fi
  1090.     done
  1091.     unique_lines _ein_addr _ein_addr     # Scrub duplicates.
  1092.     edit_exact chk_address _ein_addr     # Scrub pending detail.
  1093.     edit_exact known_address _ein_addr   # Scrub already detailed.
  1094.  if [ ${#_ein_addr[@]} -gt 0 ]        # Anything new?
  1095.  then
  1096.    uc_address=( ${uc_address[@]} ${_ein_addr[@]} )
  1097.    pend_func expand_input_address ${1}
  1098.    _trace_log[${#_trace_log[@]}]='#Add '${#_ein_addr[@]}' unchkd addr inp.#'
  1099.     fi
  1100.     edit_exact chk_name uc_name          # Scrub pending detail.
  1101.     edit_exact known_name uc_name        # Scrub already detailed.
  1102.     if [ ${#uc_name[@]} -gt 0 ]
  1103.     then
  1104.         chk_name=( ${chk_name[@]} ${uc_name[@]}  )
  1105.         pend_func detail_each_name ${1}
  1106.     fi
  1107.     unset uc_name[@]
  1108.     return 0
  1109. }
  1110.  
  1111. # For each address in uc_address:
  1112. #     Move address to chk_address.
  1113. #     Add names to uc_name.
  1114. #     Pend expand_input_name.
  1115. #     Repeat until nothing new found.
  1116. # expand_input_address <indirection_limit>
  1117. expand_input_address() {
  1118.     [ ${#uc_address[@]} -gt 0 ] || return 0
  1119.     local -a _eia_addr
  1120.     local -a _eia_name
  1121.     local -a _eia_new
  1122.     local -i _uca_cnt
  1123.     local -i _eia_cnt
  1124.     local _eia_tst
  1125.     unique_lines uc_address _eia_addr
  1126.     unset uc_address[@]
  1127.     edit_exact been_there_addr _eia_addr
  1128.     _uca_cnt=${#_eia_addr[@]}
  1129.     [ ${_uca_cnt} -gt 0 ] &&
  1130.         been_there_addr=( ${been_there_addr[@]} ${_eia_addr[@]} )
  1131.  
  1132.     for (( _eia = 0 ; _eia < _uca_cnt ; _eia++ ))
  1133.      do
  1134.        if short_rev ${_eia_addr[${_eia}]} _eia_new
  1135.        then
  1136.          for (( _eia_cnt = 0 ; _eia_cnt < ${#_eia_new[@]} ; _eia_cnt++ ))
  1137.          do
  1138.            _eia_tst=${_eia_new[${_eia_cnt}]}
  1139.            if _eia_tst=$(name_fixup ${_eia_tst})
  1140.            then
  1141.              _eia_name[${#_eia_name[@]}]=${_eia_tst}
  1142.        fi
  1143.      done
  1144.            fi
  1145.     done
  1146.     unique_lines _eia_name _eia_name     # Scrub duplicates.
  1147.     edit_exact chk_name _eia_name        # Scrub pending detail.
  1148.     edit_exact known_name _eia_name      # Scrub already detailed.
  1149.  if [ ${#_eia_name[@]} -gt 0 ]        # Anything new?
  1150.  then
  1151.    uc_name=( ${uc_name[@]} ${_eia_name[@]} )
  1152.    pend_func expand_input_name ${1}
  1153.    _trace_log[${#_trace_log[@]}]='#Add '${#_eia_name[@]}' unchkd name inp.#'
  1154.     fi
  1155.     edit_exact chk_address _eia_addr     # Scrub pending detail.
  1156.     edit_exact known_address _eia_addr   # Scrub already detailed.
  1157.     if [ ${#_eia_addr[@]} -gt 0 ]        # Anything new?
  1158.     then
  1159.         chk_address=( ${chk_address[@]} ${_eia_addr[@]} )
  1160.         pend_func detail_each_address ${1}
  1161.     fi
  1162.     return 0
  1163. }
  1164.  
  1165. # The parse-it-yourself zone reply.
  1166. # The input is the chk_name list.
  1167. # detail_each_name <indirection_limit>
  1168. detail_each_name() {
  1169.     [ ${#chk_name[@]} -gt 0 ] || return 0
  1170.     local -a _den_chk       # Names to check
  1171.     local -a _den_name      # Names found here
  1172.     local -a _den_address   # Addresses found here
  1173.     local -a _den_pair      # Pairs found here
  1174.     local -a _den_rev       # Reverse pairs found here
  1175.     local -a _den_tmp       # Line being parsed
  1176.     local -a _den_auth      # SOA contact being parsed
  1177.     local -a _den_new       # The zone reply
  1178.     local -a _den_pc        # Parent-Child gets big fast
  1179.     local -a _den_ref       # So does reference chain
  1180.     local -a _den_nr        # Name-Resource can be big
  1181.     local -a _den_na        # Name-Address
  1182.     local -a _den_ns        # Name-Service
  1183.     local -a _den_achn      # Chain of Authority
  1184.     local -i _den_cnt       # Count of names to detail
  1185.     local -i _den_lmt       # Indirection limit
  1186.     local _den_who          # Named being processed
  1187.     local _den_rec          # Record type being processed
  1188.     local _den_cont         # Contact domain
  1189.     local _den_str          # Fixed up name string
  1190.     local _den_str2         # Fixed up reverse
  1191.     local IFS=${WSP_IFS}
  1192.  
  1193.     # Local, unique copy of names to check
  1194.     unique_lines chk_name _den_chk
  1195.     unset chk_name[@]       # Done with globals.
  1196.  
  1197.     # Less any names already known
  1198.     edit_exact known_name _den_chk
  1199.     _den_cnt=${#_den_chk[@]}
  1200.  
  1201.     # If anything left, add to known_name.
  1202.     [ ${_den_cnt} -gt 0 ] &&
  1203.         known_name=( ${known_name[@]} ${_den_chk[@]} )
  1204.  
  1205.     # for the list of (previously) unknown names . . .
  1206.     for (( _den = 0 ; _den < _den_cnt ; _den++ ))
  1207.     do
  1208.         _den_who=${_den_chk[${_den}]}
  1209.         if long_fwd ${_den_who} _den_new
  1210.         then
  1211.             unique_lines _den_new _den_new
  1212.             if [ ${#_den_new[@]} -eq 0 ]
  1213.             then
  1214.                 _den_pair[${#_den_pair[@]}]='0.0.0.0 '${_den_who}
  1215.             fi
  1216.  
  1217.             # Parse each line in the reply.
  1218.             for (( _line = 0 ; _line < ${#_den_new[@]} ; _line++ ))
  1219.             do
  1220.                 IFS=${NO_WSP}$'\x09'$'\x20'
  1221.                 _den_tmp=( ${_den_new[${_line}]} )
  1222.                 IFS=${WSP_IFS}
  1223.               # If usable record and not a warning message . . .
  1224.               if [ ${#_den_tmp[@]} -gt 4 ] && [ 'x'${_den_tmp[0]} != 'x;;' ]
  1225.               then
  1226.                     _den_rec=${_den_tmp[3]}
  1227.                     _den_nr[${#_den_nr[@]}]=${_den_who}' '${_den_rec}
  1228.                     # Begin at RFC1033 (+++)
  1229.                     case ${_den_rec} in
  1230.  
  1231. #<name> [<ttl>]  [<class>] SOA <origin> <person>
  1232.                     SOA) # Start Of Authority
  1233.     if _den_str=$(name_fixup ${_den_tmp[0]})
  1234.     then
  1235.       _den_name[${#_den_name[@]}]=${_den_str}
  1236.       _den_achn[${#_den_achn[@]}]=${_den_who}' '${_den_str}' SOA'
  1237.       # SOA origin -- domain name of master zone record
  1238.       if _den_str2=$(name_fixup ${_den_tmp[4]})
  1239.       then
  1240.         _den_name[${#_den_name[@]}]=${_den_str2}
  1241.         _den_achn[${#_den_achn[@]}]=${_den_who}' '${_den_str2}' SOA.O'
  1242.       fi
  1243.       # Responsible party e-mail address (possibly bogus).
  1244.       # Possibility of first.last@domain.name ignored.
  1245.       set -f
  1246.       if _den_str2=$(name_fixup ${_den_tmp[5]})
  1247.       then
  1248.         IFS=${ADR_IFS}
  1249.         _den_auth=( ${_den_str2} )
  1250.         IFS=${WSP_IFS}
  1251.         if [ ${#_den_auth[@]} -gt 2 ]
  1252.         then
  1253.           _den_cont=${_den_auth[1]}
  1254.           for (( _auth = 2 ; _auth < ${#_den_auth[@]} ; _auth++ ))
  1255.           do
  1256.             _den_cont=${_den_cont}'.'${_den_auth[${_auth}]}
  1257.           done
  1258.           _den_name[${#_den_name[@]}]=${_den_cont}'.'
  1259.           _den_achn[${#_den_achn[@]}]=${_den_who}' '${_den_cont}'. SOA.C'
  1260.                                 fi
  1261.         fi
  1262.         set +f
  1263.                         fi
  1264.                     ;;
  1265.  
  1266.  
  1267.       A) # IP(v4) Address Record
  1268.       if _den_str=$(name_fixup ${_den_tmp[0]})
  1269.       then
  1270.         _den_name[${#_den_name[@]}]=${_den_str}
  1271.         _den_pair[${#_den_pair[@]}]=${_den_tmp[4]}' '${_den_str}
  1272.         _den_na[${#_den_na[@]}]=${_den_str}' '${_den_tmp[4]}
  1273.         _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' A'
  1274.       else
  1275.         _den_pair[${#_den_pair[@]}]=${_den_tmp[4]}' unknown.domain'
  1276.         _den_na[${#_den_na[@]}]='unknown.domain '${_den_tmp[4]}
  1277.         _den_ref[${#_den_ref[@]}]=${_den_who}' unknown.domain A'
  1278.       fi
  1279.       _den_address[${#_den_address[@]}]=${_den_tmp[4]}
  1280.       _den_pc[${#_den_pc[@]}]=${_den_who}' '${_den_tmp[4]}
  1281.              ;;
  1282.  
  1283.              NS) # Name Server Record
  1284.              # Domain name being serviced (may be other than current)
  1285.                if _den_str=$(name_fixup ${_den_tmp[0]})
  1286.                  then
  1287.                    _den_name[${#_den_name[@]}]=${_den_str}
  1288.                    _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' NS'
  1289.  
  1290.              # Domain name of service provider
  1291.              if _den_str2=$(name_fixup ${_den_tmp[4]})
  1292.              then
  1293.                _den_name[${#_den_name[@]}]=${_den_str2}
  1294.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str2}' NSH'
  1295.                _den_ns[${#_den_ns[@]}]=${_den_str2}' NS'
  1296.                _den_pc[${#_den_pc[@]}]=${_den_str}' '${_den_str2}
  1297.               fi
  1298.                fi
  1299.                     ;;
  1300.  
  1301.              MX) # Mail Server Record
  1302.                  # Domain name being serviced (wildcards not handled here)
  1303.              if _den_str=$(name_fixup ${_den_tmp[0]})
  1304.              then
  1305.                _den_name[${#_den_name[@]}]=${_den_str}
  1306.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' MX'
  1307.              fi
  1308.              # Domain name of service provider
  1309.              if _den_str=$(name_fixup ${_den_tmp[5]})
  1310.              then
  1311.                _den_name[${#_den_name[@]}]=${_den_str}
  1312.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' MXH'
  1313.                _den_ns[${#_den_ns[@]}]=${_den_str}' MX'
  1314.                _den_pc[${#_den_pc[@]}]=${_den_who}' '${_den_str}
  1315.              fi
  1316.                     ;;
  1317.  
  1318.              PTR) # Reverse address record
  1319.                   # Special name
  1320.              if _den_str=$(name_fixup ${_den_tmp[0]})
  1321.              then
  1322.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' PTR'
  1323.                # Host name (not a CNAME)
  1324.                if _den_str2=$(name_fixup ${_den_tmp[4]})
  1325.                then
  1326.                  _den_rev[${#_den_rev[@]}]=${_den_str}' '${_den_str2}
  1327.                  _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str2}' PTRH'
  1328.                  _den_pc[${#_den_pc[@]}]=${_den_who}' '${_den_str}
  1329.                fi
  1330.              fi
  1331.                     ;;
  1332.  
  1333.              AAAA) # IP(v6) Address Record
  1334.              if _den_str=$(name_fixup ${_den_tmp[0]})
  1335.              then
  1336.                _den_name[${#_den_name[@]}]=${_den_str}
  1337.                _den_pair[${#_den_pair[@]}]=${_den_tmp[4]}' '${_den_str}
  1338.                _den_na[${#_den_na[@]}]=${_den_str}' '${_den_tmp[4]}
  1339.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' AAAA'
  1340.                else
  1341.                  _den_pair[${#_den_pair[@]}]=${_den_tmp[4]}' unknown.domain'
  1342.                  _den_na[${#_den_na[@]}]='unknown.domain '${_den_tmp[4]}
  1343.                  _den_ref[${#_den_ref[@]}]=${_den_who}' unknown.domain'
  1344.                fi
  1345.                # No processing for IPv6 addresses
  1346.                _den_pc[${#_den_pc[@]}]=${_den_who}' '${_den_tmp[4]}
  1347.                     ;;
  1348.  
  1349.              CNAME) # Alias name record
  1350.                     # Nickname
  1351.              if _den_str=$(name_fixup ${_den_tmp[0]})
  1352.              then
  1353.                _den_name[${#_den_name[@]}]=${_den_str}
  1354.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' CNAME'
  1355.                _den_pc[${#_den_pc[@]}]=${_den_who}' '${_den_str}
  1356.              fi
  1357.                     # Hostname
  1358.              if _den_str=$(name_fixup ${_den_tmp[4]})
  1359.              then
  1360.                _den_name[${#_den_name[@]}]=${_den_str}
  1361.                _den_ref[${#_den_ref[@]}]=${_den_who}' '${_den_str}' CHOST'
  1362.                _den_pc[${#_den_pc[@]}]=${_den_who}' '${_den_str}
  1363.              fi
  1364.                     ;;
  1365. #            TXT)
  1366. #            ;;
  1367.                     esac
  1368.                 fi
  1369.             done
  1370.         else # Lookup error == 'A' record 'unknown address'
  1371.             _den_pair[${#_den_pair[@]}]='0.0.0.0 '${_den_who}
  1372.         fi
  1373.     done
  1374.  
  1375.     # Control dot array growth.
  1376.     unique_lines _den_achn _den_achn      # Works best, all the same.
  1377.     edit_exact auth_chain _den_achn       # Works best, unique items.
  1378.     if [ ${#_den_achn[@]} -gt 0 ]
  1379.     then
  1380.         IFS=${NO_WSP}
  1381.         auth_chain=( ${auth_chain[@]} ${_den_achn[@]} )
  1382.         IFS=${WSP_IFS}
  1383.     fi
  1384.  
  1385.     unique_lines _den_ref _den_ref      # Works best, all the same.
  1386.     edit_exact ref_chain _den_ref       # Works best, unique items.
  1387.     if [ ${#_den_ref[@]} -gt 0 ]
  1388.     then
  1389.         IFS=${NO_WSP}
  1390.         ref_chain=( ${ref_chain[@]} ${_den_ref[@]} )
  1391.         IFS=${WSP_IFS}
  1392.     fi
  1393.  
  1394.     unique_lines _den_na _den_na
  1395.     edit_exact name_address _den_na
  1396.     if [ ${#_den_na[@]} -gt 0 ]
  1397.     then
  1398.         IFS=${NO_WSP}
  1399.         name_address=( ${name_address[@]} ${_den_na[@]} )
  1400.         IFS=${WSP_IFS}
  1401.     fi
  1402.  
  1403.     unique_lines _den_ns _den_ns
  1404.     edit_exact name_srvc _den_ns
  1405.     if [ ${#_den_ns[@]} -gt 0 ]
  1406.     then
  1407.         IFS=${NO_WSP}
  1408.         name_srvc=( ${name_srvc[@]} ${_den_ns[@]} )
  1409.         IFS=${WSP_IFS}
  1410.     fi
  1411.  
  1412.     unique_lines _den_nr _den_nr
  1413.     edit_exact name_resource _den_nr
  1414.     if [ ${#_den_nr[@]} -gt 0 ]
  1415.     then
  1416.         IFS=${NO_WSP}
  1417.         name_resource=( ${name_resource[@]} ${_den_nr[@]} )
  1418.         IFS=${WSP_IFS}
  1419.     fi
  1420.  
  1421.     unique_lines _den_pc _den_pc
  1422.     edit_exact parent_child _den_pc
  1423.     if [ ${#_den_pc[@]} -gt 0 ]
  1424.     then
  1425.         IFS=${NO_WSP}
  1426.         parent_child=( ${parent_child[@]} ${_den_pc[@]} )
  1427.         IFS=${WSP_IFS}
  1428.     fi
  1429.  
  1430.     # Update list known_pair (Address and Name).
  1431.     unique_lines _den_pair _den_pair
  1432.     edit_exact known_pair _den_pair
  1433.     if [ ${#_den_pair[@]} -gt 0 ]  # Anything new?
  1434.     then
  1435.         IFS=${NO_WSP}
  1436.         known_pair=( ${known_pair[@]} ${_den_pair[@]} )
  1437.         IFS=${WSP_IFS}
  1438.     fi
  1439.  
  1440.     # Update list of reverse pairs.
  1441.     unique_lines _den_rev _den_rev
  1442.     edit_exact reverse_pair _den_rev
  1443.     if [ ${#_den_rev[@]} -gt 0 ]   # Anything new?
  1444.     then
  1445.         IFS=${NO_WSP}
  1446.         reverse_pair=( ${reverse_pair[@]} ${_den_rev[@]} )
  1447.         IFS=${WSP_IFS}
  1448.     fi
  1449.  
  1450.     # Check indirection limit -- give up if reached.
  1451.     if ! _den_lmt=$(limit_chk ${1})
  1452.     then
  1453.         return 0
  1454.     fi
  1455.  
  1456. # Execution engine is LIFO. Order of pend operations is important.
  1457. # Did we define any new addresses?
  1458. unique_lines _den_address _den_address    # Scrub duplicates.
  1459. edit_exact known_address _den_address     # Scrub already processed.
  1460. edit_exact un_address _den_address        # Scrub already waiting.
  1461. if [ ${#_den_address[@]} -gt 0 ]          # Anything new?
  1462. then
  1463.   uc_address=( ${uc_address[@]} ${_den_address[@]} )
  1464.   pend_func expand_input_address ${_den_lmt}
  1465.   _trace_log[${#_trace_log[@]}]='# Add '${#_den_address[@]}' unchkd addr. #'
  1466.     fi
  1467.  
  1468. # Did we find any new names?
  1469. unique_lines _den_name _den_name          # Scrub duplicates.
  1470. edit_exact known_name _den_name           # Scrub already processed.
  1471. edit_exact uc_name _den_name              # Scrub already waiting.
  1472. if [ ${#_den_name[@]} -gt 0 ]             # Anything new?
  1473. then
  1474.   uc_name=( ${uc_name[@]} ${_den_name[@]} )
  1475.   pend_func expand_input_name ${_den_lmt}
  1476.   _trace_log[${#_trace_log[@]}]='#Added '${#_den_name[@]}' unchkd name#'
  1477.     fi
  1478.     return 0
  1479. }
  1480.  
  1481. # The parse-it-yourself delegation reply
  1482. # Input is the chk_address list.
  1483. # detail_each_address <indirection_limit>
  1484. detail_each_address() {
  1485.     [ ${#chk_address[@]} -gt 0 ] || return 0
  1486.     unique_lines chk_address chk_address
  1487.     edit_exact known_address chk_address
  1488.     if [ ${#chk_address[@]} -gt 0 ]
  1489.     then
  1490.         known_address=( ${known_address[@]} ${chk_address[@]} )
  1491.         unset chk_address[@]
  1492.     fi
  1493.     return 0
  1494. }
  1495.  
  1496. # # # Application specific output functions # # #
  1497.  
  1498. # Pretty print the known pairs.
  1499. report_pairs() {
  1500.     echo
  1501.     echo 'Known network pairs.'
  1502.     col_print known_pair 2 5 30
  1503.  
  1504.     if [ ${#auth_chain[@]} -gt 0 ]
  1505.     then
  1506.         echo
  1507.         echo 'Known chain of authority.'
  1508.         col_print auth_chain 2 5 30 55
  1509.     fi
  1510.  
  1511.     if [ ${#reverse_pair[@]} -gt 0 ]
  1512.     then
  1513.         echo
  1514.         echo 'Known reverse pairs.'
  1515.         col_print reverse_pair 2 5 55
  1516.     fi
  1517.     return 0
  1518. }
  1519.  
  1520. # Check an address against the list of blacklist servers.
  1521. # A good place to capture for GraphViz: address->status(server(reports))
  1522. # check_lists <ip_address>
  1523. check_lists() {
  1524.     [ $# -eq 1 ] || return 1
  1525.     local -a _cl_fwd_addr
  1526.     local -a _cl_rev_addr
  1527.     local -a _cl_reply
  1528.     local -i _cl_rc
  1529.     local -i _ls_cnt
  1530.     local _cl_dns_addr
  1531.     local _cl_lkup
  1532.  
  1533.     split_ip ${1} _cl_fwd_addr _cl_rev_addr
  1534.     _cl_dns_addr=$(dot_array _cl_rev_addr)'.'
  1535.     _ls_cnt=${#list_server[@]}
  1536.     echo '    Checking address '${1}
  1537.     for (( _cl = 0 ; _cl < _ls_cnt ; _cl++ ))
  1538.     do
  1539.       _cl_lkup=${_cl_dns_addr}${list_server[${_cl}]}
  1540.       if short_text ${_cl_lkup} _cl_reply
  1541.       then
  1542.         if [ ${#_cl_reply[@]} -gt 0 ]
  1543.         then
  1544.           echo '        Records from '${list_server[${_cl}]}
  1545.           address_hits[${#address_hits[@]}]=${1}' '${list_server[${_cl}]}
  1546.           _hs_RC=2
  1547.           for (( _clr = 0 ; _clr < ${#_cl_reply[@]} ; _clr++ ))
  1548.           do
  1549.             echo '            '${_cl_reply[${_clr}]}
  1550.           done
  1551.         fi
  1552.       fi
  1553.     done
  1554.     return 0
  1555. }
  1556.  
  1557. # # # The usual application glue # # #
  1558.  
  1559. # Who did it?
  1560. credits() {
  1561.    echo
  1562.    echo 'Advanced Bash Scripting Guide: is_spammer.bash, v2, 2004-msz'
  1563. }
  1564.  
  1565. # How to use it?
  1566. # (See also, "Quickstart" at end of script.)
  1567. usage() {
  1568.     cat <<-'_usage_statement_'
  1569.     The script is_spammer.bash requires either one or two arguments.
  1570.  
  1571.     arg 1) May be one of:
  1572.         a) A domain name
  1573.         b) An IPv4 address
  1574.         c) The name of a file with any mix of names
  1575.            and addresses, one per line.
  1576.  
  1577.     arg 2) May be one of:
  1578.         a) A Blacklist server domain name
  1579.         b) The name of a file with Blacklist server
  1580.            domain names, one per line.
  1581.         c) If not present, a default list of (free)
  1582.            Blacklist servers is used.
  1583.         d) If a filename of an empty, readable, file
  1584.            is given,
  1585.            Blacklist server lookup is disabled.
  1586.  
  1587.     All script output is written to stdout.
  1588.  
  1589.     Return codes: 0 -> All OK, 1 -> Script failure,
  1590.                   2 -> Something is Blacklisted.
  1591.  
  1592.     Requires the external program 'dig' from the 'bind-9'
  1593.     set of DNS programs.  See: http://www.isc.org
  1594.  
  1595.     The domain name lookup depth limit defaults to 2 levels.
  1596.     Set the environment variable SPAMMER_LIMIT to change.
  1597.     SPAMMER_LIMIT=0 means 'unlimited'
  1598.  
  1599.     Limit may also be set on the command-line.
  1600.     If arg#1 is an integer, the limit is set to that value
  1601.     and then the above argument rules are applied.
  1602.  
  1603.     Setting the environment variable 'SPAMMER_DATA' to a filename
  1604.     will cause the script to write a GraphViz graphic file.
  1605.  
  1606.     For the development version;
  1607.     Setting the environment variable 'SPAMMER_TRACE' to a filename
  1608.     will cause the execution engine to log a function call trace.
  1609.  
  1610. _usage_statement_
  1611. }
  1612.  
  1613. # The default list of Blacklist servers:
  1614. # Many choices, see: http://www.spews.org/lists.html
  1615.  
  1616. declare -a default_servers
  1617. # See: http://www.spamhaus.org (Conservative, well maintained)
  1618. default_servers[0]='sbl-xbl.spamhaus.org'
  1619. # See: http://ordb.org (Open mail relays)
  1620. default_servers[1]='relays.ordb.org'
  1621. # See: http://www.spamcop.net/ (You can report spammers here)
  1622. default_servers[2]='bl.spamcop.net'
  1623. # See: http://www.spews.org (An 'early detect' system)
  1624. default_servers[3]='l2.spews.dnsbl.sorbs.net'
  1625. # See: http://www.dnsbl.us.sorbs.net/using.shtml
  1626. default_servers[4]='dnsbl.sorbs.net'
  1627. # See: http://dsbl.org/usage (Various mail relay lists)
  1628. default_servers[5]='list.dsbl.org'
  1629. default_servers[6]='multihop.dsbl.org'
  1630. default_servers[7]='unconfirmed.dsbl.org'
  1631.  
  1632. # User input argument #1
  1633. setup_input() {
  1634.     if [ -e ${1} ] && [ -r ${1} ]  # Name of readable file
  1635.     then
  1636.         file_to_array ${1} uc_name
  1637.         echo 'Using filename >'${1}'< as input.'
  1638.     else
  1639.         if is_address ${1}          # IP address?
  1640.         then
  1641.             uc_address=( ${1} )
  1642.             echo 'Starting with address >'${1}'<'
  1643.         else                       # Must be a name.
  1644.             uc_name=( ${1} )
  1645.             echo 'Starting with domain name >'${1}'<'
  1646.         fi
  1647.     fi
  1648.     return 0
  1649. }
  1650.  
  1651. # User input argument #2
  1652. setup_servers() {
  1653.     if [ -e ${1} ] && [ -r ${1} ]  # Name of a readable file
  1654.     then
  1655.         file_to_array ${1} list_server
  1656.         echo 'Using filename >'${1}'< as blacklist server list.'
  1657.     else
  1658.         list_server=( ${1} )
  1659.         echo 'Using blacklist server >'${1}'<'
  1660.     fi
  1661.     return 0
  1662. }
  1663.  
  1664. # User environment variable SPAMMER_TRACE
  1665. live_log_die() {
  1666.     if [ ${SPAMMER_TRACE:=} ]    # Wants trace log?
  1667.     then
  1668.         if [ ! -e ${SPAMMER_TRACE} ]
  1669.         then
  1670.             if ! touch ${SPAMMER_TRACE} 2>/dev/null
  1671.             then
  1672.                 pend_func echo $(printf '%q\n' \
  1673.                 'Unable to create log file >'${SPAMMER_TRACE}'<')
  1674.                 pend_release
  1675.                 exit 1
  1676.             fi
  1677.             _log_file=${SPAMMER_TRACE}
  1678.             _pend_hook_=trace_logger
  1679.             _log_dump=dump_log
  1680.         else
  1681.             if [ ! -w ${SPAMMER_TRACE} ]
  1682.             then
  1683.                 pend_func echo $(printf '%q\n' \
  1684.                 'Unable to write log file >'${SPAMMER_TRACE}'<')
  1685.                 pend_release
  1686.                 exit 1
  1687.             fi
  1688.             _log_file=${SPAMMER_TRACE}
  1689.             echo '' > ${_log_file}
  1690.             _pend_hook_=trace_logger
  1691.             _log_dump=dump_log
  1692.         fi
  1693.     fi
  1694.     return 0
  1695. }
  1696.  
  1697. # User environment variable SPAMMER_DATA
  1698. data_capture() {
  1699.     if [ ${SPAMMER_DATA:=} ]    # Wants a data dump?
  1700.     then
  1701.         if [ ! -e ${SPAMMER_DATA} ]
  1702.         then
  1703.             if ! touch ${SPAMMER_DATA} 2>/dev/null
  1704.             then
  1705.                 pend_func echo $(printf '%q]n' \
  1706.                 'Unable to create data output file >'${SPAMMER_DATA}'<')
  1707.                 pend_release
  1708.                 exit 1
  1709.             fi
  1710.             _dot_file=${SPAMMER_DATA}
  1711.             _dot_dump=dump_dot
  1712.         else
  1713.             if [ ! -w ${SPAMMER_DATA} ]
  1714.             then
  1715.                 pend_func echo $(printf '%q\n' \
  1716.                 'Unable to write data output file >'${SPAMMER_DATA}'<')
  1717.                 pend_release
  1718.                 exit 1
  1719.             fi
  1720.             _dot_file=${SPAMMER_DATA}
  1721.             _dot_dump=dump_dot
  1722.         fi
  1723.     fi
  1724.     return 0
  1725. }
  1726.  
  1727. # Grope user specified arguments.
  1728. do_user_args() {
  1729.     if [ $# -gt 0 ] && is_number $1
  1730.     then
  1731.         indirect=$1
  1732.         shift
  1733.     fi
  1734.  
  1735.     case $# in                     # Did user treat us well?
  1736.         1)
  1737.             if ! setup_input $1    # Needs error checking.
  1738.             then
  1739.                 pend_release
  1740.                 $_log_dump
  1741.                 exit 1
  1742.             fi
  1743.             list_server=( ${default_servers[@]} )
  1744.             _list_cnt=${#list_server[@]}
  1745.             echo 'Using default blacklist server list.'
  1746.             echo 'Search depth limit: '${indirect}
  1747.             ;;
  1748.         2)
  1749.             if ! setup_input $1    # Needs error checking.
  1750.             then
  1751.                 pend_release
  1752.                 $_log_dump
  1753.                 exit 1
  1754.             fi
  1755.             if ! setup_servers $2  # Needs error checking.
  1756.             then
  1757.                 pend_release
  1758.                 $_log_dump
  1759.                 exit 1
  1760.             fi
  1761.             echo 'Search depth limit: '${indirect}
  1762.             ;;
  1763.         *)
  1764.             pend_func usage
  1765.             pend_release
  1766.             $_log_dump
  1767.             exit 1
  1768.             ;;
  1769.     esac
  1770.     return 0
  1771. }
  1772.  
  1773. # A general purpose debug tool.
  1774. # list_array <array_name>
  1775. list_array() {
  1776.     [ $# -eq 1 ] || return 1  # One argument required.
  1777.  
  1778.     local -a _la_lines
  1779.     set -f
  1780.     local IFS=${NO_WSP}
  1781.     eval _la_lines=\(\ \$\{$1\[@\]\}\ \)
  1782.     echo
  1783.     echo "Element count "${#_la_lines[@]}" array "${1}
  1784.     local _ln_cnt=${#_la_lines[@]}
  1785.  
  1786.     for (( _i = 0; _i < ${_ln_cnt}; _i++ ))
  1787.     do
  1788.         echo 'Element '$_i' >'${_la_lines[$_i]}'<'
  1789.     done
  1790.     set +f
  1791.     return 0
  1792. }
  1793.  
  1794. # # # 'Hunt the Spammer' program code # # #
  1795. pend_init                               # Ready stack engine.
  1796. pend_func credits                       # Last thing to print.
  1797.  
  1798. # # # Deal with user # # #
  1799. live_log_die                            # Setup debug trace log.
  1800. data_capture                            # Setup data capture file.
  1801. echo
  1802. do_user_args $@
  1803.  
  1804. # # # Haven't exited yet - There is some hope # # #
  1805. # Discovery group - Execution engine is LIFO - pend
  1806. # in reverse order of execution.
  1807. _hs_RC=0                                # Hunt the Spammer return code
  1808. pend_mark
  1809.     pend_func report_pairs              # Report name-address pairs.
  1810.  
  1811.     # The two detail_* are mutually recursive functions.
  1812.     # They also pend expand_* functions as required.
  1813.     # These two (the last of ???) exit the recursion.
  1814.     pend_func detail_each_address       # Get all resources of addresses.
  1815.     pend_func detail_each_name          # Get all resources of names.
  1816.  
  1817.     #  The two expand_* are mutually recursive functions,
  1818.     #+ which pend additional detail_* functions as required.
  1819.     pend_func expand_input_address 1    # Expand input names by address.
  1820.     pend_func expand_input_name 1       # #xpand input addresses by name.
  1821.  
  1822.     # Start with a unique set of names and addresses.
  1823.     pend_func unique_lines uc_address uc_address
  1824.     pend_func unique_lines uc_name uc_name
  1825.  
  1826.     # Separate mixed input of names and addresses.
  1827.     pend_func split_input
  1828. pend_release
  1829.  
  1830. # # # Pairs reported -- Unique list of IP addresses found
  1831. echo
  1832. _ip_cnt=${#known_address[@]}
  1833. if [ ${#list_server[@]} -eq 0 ]
  1834. then
  1835.     echo 'Blacklist server list empty, none checked.'
  1836. else
  1837.     if [ ${_ip_cnt} -eq 0 ]
  1838.     then
  1839.         echo 'Known address list empty, none checked.'
  1840.     else
  1841.         _ip_cnt=${_ip_cnt}-1   # Start at top.
  1842.         echo 'Checking Blacklist servers.'
  1843.         for (( _ip = _ip_cnt ; _ip >= 0 ; _ip-- ))
  1844.         do
  1845.           pend_func check_lists $( printf '%q\n' ${known_address[$_ip]} )
  1846.         done
  1847.     fi
  1848. fi
  1849. pend_release
  1850. $_dot_dump                   # Graphics file dump
  1851. $_log_dump                   # Execution trace
  1852. echo
  1853.  
  1854.  
  1855. ##############################
  1856. # Example output from script #
  1857. ##############################
  1858. :<<-'_is_spammer_outputs_'
  1859.  
  1860. ./is_spammer.bash 0 web4.alojamentos7.com
  1861.  
  1862. Starting with domain name >web4.alojamentos7.com<
  1863. Using default blacklist server list.
  1864. Search depth limit: 0
  1865. .:....::::...:::...:::.......::..::...:::.......::
  1866. Known network pairs.
  1867.     66.98.208.97             web4.alojamentos7.com.
  1868.     66.98.208.97             ns1.alojamentos7.com.
  1869.     69.56.202.147            ns2.alojamentos.ws.
  1870.     66.98.208.97             alojamentos7.com.
  1871.     66.98.208.97             web.alojamentos7.com.
  1872.     69.56.202.146            ns1.alojamentos.ws.
  1873.     69.56.202.146            alojamentos.ws.
  1874.     66.235.180.113           ns1.alojamentos.org.
  1875.     66.235.181.192           ns2.alojamentos.org.
  1876.     66.235.180.113           alojamentos.org.
  1877.     66.235.180.113           web6.alojamentos.org.
  1878.     216.234.234.30           ns1.theplanet.com.
  1879.     12.96.160.115            ns2.theplanet.com.
  1880.     216.185.111.52           mail1.theplanet.com.
  1881.     69.56.141.4              spooling.theplanet.com.
  1882.     216.185.111.40           theplanet.com.
  1883.     216.185.111.40           www.theplanet.com.
  1884.     216.185.111.52           mail.theplanet.com.
  1885.  
  1886. Checking Blacklist servers.
  1887.   Checking address 66.98.208.97
  1888.       Records from dnsbl.sorbs.net
  1889.   "Spam Received See: http://www.dnsbl.sorbs.net/lookup.shtml?66.98.208.97"
  1890.     Checking address 69.56.202.147
  1891.     Checking address 69.56.202.146
  1892.     Checking address 66.235.180.113
  1893.     Checking address 66.235.181.192
  1894.     Checking address 216.185.111.40
  1895.     Checking address 216.234.234.30
  1896.     Checking address 12.96.160.115
  1897.     Checking address 216.185.111.52
  1898.     Checking address 69.56.141.4
  1899.  
  1900. Advanced Bash Scripting Guide: is_spammer.bash, v2, 2004-msz
  1901.  
  1902. _is_spammer_outputs_
  1903.  
  1904. exit ${_hs_RC}
  1905.  
  1906. ####################################################
  1907. #  The script ignores everything from here on down #
  1908. #+ because of the 'exit' command, just above.      #
  1909. ####################################################
  1910.  
  1911.  
  1912.  
  1913. Quickstart
  1914. ==========
  1915.  
  1916.  Prerequisites
  1917.  
  1918.   Bash version 2.05b or 3.00 (bash --version)
  1919.   A version of Bash which supports arrays. Array
  1920.   support is included by default Bash configurations.
  1921.  
  1922.   'dig,' version 9.x.x (dig $HOSTNAME, see first line of output)
  1923.   A version of dig which supports the +short options.
  1924.   See: dig_wrappers.bash for details.
  1925.  
  1926.  
  1927.  Optional Prerequisites
  1928.  
  1929.   'named,' a local DNS caching program. Any flavor will do.
  1930.   Do twice: dig $HOSTNAME
  1931.   Check near bottom of output for: SERVER: 127.0.0.1#53
  1932.   That means you have one running.
  1933.  
  1934.  
  1935.  Optional Graphics Support
  1936.  
  1937.   'date,' a standard *nix thing. (date -R)
  1938.  
  1939.   dot Program to convert graphic description file to a
  1940.   diagram. (dot -V)
  1941.   A part of the Graph-Viz set of programs.
  1942.   See: [http://www.research.att.com/sw/tools/graphviz||GraphViz]
  1943.  
  1944.   'dotty,' a visual editor for graphic description files.
  1945.   Also a part of the Graph-Viz set of programs.
  1946.  
  1947.  
  1948.  
  1949.  
  1950.  Quick Start
  1951.  
  1952. In the same directory as the is_spammer.bash script;
  1953. Do: ./is_spammer.bash
  1954.  
  1955.  Usage Details
  1956.  
  1957. 1. Blacklist server choices.
  1958.  
  1959.   (a) To use default, built-in list: Do nothing.
  1960.  
  1961.   (b) To use your own list:
  1962.  
  1963.     i. Create a file with a single Blacklist server
  1964.        domain name per line.
  1965.  
  1966.     ii. Provide that filename as the last argument to
  1967.         the script.
  1968.  
  1969.   (c) To use a single Blacklist server: Last argument
  1970.       to the script.
  1971.  
  1972.   (d) To disable Blacklist lookups:
  1973.  
  1974.     i. Create an empty file (touch spammer.nul)
  1975.        Your choice of filename.
  1976.  
  1977.     ii. Provide the filename of that empty file as the
  1978.         last argument to the script.
  1979.  
  1980. 2. Search depth limit.
  1981.  
  1982.   (a) To use the default value of 2: Do nothing.
  1983.  
  1984.   (b) To set a different limit:
  1985.       A limit of 0 means: no limit.
  1986.  
  1987.     i. export SPAMMER_LIMIT=1
  1988.        or whatever limit you want.
  1989.  
  1990.     ii. OR provide the desired limit as the first
  1991.        argument to the script.
  1992.  
  1993. 3. Optional execution trace log.
  1994.  
  1995.   (a) To use the default setting of no log output: Do nothing.
  1996.  
  1997.   (b) To write an execution trace log:
  1998.       export SPAMMER_TRACE=spammer.log
  1999.       or whatever filename you want.
  2000.  
  2001. 4. Optional graphic description file.
  2002.  
  2003.   (a) To use the default setting of no graphic file: Do nothing.
  2004.  
  2005.   (b) To write a Graph-Viz graphic description file:
  2006.       export SPAMMER_DATA=spammer.dot
  2007.       or whatever filename you want.
  2008.  
  2009. 5. Where to start the search.
  2010.  
  2011.   (a) Starting with a single domain name:
  2012.  
  2013.     i. Without a command-line search limit: First
  2014.        argument to script.
  2015.  
  2016.     ii. With a command-line search limit: Second
  2017.         argument to script.
  2018.  
  2019.   (b) Starting with a single IP address:
  2020.  
  2021.     i. Without a command-line search limit: First
  2022.        argument to script.
  2023.  
  2024.     ii. With a command-line search limit: Second
  2025.         argument to script.
  2026.  
  2027.   (c) Starting with (mixed) multiple name(s) and/or address(es):
  2028.       Create a file with one name or address per line.
  2029.       Your choice of filename.
  2030.  
  2031.     i. Without a command-line search limit: Filename as
  2032.        first argument to script.
  2033.  
  2034.     ii. With a command-line search limit: Filename as
  2035.         second argument to script.
  2036.  
  2037. 6. What to do with the display output.
  2038.  
  2039.   (a) To view display output on screen: Do nothing.
  2040.  
  2041.   (b) To save display output to a file: Redirect stdout to a filename.
  2042.  
  2043.   (c) To discard display output: Redirect stdout to /dev/null.
  2044.  
  2045. 7. Temporary end of decision making.
  2046.    press RETURN
  2047.    wait (optionally, watch the dots and colons).
  2048.  
  2049. 8. Optionally check the return code.
  2050.  
  2051.   (a) Return code 0: All OK
  2052.  
  2053.   (b) Return code 1: Script setup failure
  2054.  
  2055.   (c) Return code 2: Something was blacklisted.
  2056.  
  2057. 9. Where is my graph (diagram)?
  2058.  
  2059. The script does not directly produce a graph (diagram).
  2060. It only produces a graphic description file. You can
  2061. process the graphic descriptor file that was output
  2062. with the 'dot' program.
  2063.  
  2064. Until you edit that descriptor file, to describe the
  2065. relationships you want shown, all that you will get is
  2066. a bunch of labeled name and address nodes.
  2067.  
  2068. All of the script's discovered relationships are within
  2069. a comment block in the graphic descriptor file, each
  2070. with a descriptive heading.
  2071.  
  2072. The editing required to draw a line between a pair of
  2073. nodes from the information in the descriptor file may
  2074. be done with a text editor.
  2075.  
  2076. Given these lines somewhere in the descriptor file:
  2077.  
  2078. # Known domain name nodes
  2079.  
  2080. N0000 [label="guardproof.info."] ;
  2081.  
  2082. N0002 [label="third.guardproof.info."] ;
  2083.  
  2084.  
  2085.  
  2086. # Known address nodes
  2087.  
  2088. A0000 [label="61.141.32.197"] ;
  2089.  
  2090.  
  2091.  
  2092. /*
  2093.  
  2094. # Known name->address edges
  2095.  
  2096. NA0000 third.guardproof.info. 61.141.32.197
  2097.  
  2098.  
  2099.  
  2100. # Known parent->child edges
  2101.  
  2102. PC0000 guardproof.info. third.guardproof.info.
  2103.  
  2104. */
  2105.  
  2106. Turn that into the following lines by substituting node
  2107. identifiers into the relationships:
  2108.  
  2109. # Known domain name nodes
  2110.  
  2111. N0000 [label="guardproof.info."] ;
  2112.  
  2113. N0002 [label="third.guardproof.info."] ;
  2114.  
  2115.  
  2116.  
  2117. # Known address nodes
  2118.  
  2119. A0000 [label="61.141.32.197"] ;
  2120.  
  2121.  
  2122.  
  2123. # PC0000 guardproof.info. third.guardproof.info.
  2124.  
  2125. N0000->N0002 ;
  2126.  
  2127.  
  2128.  
  2129. # NA0000 third.guardproof.info. 61.141.32.197
  2130.  
  2131. N0002->A0000 ;
  2132.  
  2133.  
  2134.  
  2135. /*
  2136.  
  2137. # Known name->address edges
  2138.  
  2139. NA0000 third.guardproof.info. 61.141.32.197
  2140.  
  2141.  
  2142.  
  2143. # Known parent->child edges
  2144.  
  2145. PC0000 guardproof.info. third.guardproof.info.
  2146.  
  2147. */
  2148.  
  2149. Process that with the 'dot' program, and you have your
  2150. first network diagram.
  2151.  
  2152. In addition to the conventional graphic edges, the
  2153. descriptor file includes similar format pair-data that
  2154. describes services, zone records (sub-graphs?),
  2155. blacklisted addresses, and other things which might be
  2156. interesting to include in your graph. This additional
  2157. information could be displayed as different node
  2158. shapes, colors, line sizes, etc.
  2159.  
  2160. The descriptor file can also be read and edited by a
  2161. Bash script (of course). You should be able to find
  2162. most of the functions required within the
  2163. "is_spammer.bash" script.
  2164.  
  2165. # End Quickstart.
  2166.  
  2167.  
  2168.  
  2169. Additional Note
  2170. ========== ====
  2171.  
  2172. Michael Zick points out that there is a "makeviz.bash" interactive
  2173. Web site at rediris.es. Can't give the full URL, since this is not
  2174. a publically accessible site.
Add Comment
Please, Sign In to add comment