Port scan randomization

/*

 ABSTRACT

 I want to randomize TCP/IP port scans (like those with nmap) for large-scale
 scans of the entire Internet. I want to use a random number generator, rather
 than keeping track in memory, becuase the memory needed to scan the entire
 Internet would be too large.

 Using the "linear congruential generator" behind the typical 'rand()' function
 seems the obvious choice. Indeed, it's what many worms used (like Slammer and
 Witty) to scan the entire Internet. However, a typical 'rand()' function only
 works when the ranges are a power of 2. When using for odd sized ranges, I
 need different internal constants.

 That's what this program is for: given a range, choose the constants 'a' and
 'c' to randomize the range. I'll practice with this code manually to figure
 things out, then copy this code into a port scanner to make the process
 automated.


 BUILD

 It's just standard C, so all you need do is compile it on your favorite system.
 Though on GNU you may need to link to the math library.

 Mac:     gcc lcgtest.c -O3 -o lcgtest
 Ubuntu:  gcc lcgtest.c -O3 -o lggtest -lm
 Windows: I create a VisualStudio project and just compile it


 USAGE

 lcgtest <range>

 where <range> is a number like 100 or 0xffff11223344. Since the idea is
 to scan the entire Interent and all ports, the max size for the range
 can be up to a 48 bit number.


 MATH THEORY

 One way of randomizing a range is the "full cycle" technique:
 http://en.wikipedia.org/wiki/Full_cycle
 Given a range 'range' (like 100), we enumerate the new indexes with:

    next_index = (index + c) % range;

 1. The value of 'c' is relatively prime with 'range', meaning these two
    numbers have no prime factors in common. Thus, for a 'range' of 100
    addresses, you could choose '17' or '977', since they have no prime
    factors in common with '100'.

 This full cycle technique isn't random enough, which is why we look at
 a function like 'rand()'. This function is known as a Linear Congruential
 Generator, and is described here:
 http://en.wikipedia.org/wiki/Linear_congruential_generator

 This function looks like:

     next_index = (index * a + c) % range;

 The constants have the rules:

  1. The value 'c' is relatively prime with 'range', as before.
  2. The value (a - 1) is divisible by all prime factors of 'range'
  3. The value (a - 1) is a multiple of 4 if 'range' is a multiple of 4

 In other words, it's the same as before, but we add a multiplication along
 with the addition.


 PRIME FACTORS

 So, to find the constants 'a' and 'c' for a range, we need to do prime
 factorization. How do we do that efficiently? Curiously, I couldn't find
 answers to that. Everyone is concerned with the theoretical problem of
 factoring 1000 bit numbers, but all I want to do is factor a 48 bit number.

 It turns out that the best answer for small numbers is obvious answer: just
 take a list of primes and try them all one-by-one until you've found all the
 factors.

 How to get the list of primes starting from 2, 3, 5, 7, 11, 17, 19, ...? The
 answer is to use the well-known "Sieve of Eratosthenes". This turns out to
 be the most expensive part of the code, taking 2 seconds of compute power on
 an Atom 1.6 Ghz x86 CPU and 2-megabytes of RAM. I could precompute them, but
 then I'd have to ship around a 5-megabyte file of integers along with the
 program.

 An important note is that I don't have to factor a 48 bit number, but only
 a 24 bit number. A 48 bit number can have at most one prime factor larger
 than 24 bits, and once I've found all the other factors smaller than 24 bits,
 this larger factor will be the only one left. Thus, I use the sieve to go
 through 24 bits or 16-million combinations, which can be done in 2-megabytes
 of RAM. I think a big reason the Atom is slow is that this can't fit in the
 cache.


 SO WHAT THIS PROGRAM DOES

 Given a 'range', this program does:
 1. sieves it to find all prime numbers smaller than the sqrt(range).
 2. finds all the prime factors of 'range'
 3. calculates a value of 'a' and 'c' that meat the requirements for 'rand()'
 4. for small ranges, verifies that 1-to-1 really works
 5. prints the first few random numbers output, so that we can see how random
    they are visually.

 NOTES

 If I google harder, I might come up with a better way of factoring large
 numbers. For example, instead of the basic Sieve of Eratosthenes, I could
 instead use DJB's "primegen". On a Raspberry Pi (400 MHz ARM), primegen
 is about 10 times faster factoring small numbers, and uses less memory.
 */
#include <ctype.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <time.h>
#include <math.h>


/* A 64-bit number cannot have more than 16 prime factors */
typedef unsigned PRIMEFACTORS[16];

#if defined(_MSC_VER)
#define strtoll _strtoi64
#endif

/****************************************************************************
 * A object for creating a list of prime numbers, then using those primes
 * to find the prime factors of a number.
 *
 * To find the first primes up to 'n', we use the Sieve of Eratosthenes
 * algorithm. It's the slowest part of the whole thing.
 *
 * To find the prime factors of a number, we just brute force check every
 * prime number in the list.
 ****************************************************************************/
struct PrimeList
{
    unsigned char *list;
    unsigned count;
    unsigned max;
};


/****************************************************************************
 * When running the Sieve of Eratosthenes, mark a number as not being prime.
 * Numbers are represented as a bit mask, so clear the bit representing
 * that number.
 *
 * @param list
 *        A bit mask of bits representing all numbers from 0 to 'n'.
 * @param num
 *        The number we are checking to see if it's prime or not.
 ****************************************************************************/
static __inline void
sieve_clear(unsigned char *list, unsigned num)
{
    unsigned bit = num & 0x7;
    unsigned index = num / 8;

    list[index] &= ~(1 << bit);
}


/****************************************************************************
 * Test a number to see if it's in our list of primes. Numbers are
 * represented with a bit mask, so if it's prime, the bit is set, and if
 * it's not prime, the corresponding bit is clear. Finding the next prime
 * in the list means skip the bits that are clear.
 *
 * @param list
 *        A bit mask of bits representing all numbers from 0 to 'n'.
 * @param num
 *        The number we are checking to see if it's prime or not.
 ****************************************************************************/
static __inline unsigned
sieve_is_clear(const unsigned char *list, unsigned num)
{
    unsigned bit = num & 0x7;
    unsigned index = num / 8;

    return (list[index] & (1 << bit)) == 0;
}


/****************************************************************************
 * Create an object for dealing with a list of first primes.
 *
 * We do this creation as a separate step because we want to also benchmark
 * the actual seiving step. That's because seiving up to 24 bit numbers can
 * take several seconds on low-power processors.
 *
 * @param largest_number
 *        Find all primes up to this number.
 ****************************************************************************/
struct PrimeList *
primes_create(unsigned largest_number)
{
    struct PrimeList *primes;

    /* allocate the main object memory */
    primes = (struct PrimeList *)malloc(sizeof(*primes));
    memset(primes, 0, sizeof(*primes));

    /* Set the largest number we are going to look for */
    primes->max = largest_number;

    /* Allocate the sieve bit-mask. This uses one bit for each candidate
     * number. In other words, if we are looking for all primes up to
     * 100, then we'll need 12.5 bytes to store one bit for each number */
    primes->count = (largest_number+7)/8;
    primes->list = (unsigned char *)malloc(primes->count);

    /* Set to all 1. As we sieve the primes, we'll clear the bits for
     * numbers that aren't prime. */
    memset(primes->list, 0xFF, primes->count);

    /* 0 and 1 are not prime */
    sieve_clear(primes->list, 0);
    sieve_clear(primes->list, 1);

    /* the padding bits at the end aren't prime (well, if they are, we
     * don't care) */
    while (largest_number+1 < primes->count*8) {
        sieve_clear(primes->list, largest_number+1);
        largest_number++;
    }

    return primes;
}

/****************************************************************************
 * Clean up the object created by 'primes_create()'.
 ****************************************************************************/
void
primes_destroy(struct PrimeList *primes)
{
    if (primes == NULL)
        return;
    if (primes->list)
        free(primes->list);
    free(primes);
}


/****************************************************************************
 * Run the "Sieve of Eratosthenes" to find all prime numbers up to 'n'. The
 * value of 'n' was set when we initialized the object.
 *
 * The sieving process works by doing:
 *  - remove all even numbers from the list of possible primes (except for 2)
 *  - remove all multiples of 3 (except for 3 itself)
 *  - remove all multiples of 5 (except for 5 itself)
 *  - remove all multiples of 7 (except for 7 itself)
 *  - ... and so on for all prime numbers
 *
 ****************************************************************************/
void
primes_run_sieve(struct PrimeList *primes)
{
    unsigned char *list = primes->list;
    unsigned max = primes->max;
    unsigned prime = 2;

    /* for all primes ... */
    for (prime = 2; prime < max; ) {
        unsigned j;

        /* clear all bits representing multiples of this prime number*/
        for (j = prime*2;  j < max;  j += prime)
            sieve_clear(list, j);

        /* goto next prime number */
        prime++;
        while (prime < max && sieve_is_clear(list, prime))
            prime++;
    }

}


/****************************************************************************
 * For enumerating the primes in the list of primes after we've done
 * a 'run'.
 *
 * @param primes
 *        A list of prime numbers (that we probably created with the Sieve)
 * @param prime
 *        A prime number, like 2, 5, 17, 1699999
 * @return
 *        the next prime in the list. Given a prime number like '2', this
 *        function returns '3'.
 ****************************************************************************/
unsigned
primes_next(const struct PrimeList *primes, unsigned prime)
{
    const unsigned char *list = primes->list;
    unsigned max = primes->max;

    /* skip cleared bits representing non-primes until we get to the first
     * set bit representing a prime number */
    prime++;
    while (prime < max && sieve_is_clear(list, prime))
        prime++;
    return prime;
}

/****************************************************************************
 * Count the number of primes we've found. This just brute-force counts all
 * the bits set in the bit-mask.
 ****************************************************************************/
unsigned
primes_count(const struct PrimeList *primes)
{
    const unsigned char *list = primes->list;
    unsigned max = primes->count;
    unsigned i;
    unsigned count = 0;

    for (i=0; i<max; i++) {
        unsigned char c = list[i];
        count += (c&1); c >>= 1;
        count += (c&1); c >>= 1;
        count += (c&1); c >>= 1;
        count += (c&1); c >>= 1;
        count += (c&1); c >>= 1;
        count += (c&1); c >>= 1;
        count += (c&1); c >>= 1;
        count += (c&1);
    }

    return count;
}


/****************************************************************************
 * Do the prime factorization of a number. This first runs the "Sieve of
 * Eratosthenes" to find all the small primes, then it does a brute-force
 * search of all the prime numbers in order to find the list of factors
 * of the number.
 *
 * @param m
 *      The number we are factoring.
 * @param factors
 *      The output of this function listing the prime factors of 'm'.
 *      If 'm' is prime, then 'm' will be the only member of this list.
 * @param elapsed
 *      Since the 'sieve' takes a long time to run, we can benchmark it,
 *      outputing the number of seconds taken. This is optional, if NULL,
 *      then benchmarking isn't done.
 ****************************************************************************/
void
primes_factor(uint64_t m, PRIMEFACTORS factors, double *elapsed)
{
    uint64_t root;
    struct PrimeList *primes;
    clock_t start, stop; /* for benchmark */

    /*
     * Calculate the square-root. That's because no prime factor can be
     * larger than the square-root of a number, except for the largest
     * one. In other words, all prime factors of '14' are less than the
     * square root (3.742), except for the largest prime factor (7).
     */
    root = (unsigned)sqrt(m+1.0);

    /*
     * Create a Seive of Erastothenes that finds all primes up to the
     * root, then tries them all to see if they are prime factors.
     */
    primes = primes_create((unsigned)root);

    /*
     * Run the sieve to find primes. This will take 0.2 seconds on a
     * Nehalem+ Intel x86 CPU for the full 48 bit range
     */
    start = clock();
    primes_run_sieve(primes);
    stop = clock();
    if (elapsed)
        *elapsed = (double)(stop - start)/(double)CLOCKS_PER_SEC;

    /*
     * Now we factor our original number.
     */
    {
        unsigned temp = (unsigned)m;
        unsigned prime;
        unsigned count = 0;

        memset(factors, 0, sizeof(factors));

        /* for all primes ... */
        for (prime = 2;
             prime < root;
             prime = primes_next(primes, prime)) {

            /* see if we have a prime factor */
            if ((temp % prime) != 0)
                continue; /* not a prime factor */

            /* add this factor to our list */
            factors[count++] = prime;

            /* remove this factor from our 'temp' variable */
            while ((temp % prime) == 0)
                temp /= prime;
        }

        /* the residual may also be a factor, in which case, add it
         * to our list as well */
        if (temp != 1)
            factors[count++] = temp;
    }

    /* free the memory we allocated for sieving the primes */
    primes_destroy(primes);
}


/****************************************************************************
 * Do a pseudo-random 1-to-1 translation of a number within a range to
 * another number in that range.
 *
 * The constants 'a' and 'c' must be chosen to match the LCG algorithm
 * to fit 'm' (range).
 *
 * This the same as the function 'rand()', except all the constants and
 * seeds are specified as parameters.
 *
 * @param index
 *      The index within the range that we are randomizing.
 * @param a
 *      The 'multiplier' of the LCG algorithm.
 * @param c
 *      The 'increment' of the LCG algorithm.
 * @param range
 *      The 'modulus' of the LCG algorithm.
 ****************************************************************************/
uint64_t
lcg(uint64_t index, uint64_t a, uint64_t c, uint64_t range)
{
    return (index * a + c) % range;
}


/****************************************************************************
 * Verify the LCG algorithm. You shouldn't do this for large ranges,
 * because we'll run out of memory. Therefore, this algorithm allocates
 * a buffer only up to a smaller range. We still have to traverse the
 * entire range of numbers, but we only need store values for a smaller
 * range. If 10% of the range checks out, then there's a good chance
 * it applies to the other 90% as well.
 *
 * This works by counting the results of rand(), which should be produced
 * exactly once.
 ****************************************************************************/
unsigned
lcg_verify(uint64_t a, uint64_t c, uint64_t range, uint64_t max)
{
    unsigned char *list;
    uint64_t i;
    unsigned is_success = 1;

    /* Allocate a list of 1-byte counters */
    list = (unsigned char *)malloc((range<max)?range:max);
    memset(list, 0, (range<max)?range:max);

    /* For all numbers in the range, verify increment the counter for the
     * the output. */
    for (i=0; i<range; i++) {
        uint64_t x = lcg(i, a, c, range);
        if (x < max)
            list[x]++;
    }

    /* Now check the output to make sure that every counter is set exactly
     * to the value of '1'. */
    for (i=0; i<max && i<range; i++) {
        if (list[i] != 1)
            is_success = 0;
    }

    free(list);

    return is_success;
}


/****************************************************************************
 * Count the number of digits in a number so that we can pretty-print a
 * bunch of numbers in nice columns.
 ****************************************************************************/
unsigned
count_digits(uint64_t num)
{
    unsigned result = 0;

    while (num) {
        result++;
        num /= 10;
    }

    return result;
}

/****************************************************************************
 * Tell whether the number has any prime factors in common with the list
 * of factors. In other words, if it's not coprime with the other number.
 * @param c
 *      The number we want to see has common factors with the other number.
 * @param factors
 *      The factors from the other number
 * @return
 *      !is_coprime(c, factors)
 ****************************************************************************/
unsigned
has_factors_in_common(uint64_t c, PRIMEFACTORS factors)
{
    unsigned i;

    for (i=0; factors[i]; i++) {
        if ((c % factors[i]) == 0)
            return factors[i]; /* found a common factor */
    }
    return 0; /* no factors in common */
}


/****************************************************************************
 ****************************************************************************/
int main(int argc, char *argv[])
{
    uint64_t m=2;   /* LCG modulus aka. the IP address range size */
    uint64_t a;     /* LCG multiplier */
    uint64_t c = 0; /* LCG increment */
    double elapsed = 0.0; /* Benchmark of 'sieve' algorithm */
    uint64_t i;
    PRIMEFACTORS factors; /* List of prime factors of 'm' */
    uint64_t a_offset = 1;

    memset(factors, 0, sizeof(factors));

    /* usage info if nothing specified */
usage:
    if (argc < 2) {
        printf("usage:\n lcgtest <range> [-c num] [-a num]\n");
        printf("Finds constants for randomizing range using a typical "
               "linear congruent generator\n");
        printf("<range> may be up to 48 bits in size\n");
        printf("[-c num] suggests a 'c' constant\n");
        printf("[-a num] suggests a 'a' scaling factor\n");
        printf("Numbers may be decimal or hex\n");
        return 0;
    }

    /*
     * Get the input
     * Read in optional additional parameters, such as a starting value for
     * 'c' or a scaling factor for 'a'.
     */
    for (i=1; i<(unsigned)argc; i++) {
        if (isdigit(argv[i][0]&0xFF))
            m = strtoll(argv[1], 0, 0);
        else if (argv[i][0] == '-') {
            switch (tolower(argv[i][1]&0xFF)) {
               case 'a':
                    if (argv[i][2] == '\0')
                        a_offset = strtoll(argv[++i], 0, 0);
                    else
                        a_offset = strtoll(argv[i]+2, 0, 0);
                    break;
                case 'c':
                    if (argv[i][2] == '\0')
                        c = strtoll(argv[++i], 0, 0);
                    else
                        c = strtoll(argv[i]+2, 0, 0);
                    break;
                case 'h':
                case '?':
                    goto usage;
                case '-':
                    goto usage;
                default:
                    fprintf(stderr, "unknown parm: %s\n", argv[i]);
                    goto usage;
            }

        } else {
            fprintf(stderr, "unknown parm: %s\n", argv[i]);
            goto usage;
        }

    }

    /*
     * Find all the prime factors of the number. This step can take several
     * seconds for 48 bit numbers, which is why we benchmark how long it
     * takes.
     */
    primes_factor(m, factors, &elapsed);

    /*
     * Calculate the 'a-1' constant. It must share all the prime factors
     * with the range, and if the range is a multiple of 4, must also
     * be a multiple of 4
     */
    a = 1;
    for (i=0; factors[i]; i++)
        a = a * factors[i];
    if ((m % 4) == 0)
        a *= 2;
    a *= a_offset; /* maybe improve for better randomness */
    a += 1;


    /*
     * Calculate the 'c' constant. It must have no prime factors in
     * common with the range.
     */
    if (c == 0)
        c = 29752039458; /* something random */
    while (has_factors_in_common(c, factors))
        c++;


    /*
     * print the results
     */
    //printf("sizeof(int) = %llu-bits\n", (uint64_t)(sizeof(size_t)*8));
    printf("elapsed     = %5.3f-seconds\n", elapsed);
    printf("factors     = ");
    for (i=0; factors[i]; i++)
        printf("%u ", factors[i]);
    printf("%s\n", factors[0]?"":"(none)");
    printf("m           = %-24llu (0x%llx)\n", m, m);
    printf("a           = %-24llu (0x%llx)\n", a, a);
    printf("c           = %-24llu (0x%llx)\n", c, c);
    printf("c%%m         = %-24llu (0x%llx)\n", c%m, c%m);

    if (m < 1000000000) {
        if (lcg_verify(a, c+1, m, 280))
            printf("verify      = success\n");
        else
            printf("verify      = failure\n");
    } else {
        printf("verify      = too big to check\n");
    }


    /*
     * Print some first numbers. We use these to visually inspect whether
     * the results are random or not.
     */
    {
        unsigned count = 0;
        uint64_t x = 0;
        unsigned digits = count_digits(m);

        for (i=0; i<100 && i < m; i++) {
            x = lcg(x, a, c, m);
            count += printf("%*llu ", digits, x);
            if (count >= 70) {
                count = 0;
                printf("\n");
            }
        }
        printf("\n");
    }

    return 0;
}