Analysis of "ransomware CTF" by Fare9

Feb 27th, 2020

Here is my analysis of the binary posted in the course forum of "Malware Analysis & Engineering" at UC3M. As explained there, the challenge has two parts:

  1. Obtain the license key of the malware
  2. Find a way to decrypt the file and obtain the flag

I did the original analysis in radare2, but since I didn't save the project file, I recreated it in IDA so that I can post snippets of the assembly. At the end I include a program that decrypts the flag image.

Analysis in IDA

The sample is a 32-bit ELF binary, so we could execute it on a Linux machine. As I didn't want to run it at the beginning, I went straight to static analysis with a disassembler (so there is no risk of pressing F9 by accident).
If we are using radare2, we start like this:

    $ r2 malware-linux
    > aaa 
    > afl

With this we analyze the binary and then list all the functions. As we can see, the binary still contains all its symbols, so every function has a meaningful name:

0x080485e0    1 33           entry0
0x08048570    1 6            sym.imp.__libc_start_main
0x08048620    4 43           sym.deregister_tm_clones
0x08048650    4 53           sym.register_tm_clones
0x08048690    3 30           entry.fini0
0x080486b0    4 43   -> 40   entry.init0
0x08048add    8 243          sym.rand_string
0x08048ef0    1 2            sym.__libc_csu_fini
0x08048610    1 4            sym.__x86.get_pc_thunk.bx
0x08048ef4    1 20           sym._fini
0x08048749    1 30           sym.font_red
0x080484e0    1 6            sym.imp.printf
0x08048c07   11 309          sym.encrypt
0x080486db    4 80           sym.string_xor
0x08048bd0    3 55           sym.rand_string_alloc
0x08048530    1 6            sym.imp.malloc
0x08048e90    4 93           sym.__libc_csu_init
0x08048785    1 425          sym.print_logo
0x08048540    1 6            sym.imp.puts
0x08048a08   11 213          sym.check_serial
0x08048db5    4 205          main
0x0804872b    1 30           sym.clear_screen
0x0804892e   12 218          sym.my_getline
0x08048d3c    1 121          sym.print_banner
0x08048550    1 6            sym.imp.exit
0x08048767    1 30           sym.font_reset
0x080484a8    3 35           sym._init
0x080484f0    1 6            sym.imp.free
0x08048500    1 6            sym.imp.__stack_chk_fail
0x08048510    1 6            sym.imp.fread
0x08048520    1 6            sym.imp.realloc
0x08048560    1 6            sym.imp.strlen
0x08048580    1 6            sym.imp.fopen
0x08048590    1 6            sym.imp.fgetc
0x080485a0    1 6            sym.imp.rand
0x080485b0    1 6            sym.imp.fputc
0x080485c0    1 6            sym.imp.fputs

We have the main function, so we don't have to go through the entry code generated by the compiler. In radare2, to go to that function we type:

    > s main
    > pdf

With this, we seek to the main function and print its disassembly. Unlike IDA Pro, r2 didn't give argv a name and just calls it "arg_4h". We could rename it, but as this is not a radare2 tutorial, that's something you can do on your own.

The first 6 calls show a retro logo of a skull, print some information so the user knows what the program is about, and wait for the user to press Enter:

    .text:08048DC6 call    font_red
    .text:08048DCB call    clear_screen
    .text:08048DD0 call    print_logo
    .text:08048DD5 call    my_getline
    .text:08048DDA call    clear_screen
    .text:08048DDF call    print_banner

After that, there is another call to my_getline, but this time the value returned in eax is saved in a variable that we will call "user_input". This "user_input" is passed as a parameter to a function called check_serial, which is the first part of this challenge:

    .text:08048DE4 call    my_getline
    .text:08048DE9 mov     [ebp+user_input], eax
    .text:08048DEC sub     esp, 0Ch
    .text:08048DEF push    [ebp+user_input]
    .text:08048DF2 call    check_serial

So here we can start the first part of the challenge, getting the serial:

Getting "ransomware" serial

Okay, so let's start the analysis of the function check_serial. At the beginning of this function there is a series of moves into different local variables, one byte at a time. This is not an obfuscation trick, just the initialization of an array, but disassemblers show it as a number of separate variables instead of as an array. In IDA, we can double click the one called var_1C, then right click and use the "array" option to create an array from that data.
Here is how disassemblers show it at first:

    .text:08048A1F mov     [ebp+var_1C], 27h
    .text:08048A23 mov     [ebp+var_1B], 31h
    .text:08048A27 mov     [ebp+var_1A], 20h
    .text:08048A2B mov     [ebp+var_19], 1Ch
    .text:08048A2F mov     [ebp+var_18], 1Ah
    .text:08048A33 mov     [ebp+var_17], 8
    .text:08048A37 mov     [ebp+var_16], 4
    .text:08048A3B mov     [ebp+var_15], 18h
    .text:08048A3F mov     [ebp+var_14], 5Ch
    .text:08048A43 mov     [ebp+var_13], 0
    .text:08048A47 mov     [ebp+var_12], 31h
    .text:08048A4B mov     [ebp+var_11], 18h
    .text:08048A4F mov     [ebp+var_10], 44h
    .text:08048A53 mov     [ebp+var_F], 3
    .text:08048A57 mov     [ebp+var_E], 17h
    .text:08048A5B mov     [ebp+var_D], 0Eh

And now, after converting all those variables into an array and renaming it:

    .text:08048A1F mov     [ebp+constant_values], 27h
    .text:08048A23 mov     [ebp+constant_values+1], 31h
    .text:08048A27 mov     [ebp+constant_values+2], 20h
    .text:08048A2B mov     [ebp+constant_values+3], 1Ch
    .text:08048A2F mov     [ebp+constant_values+4], 1Ah
    .text:08048A33 mov     [ebp+constant_values+5], 8
    .text:08048A37 mov     [ebp+constant_values+6], 4
    .text:08048A3B mov     [ebp+constant_values+7], 18h
    .text:08048A3F mov     [ebp+constant_values+8], 5Ch
    .text:08048A43 mov     [ebp+constant_values+9], 0
    .text:08048A47 mov     [ebp+constant_values+0Ah], 31h
    .text:08048A4B mov     [ebp+constant_values+0Bh], 18h
    .text:08048A4F mov     [ebp+constant_values+0Ch], 44h
    .text:08048A53 mov     [ebp+constant_values+0Dh], 3
    .text:08048A57 mov     [ebp+constant_values+0Eh], 17h
    .text:08048A5B mov     [ebp+constant_values+0Fh], 0Eh

After this initialization code, the function checks the length of our serial, which must be 0x10 (16) characters:

    .text:08048A62 push    [ebp+user_input_cpy] ; s
    .text:08048A65 call    _strlen
    .text:08048A6A add     esp, 16
    .text:08048A6D mov     [ebp+counter], eax
    .text:08048A70 cmp     [ebp+counter], 10h
    .text:08048A74 jz      short _start_checks

Now for the nuts and bolts: a loop that checks our serial in some way, probably using the constant values from before:

    .text:08048A7D _start_checks:                          
    .text:08048A7D                 mov     [ebp+counter], 0
    .text:08048A84                 jmp     short _check_of_lower_or_equals_than_0xf
    .text:08048A86 ; ---------------------------------------------------------------------------
    .text:08048A86
    .text:08048A86 _body_of_for_loop:                      
    .text:08048A86                 lea     edx, [ebp+constant_values]
    .text:08048A89                 mov     eax, [ebp+counter]
    .text:08048A8C                 add     eax, edx
    .text:08048A8E                 movzx   eax, byte ptr [eax]
    .text:08048A91                 mov     edx, eax
    .text:08048A93                 mov     eax, [ebp+counter]
    .text:08048A96                 add     eax, 64h
    .text:08048A99                 xor     eax, edx
    .text:08048A9B                 mov     [ebp+var_21], al
    .text:08048A9E                 movzx   edx, [ebp+var_21]
    .text:08048AA2                 mov     ecx, [ebp+counter]
    .text:08048AA5                 mov     eax, [ebp+user_input_cpy]
    .text:08048AA8                 add     eax, ecx
    .text:08048AAA                 movzx   eax, byte ptr [eax]
    .text:08048AAD                 movsx   eax, al
    .text:08048AB0                 cmp     edx, eax
    .text:08048AB2                 jz      short _increment_counter
    .text:08048AB4                 mov     eax, 0
    .text:08048AB9                 jmp     short _canary_check
    .text:08048ABB ; ---------------------------------------------------------------------------
    .text:08048ABB
    .text:08048ABB _increment_counter:                     
    .text:08048ABB                 add     [ebp+counter], 1
    .text:08048ABF
    .text:08048ABF _check_of_lower_or_equals_than_0xf:     
    .text:08048ABF                 cmp     [ebp+counter], 0Fh
    .text:08048AC3                 jle     short _body_of_for_loop
    .text:08048AC5                 mov     eax, 1

I wrote this pseudo-code in my notebook:

    bool check_serial(char *user_input)
    {
        uint8_t constant_values[] = {0x27, 0x31, 0x20, 0x1c, 0x1a, 0x8, 0x4, 0x18, 0x5c, 0x0, 0x31, 0x18, 0x44, 0x3, 0x17, 0xe};
        uint8_t byte_a, i, value;

        // the serial must be exactly 0x10 characters long
        if (strlen(user_input) != 0x10)
            return 0;

        for (i = 0; i <= 0xf; i++)
        {
            byte_a = constant_values[i];
            value = i;
            value += 0x64;      // index + 0x64
            value ^= byte_a;    // XORed with the constant at that index
            if (value != user_input[i])
                return 0;
        }

        return 1;
    }

As we can see, inside a loop from 0 to 0xf, one byte is calculated for each position: the loop index is increased by 0x64 and then XORed with the value at the same index of the constant_values array. The result is compared against the byte of our input at that position.
To extract the serial from here, we can write the same code and, instead of doing the comparison, print each value with printf.
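
As a minimal sketch of that idea, the function below rebuilds the serial in a buffer instead of comparing it. The names `recover_serial` and `SERIAL_LEN` are mine and do not appear in the binary; the arithmetic is the one from the loop above.

```c
#include <stddef.h>
#include <stdint.h>

#define SERIAL_LEN 16  /* hypothetical name for the 0x10 length checked by the binary */

/* Rebuilds the expected serial from the 16 constants seen in check_serial.
 * `out` must have room for SERIAL_LEN + 1 bytes (trailing NUL included). */
void recover_serial(const uint8_t constants[SERIAL_LEN], char out[SERIAL_LEN + 1])
{
    for (size_t i = 0; i < SERIAL_LEN; i++) {
        /* Same arithmetic as the loop body: (index + 0x64) XOR constant. */
        out[i] = (char)((uint8_t)(i + 0x64) ^ constants[i]);
    }
    out[SERIAL_LEN] = '\0';
}
```

Calling it with the constant_values array and printing `out` with puts should show the serial that the program accepts.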

At the end we get the serial (go to the spoiler section to discover it).

With this serial we can generate our own encrypted files. This will be useful once we have studied the encryption mechanism and written a decryptor, so that we can test it on our own files.

Let's continue. If any byte of our input is incorrect, the function returns 0 and we go to a part of main that shows these messages: "NAH, that's not correct" and "Don't try to pirate our ransomware!".
Otherwise, if the serial is correct, the program asks us for the name of the file to encrypt, the string is read again through my_getline, and finally the function encrypt is called:

    .text:08048E11 push    offset aNiceThatSTheRi ; "Nice that's the right serial number"
    .text:08048E16 call    _puts
    .text:08048E1B add     esp, 10h
    .text:08048E1E sub     esp, 0Ch
    .text:08048E21 push    offset aEnterTheNameOf ; "Enter the name of the file you wish to "...
    .text:08048E26 call    _puts
    .text:08048E2B add     esp, 10h
    .text:08048E2E call    my_getline
    .text:08048E33 mov     [ebp+file_to_encrypt], eax
    .text:08048E36 sub     esp, 0Ch
    .text:08048E39 push    [ebp+file_to_encrypt]
    .text:08048E3C call    encrypt

Here comes the second part of the analysis: the encryption routine, which we have to understand in order to build a decryption mechanism.

Reversing the encryption routine

The encryption function is really short and is based on a "simple" encryption algorithm, so we will look at the assembly of the algorithm and then at some drawings I made to understand how it works...

The encrypt function starts by opening the file whose name the user typed at the prompt. It uses fopen from the C standard library with the mode "rb" to read the file as binary. Then another file named "encrypted.out" is opened for writing as binary (easy to guess where the output will be written).

Now comes one of the interesting parts of the code:

    .text:08048C63 push    20h
    .text:08048C65 call    rand_string_alloc
    .text:08048C6A add     esp, 10h
    .text:08048C6D mov     [ebp+IV], eax
    .text:08048C70 sub     esp, 8
    .text:08048C73 push    [ebp+fd_output] ; stream
    .text:08048C76 push    [ebp+IV]        ; s
    .text:08048C79 call    _fputs
    .text:08048C7E add     esp, 10h
    .text:08048C81 mov     [ebp+index], 0

The algorithm needs an IV for the encryption. It is generated as a pseudo-random string of 0x20 characters (32 in decimal) and written directly to the output file with fputs, so it occupies the first 0x20 bytes of that file. With this they give us almost everything we need for the decryption. Now let's look at the algorithm.

The encryption routine is a while loop whose condition is "while the input file still has bytes to read". It reads the bytes one at a time and encrypts each of them. Here is the code:

    .text:08048C88                 jmp     short _start_while_you_can_read
    .text:08048C8A ; ---------------------------------------------------------------------------
    .text:08048C8A
    .text:08048C8A _encryption_mechanism:                  
    .text:08048C8A                 mov     edx, [ebp+index]
    .text:08048C8D                 mov     eax, [ebp+IV]
    .text:08048C90                 add     eax, edx
    .text:08048C92                 movzx   edx, byte ptr [eax]
    .text:08048C95                 movzx   eax, [ebp+input_character]
    .text:08048C99                 xor     eax, edx
    .text:08048C9B                 mov     [ebp+encrypted_character], al
    .text:08048C9E                 mov     edx, [ebp+index]
    .text:08048CA1                 mov     eax, [ebp+IV]
    .text:08048CA4                 add     edx, eax
    .text:08048CA6                 movzx   eax, [ebp+encrypted_character]
    .text:08048CAA                 mov     [edx], al
    .text:08048CAC                 movsx   eax, [ebp+encrypted_character]
    .text:08048CB0                 sub     esp, 8
    .text:08048CB3                 push    [ebp+fd_output] ; stream
    .text:08048CB6                 push    eax             ; c
    .text:08048CB7                 call    _fputc
    .text:08048CBC                 add     esp, 10h
    .text:08048CBF                 mov     eax, [ebp+index]
    .text:08048CC2                 add     eax, 1
    .text:08048CC5                 and     eax, 1Fh
    .text:08048CC8                 mov     [ebp+index], eax
    .text:08048CCB
    .text:08048CCB _start_while_you_can_read:              
    .text:08048CCB                 push    [ebp+fd_input]  ; stream
    .text:08048CCE                 push    1               ; n
    .text:08048CD0                 push    1               ; size
    .text:08048CD2                 lea     eax, [ebp+input_character]
    .text:08048CD5                 push    eax             ; ptr
    .text:08048CD6                 call    _fread
    .text:08048CDB                 add     esp, 10h
    .text:08048CDE                 test    eax, eax
    .text:08048CE0                 jnz     short _encryption_mechanism

Code like the following:

    .text:08048C8A                 mov     edx, [ebp+index]
    .text:08048C8D                 mov     eax, [ebp+IV]
    .text:08048C90                 add     eax, edx
    .text:08048C92                 movzx   edx, byte ptr [eax]

It's what we saw in class about accessing an element of an array, but instead of using something like:

    movzx edx, byte ptr [array_base + index * scale]

It takes the base and the index, adds the two values, and finally reads the byte at the resulting address.
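
In C terms, both forms are the same array access. The sketch below spells out the two steps the compiler emitted; the helper name `load_byte` is mine, used only for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reads one byte the way the generated code does:
 * compute base + index first, then dereference the result. */
uint8_t load_byte(const uint8_t *base, size_t index)
{
    const uint8_t *address = base + index;  /* add   eax, edx            */
    return *address;                        /* movzx edx, byte ptr [eax] */
}
```

For any valid index, `load_byte(IV, index)` returns the same value as `IV[index]`.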

The pseudocode would be the following (again from my notebook):

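    int encrypt(const char *input_file)
    {
        FILE *fd1, *fd2;
        char *IV;
        char c;
        int i = 0;

        fd1 = fopen(input_file, "rb");
        if (fd1 == NULL)
            exit(1);
        fd2 = fopen("encrypted.out", "wb");
        if (fd2 == NULL)
            exit(1);

        IV = rand_string_alloc(0x20);
        fputs(IV, fd2); // the IV is stored at the start of the output file

        while (fread(&c, 1, 1, fd1) != 0)
        {
            c ^= IV[i];     // XOR the input byte with the current IV byte
            IV[i] = c;      // the encrypted byte replaces that IV byte
            fputc(c, fd2);
            i++;
            i &= 0x1f; // this is to avoid values equal to or greater than 0x20 for accessing the IV
        }
        // (cleanup and return value omitted)
    }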

If we try to draw the encryption algorithm, it looks something like this:

        31                         0
        ----------------------------
        |            IV            |<---
        ----------------------------   |
                      |                |
                      |                |
                      |                |
 Plaintext            v                |      Encrypted
------------------->[XOR]------------------------>
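
The drawing can also be modelled on plain memory buffers, without any file I/O. This is only a sketch under my own naming (`feedback_xor_encrypt`, `IV_LEN` and `state` are not names from the binary); it keeps the two things that matter, the XOR and the feedback of each encrypted byte into the IV buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define IV_LEN 32  /* the 0x20 bytes returned by rand_string_alloc */

/* Encrypts `len` bytes from `in` into `out`.
 * `state` starts as the IV and is overwritten with ciphertext as we go,
 * which is the feedback arrow in the drawing. */
void feedback_xor_encrypt(uint8_t state[IV_LEN], const uint8_t *in,
                          uint8_t *out, size_t len)
{
    size_t idx = 0;
    for (size_t n = 0; n < len; n++) {
        uint8_t c = in[n] ^ state[idx];   /* plaintext XOR current slot   */
        state[idx] = c;                   /* ciphertext replaces the slot */
        out[n] = c;
        idx = (idx + 1) & (IV_LEN - 1);   /* same wrap as `and eax, 1Fh`  */
    }
}
```

Note that after the first 32 bytes the original IV is gone from `state`; from then on each byte is XORed with the ciphertext byte produced 32 positions earlier.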

As the first 32 bytes of the encrypted text depend only on the IV, if we have the IV we can decrypt the first 32 bytes of plaintext (with a simple XOR). And as the second block of 32 bytes depends on the first 32 encrypted bytes, if we have those we can decrypt the second block, and so on.
So, do we have the IV? Yes, because it's stored at the beginning of the file. Do we have the first 32 encrypted bytes? Yes, because they are the very bytes we read in order to decrypt them, and so on until the end.
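
The same argument can be written compactly. The notation is mine, not the challenge's: $P_j$ and $C_j$ are the $j$-th plaintext and ciphertext bytes (counting from 0), $IV_j$ are the 32 IV bytes, and $K_j$ is the byte sitting in slot $j \bmod 32$ of the IV buffer at the moment byte $j$ is processed.

```latex
K_j =
\begin{cases}
IV_j      & 0 \le j < 32 \\
C_{j-32}  & j \ge 32
\end{cases}
\qquad
C_j = P_j \oplus K_j
\quad\Longrightarrow\quad
P_j = C_j \oplus K_j
```

Every $K_j$ is either an IV byte or an earlier ciphertext byte, and both are stored in the encrypted file, so nothing secret is needed to undo the XOR.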
If we draw the decryption, it looks something like this:

                    0                         31
                    ----------------------------
               ---->|            IV            |
               |    ----------------------------   
               |                  |                
               |                  |                
               |                  |                
 Encrypted     |                  v                   Plaintext
------------------------------->[XOR]------------------------>

With this we can decrypt the content following this scheme, so all that is left is to write the code, in any language, that does the decryption.

SPOILERS!!!! (Don't look at this part if you want to do it on your own)

C code for decryption

Here is the code that I've written to get the flag:

/***
 * Decryption tool for "malware-linux" file
 * of a CTF posted in Malware Analysis &
 * Engineering class forum.
 *
 * Written by:  Fare9
 */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define ARRAY_SIZE 0x10

int
calculateLicenseArray(uint8_t *array);

int
decryptFile(const char* file_name, const char* output_name);

int
main(int argc, char **argv)
{
    uint8_t *user_array;
    uint8_t i;

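    // malloc + memset: same effect as calloc(ARRAY_SIZE, 1)
    user_array = (uint8_t *)malloc(ARRAY_SIZE);

    if (user_array == NULL)
    {
        printf("Not possible to allocate memory\n");
        return 1;
    }

    memset(user_array, 0, ARRAY_SIZE);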

    calculateLicenseArray(user_array);

    printf("So we have the string: ");

    for ( i = 0; i < ARRAY_SIZE; i++ )
    {
        putchar(user_array[i]);
    }

    printf("\n\n");

    free(user_array);

    if (argc > 2)
        decryptFile((const char *)argv[1], (const char *)argv[2]);

    return 0;
}

// It would be the function check_serial
int
calculateLicenseArray(uint8_t *array)
{
    uint8_t constant_values[] = {0x27, 0x31, 0x20, 0x1c, 0x1a, 0x8, 0x4, 0x18, 0x5c, 0x0, 0x31, 0x18, 0x44, 0x3, 0x17, 0xe};

    uint8_t value, aux, i;

    printf("Extracting values of license...\n");

    for (i = 0; i <= 0xf; i++)
    {
        aux = constant_values[i];
        value = i;
        value += 0x64;
        value ^= aux;
        // this value would be compared with user_array[i],
        // here we just print it
        printf("\tValue %d:\t%d(%c)\n", i, value, (char)value);
        array[i] = value; // store it in array for later
    }

    printf("\n\n");

    return 0;
}

int
decryptFile(const char* file_name, const char* output_name)
{
    uint8_t IV[0x20];

    uint8_t i = 0;

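    char c;
    char aux;

    int fd = open(file_name, O_RDONLY);
    int fd2 = open(output_name, O_CREAT | O_RDWR | O_TRUNC, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);

    // open returns -1 on error, so stop before reading or writing
    if (fd < 0 || fd2 < 0)
    {
        printf("Not possible to open the input or the output file\n");
        exit(1);
    }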

    printf("Reading and decrypting: %s\n", file_name);
    printf("Writing decrypted content to: %s\n", output_name);

    if (read(fd, IV, 0x20) != 0x20)
    {
        printf("Not possible to read IV of the encryption\n");
        exit(1);
    }

    printf("Using next IV: [");

    /*
     * The first 0x20 bytes are the "IV" of
     * the encryption/decryption algorithm
     * and are stored at the beginning of
     * the file
     */
    for (i = 0; i < 0x20; i++)
        printf(" %02X ", IV[i]);

    printf("]\n");

    i = 0;

    while (read(fd, &c, 1) == 1)
    {
        //printf("Read: %d(%c)\t", c, c);
        aux = c ^ IV[i];
        //printf("Decrypted: %d(%c)\n", aux, aux);
        write(fd2, &aux, 1);
        /*
         * For the next rounds, we have to
         * update the IV, with previous encrypted
         * characters, as will be used together with
         * the encrypted char to decrypt the byte
         */

        IV[i] = c;
        i++;
        i &= 0x1f;
    }

    close(fd);
    close(fd2);

    return 0;

}
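
As a cross-check of decryptFile, here is a more compact sketch that assumes the whole encrypted file is already loaded in memory. Because the file is the IV followed by the ciphertext, and each IV slot is replaced by the ciphertext byte that used it, the key byte for file offset j + 32 is simply the file byte at offset j. The name `decrypt_buffer` and its signature are mine, not part of the challenge.

```c
#include <stddef.h>
#include <stdint.h>

#define IV_LEN 32  /* the first 0x20 bytes of the file are the IV */

/* Decrypts a whole encrypted file held in memory.
 * `file` is the IV followed by the ciphertext; `out` receives
 * file_len - IV_LEN plaintext bytes. Returns that count, or 0 when
 * the buffer is too short to contain the IV. */
size_t decrypt_buffer(const uint8_t *file, size_t file_len, uint8_t *out)
{
    if (file_len < IV_LEN)
        return 0;

    size_t n = file_len - IV_LEN;
    for (size_t j = 0; j < n; j++) {
        /* Ciphertext byte XOR the byte stored 32 positions before it. */
        out[j] = file[j + IV_LEN] ^ file[j];
    }
    return n;
}
```

Running both this function and decryptFile on the same encrypted file should give identical output, which makes it a handy sanity test for the byte-by-byte version.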

License and flag (MORE SPOILERS!!!!!!!!!!!!!)

License from the first part: CTF{rans0m_w4re}

Flag: CTF{C4NT_BE_BROKEN}

https://pasteboard.co/IWGUNZP.jpg
