- The aim of this assignment is to write an emulator for the ARM instruction set and to implement an algorithm using this instruction set.
- An emulator aims to replicate the functionality of a real architecture. In this case, we want an executable that takes assembled programs as input and steps through them, updating the relevant registers and memory.
- This is an exercise in applied programming: you will need to understand the concepts of processor design and the hardware/software interface, and use that understanding to construct your emulator. You should do all implementation in C, and your code should compile cleanly and be tested using the default version of GCC available in the lab before submission. Any error messages should be written to stderr. The only output should be that specified within the assignment description; you should not include, for example, debugging code that prints anything else to stdout or stderr.
- Remember that the general aim of these assignments is an understanding of the theory rather than just a programming exercise. Where there is some debate about the right way to proceed, the coursework generally demands you make an informed design decision and back it up with a reasonable argument described in your marksheet.
- Tasks
- Roughly what you have to do is:
- Write an emulator for the ARM machine. This should have a range of features to show the inner workings of the process of executing an assembly program.
- Write a bubble sort program in ARM assembly. The start of this file is provided; you have to fill in the blanks.
- Emulator and ARM
- Remember the ARM has:
- 16 registers, mostly general purpose, but with
- r13 = stack pointer, sp
- r14 = link register, lr
- r15 = program counter, pc
- These registers are 32 bits in size. Instruction encodings are given in the ARM ARM.
- Your emulator should internally represent these registers as well as the architecture's other key components, namely main memory and the ALU.
- This emulator, like most emulators, is not clever, in the sense that it will execute any code presented to it. The code is therefore assumed to be correct, and there is no OS supervising anything. You should, however, use the memory layout that we have discussed in class. This implies that you have, e.g., a structure emulating main memory, dynamically divided into code, data, heap and stack sections.
- Remember that the PC is incremented by 4 before the current instruction is executed.
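- As a rough illustration of the points above, the machine state and the fetch step might be sketched as follows. The names `Machine`, `MEM_WORDS` and `fetch` are hypothetical, the memory size is an assumption (the brief does not fix one), and memory is held as an array of words purely to keep the sketch short.

```c
#include <stdint.h>

#define MEM_WORDS 16384u /* assumed size: 64 KiB of emulated memory */

enum { REG_SP = 13, REG_LR = 14, REG_PC = 15 };

typedef struct {
    uint32_t r[16];          /* r0-r15, 32 bits each */
    uint32_t mem[MEM_WORDS]; /* main memory, one entry per 32-bit word */
} Machine;

/* Fetch the word addressed by the PC, then advance the PC by 4 so that
   it has already been incremented when the fetched instruction executes. */
uint32_t fetch(Machine *m)
{
    uint32_t instr = m->mem[(m->r[REG_PC] / 4u) % MEM_WORDS];
    m->r[REG_PC] += 4u;
    return instr;
}
```

- A design decision you will still need to make, and justify in your marksheet, is how the code, data, heap and stack sections are laid out within this memory.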
- Instructions to be supported
- The ARM instruction set is large, and you are not expected to implement all the instructions in the ISA. You are to implement the following subset:
- MOV, ADD, SUB, MUL, MLA, CMP, AND, EOR, ORR, BIC, B, BL, LDR, STR, SWI
- Also, implement the following pseudo-instruction, which allocates space for the data value which follows it: DCD
- Remember that, in the ARM, each of the above instructions can take an optional shift (only LSL and LSR need to be supported), and may involve registers or immediate values (constants).
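- To make the encoding concrete, here is a minimal sketch of how the fields of a data-processing instruction could be extracted. The bit positions and opcode numbers are those of the standard ARM data-processing format; the `DataProc` type and `decode_data_proc` function are hypothetical names chosen for illustration.

```c
#include <stdint.h>

/* Opcode numbers (bits 24-21) for the required data-processing subset. */
enum {
    OP_AND = 0x0, OP_EOR = 0x1, OP_SUB = 0x2, OP_ADD = 0x4,
    OP_CMP = 0xA, OP_ORR = 0xC, OP_MOV = 0xD, OP_BIC = 0xE
};

typedef struct {
    unsigned cond;   /* bits 31-28: condition code            */
    unsigned imm;    /* bit 25: operand 2 is an immediate     */
    unsigned opcode; /* bits 24-21: operation                 */
    unsigned s;      /* bit 20: set condition flags           */
    unsigned rn;     /* bits 19-16: first operand register    */
    unsigned rd;     /* bits 15-12: destination register      */
    unsigned op2;    /* bits 11-0: shifter operand (operand 2) */
} DataProc;

/* Split a 32-bit data-processing instruction into its fields. */
DataProc decode_data_proc(uint32_t instr)
{
    DataProc d;
    d.cond   = (instr >> 28) & 0xFu;
    d.imm    = (instr >> 25) & 0x1u;
    d.opcode = (instr >> 21) & 0xFu;
    d.s      = (instr >> 20) & 0x1u;
    d.rn     = (instr >> 16) & 0xFu;
    d.rd     = (instr >> 12) & 0xFu;
    d.op2    = instr & 0xFFFu;
    return d;
}
```

- For example, decoding E3A00001 gives opcode 0xD (MOV), Rd = 0 and an immediate operand of 1.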
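- The shift and immediate handling could be sketched along these lines. This assumes the standard 12-bit operand 2 layout (an 8-bit immediate rotated right by twice a 4-bit amount, or a register shifted by a 5-bit constant) and covers only LSL and LSR by a constant; shifts by a register amount are left out. The function names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits. */
static uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Evaluate the 12-bit operand 2 field of a data-processing instruction.
   is_imm is bit 25 of the instruction; regs is the register file. */
uint32_t operand2(uint32_t op2, int is_imm, const uint32_t regs[16])
{
    if (is_imm) {
        uint32_t imm8 = op2 & 0xFFu;
        unsigned rot  = ((op2 >> 8) & 0xFu) * 2u;
        return ror32(imm8, rot);
    }

    uint32_t rm     = regs[op2 & 0xFu];
    unsigned amount = (op2 >> 7) & 0x1Fu;
    unsigned type   = (op2 >> 5) & 0x3u;

    switch (type) {
    case 0:  /* LSL #amount; an amount of 0 means no shift */
        return rm << amount;
    case 1:  /* LSR #amount; an encoded amount of 0 means LSR #32 */
        return amount ? rm >> amount : 0u;
    default: /* ASR and ROR are not required by the brief */
        return rm;
    }
}
```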
- Implement the following condition codes: EQ, NE, PL, MI, GE, LT, GT, LE
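- A minimal sketch of the condition check, using the standard ARM condition encodings in bits 31-28 and the N, Z, C and V flags. The `Flags` type and `condition_passes` function are hypothetical names; the code AL (always, 0xE) is included because unconditional instructions use it.

```c
#include <stdbool.h>

typedef struct {
    bool n; /* negative */
    bool z; /* zero     */
    bool c; /* carry    */
    bool v; /* overflow */
} Flags;

/* Decide whether an instruction with the given 4-bit condition field
   should execute under the current flags. */
bool condition_passes(unsigned cond, Flags f)
{
    switch (cond & 0xFu) {
    case 0x0: return f.z;                  /* EQ */
    case 0x1: return !f.z;                 /* NE */
    case 0x4: return f.n;                  /* MI */
    case 0x5: return !f.n;                 /* PL */
    case 0xA: return f.n == f.v;           /* GE */
    case 0xB: return f.n != f.v;           /* LT */
    case 0xC: return !f.z && f.n == f.v;   /* GT */
    case 0xD: return f.z || f.n != f.v;    /* LE */
    case 0xE: return true;                 /* AL */
    default:  return false; /* outside the required subset: skipped in this sketch */
    }
}
```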
- Implement the 'S' ("set condition flags") suffix
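- For the arithmetic instructions, flag setting might look like the sketch below. It follows the usual ARM rules (for subtraction, C is set when no borrow occurs); the helper names are hypothetical. Logical instructions such as AND or MOV would update only N and Z from the result, plus C from the shifter, which is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool n, z, c, v;
} Flags;

/* a + b, updating all four flags as ADDS would. */
uint32_t add_with_flags(uint32_t a, uint32_t b, Flags *f)
{
    uint32_t r = a + b;
    f->n = (r >> 31) != 0u;
    f->z = (r == 0u);
    f->c = (r < a);                             /* unsigned carry out */
    f->v = ((~(a ^ b) & (a ^ r)) >> 31) != 0u;  /* signed overflow    */
    return r;
}

/* a - b, updating all four flags as SUBS or CMP would. */
uint32_t sub_with_flags(uint32_t a, uint32_t b, Flags *f)
{
    uint32_t r = a - b;
    f->n = (r >> 31) != 0u;
    f->z = (r == 0u);
    f->c = (a >= b);                            /* set when there is no borrow */
    f->v = (((a ^ b) & (a ^ r)) >> 31) != 0u;   /* signed overflow             */
    return r;
}
```

- CMP can reuse `sub_with_flags` and simply discard the returned result.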
- Little-endian format only
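- If you model memory as an array of bytes, little-endian word access can be sketched as below: the least significant byte is stored at the lowest address. The function names are hypothetical, and bounds checking is left to the caller in this sketch.

```c
#include <stdint.h>

/* Store a 32-bit word at byte address addr, least significant byte first. */
void store_word(uint8_t *mem, uint32_t addr, uint32_t value)
{
    mem[addr]     = (uint8_t)(value & 0xFFu);
    mem[addr + 1] = (uint8_t)((value >> 8) & 0xFFu);
    mem[addr + 2] = (uint8_t)((value >> 16) & 0xFFu);
    mem[addr + 3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Load a 32-bit word from byte address addr, least significant byte first. */
uint32_t load_word(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}
```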
- Emulator features
- Depending on the stages you complete, your emulator should have the following features:
- "-trace" -> show instruction trace
- "-before" -> show memory dump before execution
- "-after" -> show memory dump after execution
- more on these is described below under "Output format"
- Debug interface
- Since the ARM has a lot of registers, it would be very cluttered if you displayed all registers at every step during normal execution. The ARM ISA, however, supports debugging via the SVC instruction (the current name for SWI, listed above). You should configure your emulator so that, if it encounters an SVC instruction, it performs one of the following actions:
- SVC 0 -- Stop execution
- SVC 1 -- Print out all registers and their values on the console
- SVC 2 -- Print out the value in R0, followed by a carriage return (or new line)
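- The three actions above could be dispatched roughly as follows. The SVC number is taken from the low 24 bits of the instruction, as in the standard SWI/SVC encoding. The function names are hypothetical, and printing R0 as an unsigned decimal is an assumption: the brief does not say whether decimal, signed or hex is wanted.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* True if bits 27-24 are 1111, i.e. the instruction is an SWI/SVC. */
bool is_svc(uint32_t instr)
{
    return ((instr >> 24) & 0xFu) == 0xFu;
}

/* Perform the debug action selected by the SVC number.
   Returns false when the emulator should stop, true otherwise. */
bool handle_svc(uint32_t instr, const uint32_t regs[16], FILE *out)
{
    switch (instr & 0x00FFFFFFu) {
    case 0: /* stop execution */
        return false;
    case 1: /* print all registers, eight per line */
        for (int i = 0; i < 16; i++)
            fprintf(out, "R%d=%08" PRIX32 "%c", i, regs[i],
                    (i % 8 == 7) ? '\n' : ' ');
        return true;
    case 2: /* print R0 (assumed format: unsigned decimal) */
        fprintf(out, "%" PRIu32 "\n", regs[0]);
        return true;
    default: /* other numbers are ignored in this sketch */
        return true;
    }
}
```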
- Input format
- The input to the emulator is a human-readable (ASCII) file, normally with the extension ".emu", in which each line describes one 32-bit word. Note that you should be able to create, by hand, files encoding ARM instructions. Whichever way you choose, you need files with which to test your emulator.
- Each line contains two fields: 8 hex characters which encode the address, followed by a space, then 8 hex characters which encode the instruction (or data value) stored at that address.
- Example Input
- Your system will be fed input in the following format
- 00000000 E3A00001
- 00000004 E2800001
- 00000008 00000042
- 0000000C 00000000
- 00000010 EAFFFFFB
- For reference, this maps to the following instruction sequence (you will not get this additional information in your test code)
- ADDRESS INSTRUCTION MNEMONIC MEANING
- ---------------------------------------------------
- 00000000 E3A00001 MOV r0, #1
- 00000004 E2800001 ADD r0, #1
- 00000008 00000042 DCD 0x42
- 0000000C 00000000 DCD 0
- 00000010 EAFFFFFB B FooLabel
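- One way to read a line of this format is sketched below. It accepts exactly 8 hex digits, a single space and 8 more hex digits, optionally followed by a line ending, and rejects everything else. The function names are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

/* Parse exactly 8 hex digits starting at s into *out. */
static bool parse_hex8(const char *s, uint32_t *out)
{
    uint32_t v = 0;
    for (int i = 0; i < 8; i++) {
        unsigned char c = (unsigned char)s[i];
        if (!isxdigit(c))
            return false; /* also stops safely at the end of the string */
        v = v * 16u + (uint32_t)(isdigit(c) ? c - '0' : toupper(c) - 'A' + 10);
    }
    *out = v;
    return true;
}

/* Parse one "AAAAAAAA IIIIIIII" line into an address and a 32-bit word.
   Returns false if the line is malformed. */
bool parse_emu_line(const char *line, uint32_t *addr, uint32_t *word)
{
    if (!parse_hex8(line, addr))
        return false;
    if (line[8] != ' ')
        return false;
    if (!parse_hex8(line + 9, word))
        return false;

    /* Only a trailing CR and/or LF may follow the second field. */
    const char *rest = line + 17;
    while (*rest == '\r' || *rest == '\n')
        rest++;
    return *rest == '\0';
}
```

- A `false` return is the natural place to report an error on stderr for lines that do not contain exactly 32 bits in each field.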
- Automated assembly
- You can use the official ARM assembler to produce your own .emu files from a .s ARM assembly input file.
- Log on to snowy and run the following command in a directory with a file you wish to compile
- /home/staff/simon/COMS12600/emulator/arm-compile-emu <filename>
- Replace <filename> with the filename of your .s assembly file, MINUS the '.s' suffix. You will then get a '.emu' file, compiled as in the example above, plus a '.list' file, which shows a more detailed debug listing, including mnemonics. You can use this latter file to help your understanding and debug process.
- Output format
- The output of the emulator is -to the screen only-.
- As an example of how the emulator may be executed, if you type on the command line
- >./emu -trace test.emu
- you should see something like this:
- R0=00000000 R1=00000001 R2=00000002 R3=00000000 R4=00000000 R5=00000042 R6=00000000 R7=00000000
- R8=00000000 R9=00000001 R10=00000000 R11=00000000 R12=00000000 SP=00000042 LR=00000000 PC=00000000
- Next Instruction=ADD R0, R1, R2
- R0=00000003 R1=00000001 R2=00000002 R3=00000000 R4=00000000 R5=00000042 R6=00000000 R7=00000000
- R8=00000000 R9=00000001 R10=00000000 R11=00000000 R12=00000000 SP=00000042 LR=00000000 PC=00000000
- Next Instruction=MOV R4, R0
- Next Instruction=MOV R4, R0
- and so on...
- The above is the output of the emulation process line by line (the "-trace" option). You can see the values of the different registers and which instruction is being executed. Naturally, depending on the program you test, you will see different numbers.
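- A register dump in the style of the sample trace could be produced with something like the following sketch. It formats into a caller-supplied buffer so that the result can be printed or checked; the function name and the choice of upper-case hex digits are assumptions.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format the 16 registers as two lines of eight, each as NAME=XXXXXXXX.
   Returns the number of characters written, or -1 if buf is too small. */
int format_registers(char *buf, size_t size, const uint32_t regs[16])
{
    static const char *const names[16] = {
        "R0", "R1", "R2",  "R3",  "R4",  "R5", "R6", "R7",
        "R8", "R9", "R10", "R11", "R12", "SP", "LR", "PC"
    };
    size_t used = 0;

    for (int i = 0; i < 16; i++) {
        int n = snprintf(buf + used, size - used, "%s=%08" PRIX32 "%c",
                         names[i], regs[i], (i % 8 == 7) ? '\n' : ' ');
        if (n < 0 || (size_t)n >= size - used)
            return -1; /* output would not fit */
        used += (size_t)n;
    }
    return (int)used;
}
```

- Calling this once per executed instruction and printing the buffer, followed by a "Next Instruction=" line, gives output of the shape shown above.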
- The bubble sort program
- This part of the task is about using the ARM instruction set to implement an algorithm. In this case, we want to implement bubblesort.
- This will round off your experience of using an example assembly language: if you complete this stage you will have gone from writing assembly, to translating it to machine code, to seeing it run on your own emulator.
- Here is the assembly file giving the outline of the bubble sort program. You MUST use this outline if you want to receive full marks for the relevant stage. If you inspect the program you will see that the numbers to sort are provided as data. However, if you want to try other numbers, e.g. smaller numbers, you can do so by replacing the data section with your own numbers. If you do this you can still get good marks, but probably not the full marks available when using the provided data.
- You may want to start by seeing a C or pseudocode implementation of bubblesort and then translate that into assembly using SIMPLE instructions.
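- As a starting point, a plain C bubble sort is sketched below. It is deliberately written with simple loops, comparisons and loads/stores so that each line maps onto a few ARM instructions (LDR, CMP, STR and conditional branches). It is not the provided outline, only a reference to translate from.

```c
#include <stddef.h>
#include <stdint.h>

/* Sort n signed 32-bit values into ascending order. */
void bubble_sort(int32_t *a, size_t n)
{
    if (n < 2)
        return;

    /* After each pass the largest remaining value has bubbled to the end,
       so the inner loop can stop one element earlier every time. */
    for (size_t pass = 0; pass < n - 1; pass++) {
        for (size_t i = 0; i + 1 < n - pass; i++) {
            if (a[i] > a[i + 1]) {
                int32_t tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
}
```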
- Notes
- About emulator's I/O:
- 1) Input: a file in text mode with extension ".emu" containing the output of the assembly code (as described above, one 32-bit word per line encoded in hex)
- 2) Output: the output is to the screen and depends on the option selected
- Perhaps the part of most concern from an implementation point of view is the ALU, as it is the heart of the architecture. I certainly do not expect you to emulate the ALU at a digital-logic level(!). You are therefore free to implement the ALU and its operations using standard C math functions. Just bear in mind the definition of each instruction as given in the instruction set.
- Numbers are in 2s complement and therefore your emulated ALU should be aware of this.
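- One way to respect this in C is to hold register values as uint32_t, whose arithmetic wraps modulo 2^32 and therefore produces the correct 2s-complement bit pattern, and to convert to a signed value only when one is needed (for example, for printing). The sketch below assumes that approach; `as_signed` and `alu_mla` are hypothetical helper names.

```c
#include <stdint.h>

/* Reinterpret a 32-bit pattern as a 2s-complement signed value without
   relying on implementation-defined conversions. */
int32_t as_signed(uint32_t x)
{
    if (x <= (uint32_t)INT32_MAX)
        return (int32_t)x;
    return (int32_t)(x - 0x80000000u) + INT32_MIN;
}

/* MLA: Rm * Rs + Rn, keeping only the low 32 bits of the result.
   The low 32 bits are the same for signed and unsigned operands. */
uint32_t alu_mla(uint32_t rm, uint32_t rs, uint32_t rn)
{
    return rm * rs + rn;
}
```

- For instance, 3 - 5 computed on uint32_t values gives 0xFFFFFFFE, which `as_signed` reads back as -2.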
- Reference
- You may find the following documents useful:
- The ARM ARM
- ARM instruction set reference
- ARM Quick reference card
- The Stages
- The assignment is presented in stages, with marks being allocated for how many stages you successfully complete and the general quality of your work. You shouldn't necessarily expect to complete all the stages; part of the assessment is how far you manage to get in the available time. Remember that you may be able to attempt some of the later stages without completing the earlier ones.
- Stage 1
- Write an emulator that is able to read a ".emu" file in the format described above and detect incorrect programs, such as those with more or fewer than 32 bits per line or with unrecognized instructions (e.g. unknown opcodes). The program should be written so that it is executed using a command similar to
- ./emu [option(s)] input.emu
- At this stage, you should be capable of demonstrating the capabilities of your emulator by producing some simple input programs and displaying the output from your emulator as it executes these.
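- Handling a command line of this shape could be sketched as follows. The option names are the ones listed under "Emulator features"; the `Options` type and `parse_args` function are hypothetical, and rejecting unknown options or a second file name is an assumption about the desired behaviour.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    bool trace;       /* -trace was given  */
    bool before;      /* -before was given */
    bool after;       /* -after was given  */
    const char *path; /* the input .emu file */
} Options;

/* Fill in *opt from the command line. Returns false if an option is not
   recognized, or if there is not exactly one input file. */
bool parse_args(int argc, char **argv, Options *opt)
{
    opt->trace = opt->before = opt->after = false;
    opt->path = NULL;

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-trace") == 0)
            opt->trace = true;
        else if (strcmp(argv[i], "-before") == 0)
            opt->before = true;
        else if (strcmp(argv[i], "-after") == 0)
            opt->after = true;
        else if (argv[i][0] == '-')
            return false;           /* unknown option */
        else if (opt->path != NULL)
            return false;           /* more than one input file */
        else
            opt->path = argv[i];
    }
    return opt->path != NULL;
}
```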
- Stage 2
- Implement the option "-trace", which should show how the program is executed and how the different registers are affected as the emulator goes through each instruction.
- Stage 3
- Produce memory dumps by implementing options "-before" and "-after".
- For each location in memory that holds a valid instruction or data value, print it out in the following format: <address>, <value (in hex)>
- e.g.
- 0x00000000 0x12345678
- 0x00000004 0x23ac2000
- and so on
- This should be done in a way that only things that have changed are shown rather than the entire state of memory.
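- One possible reading of this requirement is to compare memory against a reference snapshot and print only the words that differ. The sketch below assumes that reading, uses word-indexed arrays, and follows the two-column layout of the example lines above; the function name is hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Print every word that differs between two memory images, one per line
   as "0xADDRESS 0xVALUE", and return the number of lines printed. */
size_t dump_changes(FILE *out, const uint32_t *before,
                    const uint32_t *after, size_t words)
{
    size_t printed = 0;
    for (size_t i = 0; i < words; i++) {
        if (before[i] != after[i]) {
            fprintf(out, "0x%08" PRIx32 " 0x%08" PRIx32 "\n",
                    (uint32_t)(i * 4u), after[i]);
            printed++;
        }
    }
    return printed;
}
```

- For "-before", the reference snapshot could be the zero-initialised memory prior to loading; for "-after", the image taken just before execution starts. Whichever interpretation you choose, justify it in your marksheet.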
- Stage 4
- Implement a working bubblesort algorithm in ARM assembly. You are given the starting point and input data values for this algorithm and you may replace the data section to use your own numbers, but full marks for this stage are only available if you use the provided data. Use the "SVC 2" debug instruction to print out each element of the array in turn once sorted, one per line.
- Files
- In addition to these stages you should provide evidence of your emulator working in ".emu" files as described below.
- The Submission
- You should submit under the Emulator entry ("Emu") using the online submission system. With your submission you must include a completed marksheet; this will make the marking process quicker and more accurate. Feel free to record any notes or comments you think are important in this file, in particular how to compile and run your submission and any justification of decisions you made during your work.
- You should submit the following files separately: DO NOT .ZIP YOUR FILES!
- The source code for your emulator. The main (and possibly only) file should be called 'emu.c'.
- Your program MUST either be compilable with
- gcc emu.c -o emu
- or, if it is more complicated, you should submit a Makefile which contains the correct rules.
- Assembled test files you used to verify your emulator. These should be called error01.emu or test01.emu and so on.
- Proof files with e.g. copy-pasted screen output from your tests, i.e. the trace or memory dumps.
- These should be named test01.out and so on.
- Your bubblesort file and evidence of it working. These should be called bubble.s, bubble.emu and bubble.out
- The marksheet file.
- Failure to comply exactly with the guidelines above is likely to result in a loss of marks.
- Marking
- Your submitted code will be marked in accordance with the supplied marking scheme. Tests will be run on your code, which will examine whether it behaves correctly when presented with programs that will stress the various aspects of the ARM architecture. Therefore, remember that the total marks from the stages you submit for marking represent the upper bound of what you can achieve. If any errors in the correctness of your emulator are discovered during the testing of your emulator, you will have your marks reduced accordingly. Therefore, play safe and submit stages worth significantly more than the 40% pass mark for the course!

