Problem Set 6 Harvard Extension School CSCI E-95: Compiler Design and Implementation Fall 2023 Due: November 26, 2023 at Midnight ET (Eastern Time) 1. (100 Points) Based on all the work you have completed so far, your lexer from Problem Set 1, your parser from Problem Set 2, your symbol table management system from Problem Set 3, the type checker from Problem Set 4, and the intermediate code generator from Problem Set 5, develop code to traverse the intermediate code doubly-linked list and generate MIPS assembly language acceptable to SPIM. The input to Problem Set 6 is the doubly-linked list of intermediate code (IR) in the three-address quadruple representation we discussed in class. The output is a file containing MIPS assembly language statements as accepted by the SPIM interpreter which is available on our cscie95.dce.harvard.edu instance and also downloadable over the Web (see SPIM: A MIPS32 Simulator at either http://spimsimulator.sourceforge.net/ or http://pages.cs.wisc.edu/~larus/spim.html). The generation of MIPS assembly language should be performed in a straight-forward manner. Generate good code, but do not attempt to perform any optimization. Each IR node should generate assembly statements -- do not attempt to generate code based on a sequence of IR nodes. As detailed in the Intermediate Representation slides, use labels in the MIPS code you generate -- the labels may be used as targets of branches (in which case they are either labels from the source program with a prefix of _UserLabel_functionName_ or automatically generated by the code generation module of your compiler using the naming convention described in the slides), the labels may be used for function names and static global variables (in which case they should be the same as the names declared in the source file being compiled except with a prefix of _Global_), or the labels may be used to refer to literal strings using a prefix of _StringLabel_ applied to an integer designator. The first function to be called must be named "main" and that name should not be changed in any way. Code generation for calling functions and returning from functions and for access to local variables should use the stack frames and calling conventions that we discussed in class. Local variables should be allocated on the stack and accessed through the frame pointer, register $fp ($30). Follow the protocol that we discussed in class for passing parameters and returning results from functions. In order to be able to access local variables, your compiler will need to traverse the symbol tables for each function and assign offsets for all local variables. You will also need to compute the offset in each function's stack frame where the old frame pointer and return address will be stored. Remember to share memory in the stack frame for variables defined in nested blocks. Also, for each function, keep track of the size of the required stack frame. When you first attempt to generate code, try to compile only very small programs using the temporaries in the IR nodes as actual registers. Obviously, this will work only when there are no more temporaries than actual registers. Also, at first, only use global variables. After this is working, write code to reset the register generated for temporaries at the start of each language construct across which temporaries are not used (for example, this might be at the start of each statement). After successfully testing your compiler with resetting the temporary assignment to registers, start to implement local, automatic variables. The next step is to work on the ability to call user functions. Start with functions that take no parameters and return no value and then work up from there. Finally, you may attempt to write a module to perform register allocation based on the graph coloring algorithm discussed in class and in the textbook (Aho, Lam, Sethi, Ullman on pages 556-557). You should incorporate built-in pre-defined "functions" that map directly to the following system calls provided by SPIM: print_int, print_string, read_int, read_string, and exit -- these are described in the MIPS Assembly Language slides. These would be declared as: void syscall_print_int(int integer); void syscall_print_string(char *string); int syscall_read_int(void); void syscall_read_string(char *buffer, int length); void syscall_exit(void); Of course, as shown in the factorial example on the class web site, these special pre-defined "functions" will generate appropriate "syscall" MIPS instruction sequences rather than jump-and-link sequences. You can either have the declarations for these "functions" pre-exist in the global/file scope symbol table before the user's program is read, or you can require the user to include an appropriate declaration for each pre-defined "function" before they invoke that "function." Your problem set solution should include the source program to your compiler, the output from the previous compiler stages, a printout of the MIPS code produced by this code generation component, and a transcript of running the MIPS code under SPIM. As always, the solution that you turn in should include a demonstration that your program works and has been tested. Thus, the test programs you use as input and the output that your compiler produces should be included with your other files. Last revised 4-Sep-23