Problem Set 6

		 Harvard Extension School CSCI E-95:
		  Compiler Design and Implementation

			      Fall 2023

	 Due: November 26, 2023 at Midnight ET (Eastern Time)

1. (100 Points) Based on all the work you have completed so far, your
lexer from Problem Set 1, your parser from Problem Set 2, your symbol
table management system from Problem Set 3, the type checker from
Problem Set 4, and the intermediate code generator from Problem Set 5,
develop code to traverse the intermediate code doubly-linked list and
generate MIPS assembly language acceptable to SPIM.

The input to Problem Set 6 is the doubly-linked list of intermediate
code (IR) in the three-address quadruple representation we discussed
in class.  The output is a file containing MIPS assembly language
statements as accepted by the SPIM interpreter which is available on
our cscie95.dce.harvard.edu instance and also downloadable over the
Web (see SPIM: A MIPS32 Simulator at either
http://spimsimulator.sourceforge.net/ or
http://pages.cs.wisc.edu/~larus/spim.html).  The generation of MIPS
assembly language should be performed in a straight-forward manner.
Generate good code, but do not attempt to perform any optimization.
Each IR node should generate assembly statements -- do not attempt to
generate code based on a sequence of IR nodes.  As detailed in the
Intermediate Representation slides, use labels in the MIPS code you
generate -- the labels may be used as targets of branches (in which
case they are either labels from the source program with a prefix of
_UserLabel_functionName_ or automatically generated by the code
generation module of your compiler using the naming convention
described in the slides), the labels may be used for function names
and static global variables (in which case they should be the same as
the names declared in the source file being compiled except with a
prefix of _Global_), or the labels may be used to refer to literal
strings using a prefix of _StringLabel_ applied to an integer
designator.  The first function to be called must be named "main" and
that name should not be changed in any way.

Code generation for calling functions and returning from functions and
for access to local variables should use the stack frames and calling
conventions that we discussed in class.  Local variables should be
allocated on the stack and accessed through the frame pointer,
register $fp ($30).  Follow the protocol that we discussed in class
for passing parameters and returning results from functions.  In order
to be able to access local variables, your compiler will need to
traverse the symbol tables for each function and assign offsets for
all local variables.  You will also need to compute the offset in each
function's stack frame where the old frame pointer and return address
will be stored.  Remember to share memory in the stack frame for
variables defined in nested blocks.  Also, for each function, keep
track of the size of the required stack frame.

When you first attempt to generate code, try to compile only very
small programs using the temporaries in the IR nodes as actual
registers.  Obviously, this will work only when there are no more
temporaries than actual registers.  Also, at first, only use global
variables.  After this is working, write code to reset the register
generated for temporaries at the start of each language construct
across which temporaries are not used (for example, this might be at
the start of each statement).  After successfully testing your
compiler with resetting the temporary assignment to registers, start
to implement local, automatic variables.  The next step is to work on
the ability to call user functions.  Start with functions that take no
parameters and return no value and then work up from there.  Finally,
you may attempt to write a module to perform register allocation based
on the graph coloring algorithm discussed in class and in the textbook
(Aho, Lam, Sethi, Ullman on pages 556-557).

You should incorporate built-in pre-defined "functions" that map
directly to the following system calls provided by SPIM: print_int,
print_string, read_int, read_string, and exit -- these are described
in the MIPS Assembly Language slides.  These would be declared as:

void syscall_print_int(int integer);
void syscall_print_string(char *string);
int syscall_read_int(void);
void syscall_read_string(char *buffer, int length);
void syscall_exit(void);

Of course, as shown in the factorial example on the class web site,
these special pre-defined "functions" will generate appropriate
"syscall" MIPS instruction sequences rather than jump-and-link
sequences.  You can either have the declarations for these "functions"
pre-exist in the global/file scope symbol table before the user's
program is read, or you can require the user to include an appropriate
declaration for each pre-defined "function" before they invoke that
"function."

Your problem set solution should include the source program to your
compiler, the output from the previous compiler stages, a printout of
the MIPS code produced by this code generation component, and a
transcript of running the MIPS code under SPIM.

As always, the solution that you turn in should include a
demonstration that your program works and has been tested.  Thus, the
test programs you use as input and the output that your compiler
produces should be included with your other files.


			Last revised 4-Sep-23