Programming Assignment

>>Okay, so, this assignment,
we will not be generating code. So, we will be writing
a parser and a scanner for a very simplified version of the C language
that we call BabyC. So, let’s look at the
MakeFile maybe first. Okay. So, we will be —
there will be multiple files. You know, there will be the, you
know, the LEX file, BabyC.LEX. This will have the
regular expressions that define the microsyntax
of the language, and there will be BabyC.Y
and that will have — it will have the grammars and the embedded
actions for the language. So, BabyC.Y will be processed
by Bison, the parser generator. So, it will process BabyC.Y,
and it will generate C code. That C code will be in
this file, BabyC.tab.c. So, will generate the
C code for you, and then you compile the
C code using a C compiler, which is GCC in this case. Okay? So, this is the
MakeFile that does, you know, that does this for you. So, you don’t have to worry
about this but to understand that we have multiple files. And similarly, for the LEX
file, it’s processed by flex. Flex processes the
file that has LEX that has the regular
expressions. So, this is the file that
has the regular expressions to look like this. So, you’ll just put
regular expressions. By the way, I put
everything in there for you. So, you only need to — the
syntax is weird, you know, when you look at
it the first time. It’s — to look unfamiliar,
but basically, you know, I did most of the stuff for you. So, these are the regular
expressions, and in fact, you know, for the, you know,
these symbols, these symbols in the language like
comma, semicolon, and the plus and minus. The regular expression for
this is the symbol itself. So, the regular expression
for comma is comma, it just returns a comma. The only two constructs
that require an actual, regular expression are
the identifier and number, and this is something
that we did in class. I just want you to write them. So, you just write, according
to the description, you know, in the problem description. So, you write a regular
expression for identifier as described in the
problem statement, and a regular expression
for number. And these are the
actions that will happen. So, for an identifier,
it’s going to return ident. You know, so, this is the
return value of the scanner. The scanner is returning
this value to the parser. So, the scanner returns
identifier. It has just, you know,
recognized an identifier based on the regular expression here. And it will have to store
the actual lexeme somewhere. So, this is where we, you
know, store the actual lexeme in this data structure. This data structure
is the data structure that parser will have access to. So, for the number, we
return — do the same thing. We return a number,
and we store the number in this data structure. So, I’m doing all the work for
you here, but you just need to write the regular
expression for identifier, regular expression for number. Now this returns its
values for the parser. So, let’s go back to
the MakeFile again. So, we processed the
scanner with Flex. We process the parser
with Bison. Then we have C files. We compile the C files, and we’ll have the object
files, and we link them. So, the one that comes from
the scanner, the one that comes from the LEXer, and
there is a driver that I’m providing for you. And there is a file that
has your actual code, which is your code .C. So,
let me show you driver now. So, the driver is going
to look very simple. So, it’s going to open the file. Then it’s going to call YY Pass. This calls the parser. Now who calls the scanner? The parser does. So, the parser calls the
scanner to get the next token. You know, remember
that getnext word in the pseudocode that we wrote? So, the parser will
call the scanner to get the next token,
or the next word. So, we just call the
parser, and that’s it. And this is all what
we need to do. Then we need to add
something to generate code to process the abstract
syntax tree, but this is common
to the out now. I don’t want you
to work on this. I want you to focus
on the parser and on building the abstract
syntax tree in this assignment. So, most of the work is going
to be in the parser file. So, the parser file
will have the grammar. So, it will have the
grammar for the language. Some of this will look familiar. So, these are grammars, and
these are the embedded actions. Okay, so, for example, here
I put an embedded action for recognizing a declaration. A declaration is an, you
know, integer identifier. So, integer, you know,
this is between quotation, because this is a
string recognized as is, and identifier is what? It’s something that will
come from a scanner. So, this is something that
comes from the scanner. Now when you recognize
a declaration, you need to add a
declaration to write some code to handle this declaration. And here, I’m putting a print
statement processing declaration of this variable. Now what do you think the code for processing a
declaration should do? What’s the code? So, there is an add
declaration function that you will need to write. What should you put in there?>>Insert it into
the symbol table?>>Exactly. Insert it into the symbol
table under one condition.>>It’s not already in there.>>Yeah, if it’s not
already in there. If it’s already in
there, which we do?>>Print error.>>Yeah, print error,
because we are trying to declare a variable
twice, okay? So, if it’s, well, the
programmer is trying to declare a variable twice. So, you check on it
in the symbol table. If it’s in there, you
printed error message. It’s already declared. You cannot declare
it more than once. If it’s not declared, then
you add to the symbol table. And, you know, similarly,
for the — okay, for — I’m only showing statement. I’m not showing the
entire grammar, because I want you
to write the grammar. So, basically, the grammar
for this language is something that we wrote, you know, in
different lectures and examples. You know, homework
assignments and exams. So, we very much covered
everything, all the productions that you need in order
to define this language. It’s a simple version of C that
has only if statements, if else, and it has only Y loops, and
it only has integer variables. And it has expressions. It has arithmetic and
logical expressions. That is all what it has, and so,
your job is to use these tools to build a compiler for this. So, your code here
is it write — your job here is to write
the embedded actions, and the actual functions will
be in your code.c So, in this — here in these actions, don’t try to do some — to
write actual code. Just call a function and put the
functions that do the real work in your code.c. In fact,
your code.c is a file that I’m giving you, but you can
break it up into multiple files. So, you can create your own, for
example, AST.c and symboltable.c and codegeneration.c. You can
break it up into three files to better organize your code. And here, you know, you put
the create-node functions that will create the
nodes, and each one of them will return a
point to an AST node. Yeah?>>So, we’re going to implement
our own symbol table, as well?>>Yes, and the symbol
table, you know, what do you think would be
a reasonable implementation of a symbol table? Yeah, well, what would be
the best data structure. Hash table, yeah, hash table. So, it should be
implemented as a hash table. Here, you know, it
will make a difference. We can feel the difference
between a hash table and any other implementation. If the number of
identifiers defined in the program is very large,
so that we can field, you know, the difference in
the search speed. But if we don’t have
very many variables, then we will not notice the
difference between a hash table and any other data structure like a list or array
or whatever. So, the implementation of the
symbol table is up to you. And designing the data
structures is up to you, as long as this works. But in this assignment,
assignment number four, you will not generate code. You will just have
print statements that print what is
being recognized and what is being done. You know, print statements like
processing declaration of so and so, or creating and add
node, creating a multiply node. And we will provide you with
sample input and output files, and your outputs must
match our output.

Tags: , ,

There are no comments yet

Why not be the first

Leave a Reply

Your email address will not be published. Required fields are marked *