Great Deal! Get Instant $10 FREE in Account on First Order + 10% Cashback on Every Order Order Now

CSE340 Spring 2019 Project 3: Parsing and Type Checking Due: Friday 29 March 2019 on or before 11:59 pm MST In this project, you are asked to write a parser and a type checker for a small language....

1 answer below »
CSE340 Spring 2019 Project 3: Parsing and Type Checking
Due: Friday 29 March 2019 on or before 11:59 pm MST
In this project, you are asked to write a parser and a type checker for a small language. The parse
checks that the input is syntactically co
ect and the type checker enforces the semantic rules of the
language. The semantic rules that your program will enforce relate to declarations and types.
The input to your code will be a program and the output will be:
• syntax e
or message if the input program is not syntactically co
ect
• if the input program has no syntax e
or, the output is:
– semantic e
or messages if there is a declaration e
or or a type mismatch in the input pro-
gram
– information about the types of the symbols declared in the input program if there is no se-
mantic e
or.
In what follows, I describe the language syntax and semantic rules.
1. Language Syntax
1.1. Grammar Description
program → scope
scope → LBRACE scope list RBRACE
scope list → scope
scope list → var decl
scope list → stmt
scope list → scope scope list
scope list → var decl scope list
scope list → stmt scope list
var decl → id list COLON type name SEMICOLON
id list → ID
id list → ID COMMA id list
type name → REAL | INT | BOOLEAN | STRING
stmt list → stmt
stmt list → stmt stmt list
stmt → assign stmt
stmt → while stmt
assign stmt → ID EQUAL expr SEMICOLON
while stmt →WHILE condition LBRACE stmt list RBRACE
while stmt →WHILE condition stmt
expr → arithmetic operator expr exp
expr → binary boolean operator expr exp
expr → relational operator expr exp
expr → NOT exp
expr → primary
arithmetic operator → PLUS | MINUS | MULT | DIV
inary boolean operator → AND | OR | XOR
elational operator → GREATER | GTEQ | LESS | NOTEQUAL | LTEQ
primary → ID | NUM | REALNUM | STRING CONSTANT | bool const
ool const → TRUE | FALSE
condition → LPAREN expr RPAREN
Notice that expressions are in prefix notation in this language. This makes parsing easier.
1
1.2. Tokens
The tokens used in the grammar description are:
LBRACE = {
RBRACE = }
COLON = :
SEMICOLON = ;
REAL = REAL
INT = INT
BOOLEAN = BOOLEAN
STRING = STRING
WHILE = WHILE
COMMA = ,
LPAREN = ’(’
RPAREN = ’)’
EQUAL = =
PLUS = +
MULT = *
DIV =
AND = ^
OR = ’|’
XOR = &
NOT = ~
GREATER =
GTEQ = >=
LESS =
LTEQ = <=
NOTEQUAL =
TRUE = TRUE
FALSE = FALSE
STRING\_CONSTANT = " (letter | digit)* "
ID = letter (letter | digit)*
NUM = 0 | (pdigit digit*)
REALNUM = NUM . digit digit*
where
letter = a | b | c | ... | z | A | B | C | ... | Z
digit = 0 | 1 | 2 | ... | 9
pdigit = 1 | 2 | 3 | ... | 9
The tokens are included for completeness. You are provided with a getToken() function for these
tokens.
2. Language Semantics
2.1. Scoping Rules
Static scoping is used. I have already covered static scoping in class, so you should know the semantics
and how references should be resolved. Every scope defines a scope.
2
2.2. Types
The language has four basic types: INT , REAL , BOOLEAN , and STRING .
2.3. Variables
Variables are declared explicitly in var decl . The names of the declared variables appear in the
id list and their type is the type name .
Example Consider the following program written in our language:
{
x : INT;
y : INT;
y = x;
}
This program has two declared variables: x and y . Both have type INT .
2.4. Declaration and Reference
Any appearance of a name in the program is either a declaration or a reference. Any appearance of
a name in the id list part of a var decl is a declaration of the name. Any other appearance of a
name is considered a reference to that name. Note that the above definitions exclude the built-in type
names, which are predeclared as part of the language definition.
Given the following example (the line numbers are not part of the input):
01 {
02 a : INT;
03 b : REAL;
04 b = x;
05 }
We can categorize all appearances of names as declaration or reference:
• Line 2, the appearance of name a is a declaration
• Line 3, the appearance of name b is a declaration
• Line 4, the appearance of name b is a reference
• Line 4, the appearance of name x is a reference
2.5. Type System
We describe which assignments are valid (type compatibility) and how to determine the types of expres-
sions (type inference).
2.5.1. Type Compatibility Rules
Type compatibility rules specify which assignments are valid. The Rules are the following.
• C1: If the types of the lefthand side of an assignment is INT , BOOLEAN , or STRING , the righthand
side of the assignment should have the same type.
3
• C2: If the type of the lefthand side of an assignment is REAL , the type of the righthand side of the
assignment should be INT or REAL .
• M1: Assignments that do not satisfy C1 or C2 are not valid. In this case, we say that there is a
type mismatch.
2.5.2. Type Inference Rules
• C3: The types of the operands of an arithmetic operator should be REAL or INT
• C4: The type of the operands of a binary boolean operator should be BOOLEAN .
• C5: If neither operand of a relational operator has type INT or REAL , then the operands
should have the same type. In this case, both types can be STRING or both types can be
BOOLEAN .
• C6: If one of the operands of a relational operator has type INT or REAL , then the othe
operand should have type INT or REAL . In this case, the two operands can have different types.
• C7: The type of a condition should be BOOLEAN .
• C8: The type of the operand of the NOT operator should be boolean.
• I1: If C3 is satisfied, and the type of one of the operands of an arithmetic operator is REAL , the
type of the resulting expression is REAL .
• I2: For the PLUS , MINUS , and MULT operators, if the types of the two operands are INT , the
type of the resulting expression is INT .
• I3: For the DIV operator, if the types of the two operands are INT , the type of the resulting
expression is REAL .
• I4: If the operands of a binary boolean operator are BOOLEAN , the type of the resulting ex-
pression is BOOLEAN .
• I5: If C5 and C6 are satisfied, the type of a relational operator expr expr is BOOLEAN .
• I6: The type of NUM constants is INT
• I7: The type of REALNUM constants is REAL
• I8: The type of bool const constants is BOOLEAN
• I9: The type of STRING CONSTANT constants is STRING
• M2: If C3, C4, C5, C6, C7 or C8 are not satisfied, the type of the expressions is ERROR. In this
case, we say that there is a type mismatch.
4
3. Parse
You must write a parser for the grammar, If your parser detects a syntax e
or in the input, it should
output the following message and exit:
Syntax E
o
You should start coding by writing the parser first and make sure that your parser generates a syntax
e
or message if the input program is syntactically inco
ect. There will be no partial credit given fo
syntax checking
The grammar of the language is not LL(1) i.e. it does not satisfy the conditions for predictive parser with
one token lookahead, however, it is still possible to write a recursive descent parser with no backtracking.
We can do this by looking ahead at more than one token in some cases and left-factoring some rules.
Two rules like
stmt list → stmt
stmt list → stmt stmt list
can be parsed by first parsing stmt and then checking if the next token is the beginning of stmt list
or in the FOLLOW of stmt list . In fact the two rules are equivalent to
stmt list → stmt stmt list1
stmt list1 → stmt list | �
ut you do not need to explicitly left-factor the rules by introducing stmt list1 to parse stmt list .
4. Semantic Checking
Your program will check for the following semantic e
ors and output the co
ect message when it en-
counters that e
or.
4.1. Declaration E
ors
• Variable declared more than once (e
or code 1.1)
A variable is declared more than once if appears more than once in the same id list of a
var decl or if it appears in two id list of two different var decl and locally looking up the
name that appears in one of the two declarations returns the other declaration.
• Undeclared variable (e
or code 1.2)
If resolving a variable name that appears in a statement other than a declaration returns no dec-
laration, the variable is undeclared.
• Variable Declared but not used (e
or code 1.3)
If a variable is declared but is not referenced, we say that the variable is declared but not used. We
say that a declaration is referenced if some reference to the variable resolves to the declaration.
If a declaration is not referenced, we have e
or code 1.3.
• Redeclaration or reference as a variable for of a built-in type name If a built-in type name
appears in id list of a variable declaration or appears in a statement other than a declaration
statement, your parser should output syntax e
or.
5
For each e
or involving declarations, your type checker should output one line in the following format:
ERROR CODE in which should be replaced with the proper code (see the e
or codes listed above) and
symbol name> should be replaced with the name of the variable that caused the e
or. Since the test
cases will have at most one e
or each, the order in which these e
or messages are printed does not
matter.
Note that there will only be at most one declaration e
or per test case.
4.2. Type Mismatch
If there are no declaration e
ors and any of the type constraints (listed in the Type System section
above) is violated in the input program, then the output of your program should be:
TYPE MISMATCH Where is replaced with the line number that the violation occurs and should be replaced with the label of the violated
Answered Same Day Mar 13, 2021

Solution

Vidhi answered on Mar 20 2021
146 Votes
updated/inputbuf.cc
updated/inputbuf.cc
*
 * Copyright (C) Rida Bazzi, 2018
 *
 * Do not share this file with anyone
 *
#include #include #include #include #include #include "inputbuf.h"
using namespace std;
ool InputBuffer::EndOfInput()
{
    if (!input_buffer.empty())
        return false;
    else
        return cin.eof();
}
char InputBuffer::UngetChar(char c)
{
    if (c != EOF)
        input_buffer.push_back(c);;
    return c;
}
void InputBuffer::GetChar(char& c)
{
    if (!input_buffer.empty()) {
        c = input_buffer.back();
        input_buffer.pop_back();
    } else {
        cin.get(c);
    }
}
string InputBuffer::UngetString(string s)
{
    for (int i = 0; i < s.size(); i++)
        input_buffer.push_back(s[s.size()-i-1]);
    return s;
}
updated/inputbuf.h
*
* Copyright (C) Rida Bazzi, 2017
*
* Do not share this file with anyone
*
#ifndef __INPUT_BUFFER__H__
#define __INPUT_BUFFER__H__
#include class InputBuffer {
public:
void GetChar(char&);
char UngetChar(char);
std::string UngetString(std::string);
bool EndOfInput();
private:
std::vecto
cha
input_buffer;
};
#endif ...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here