data structure, before it can print it in some indented style. The pretty-printe
ID: 3543633 • Letter: D
Question
data structure, before it can print it in some indented style. The pretty-printer consists of the following parts:
1. a lexical analyzer that splits the input text into tokens,
2. a recursive-descent parser that analyses the structure of the input program and builds a parse tree,
3. a parse-tree traversal that pretty-prints the input program.
Lexical Analysis
In lexical analysis, the input file is broken into individual words or symbols. These words or symbols are
then represented as tokens and passed to the parser. Your lexical analyzer needs to read ASCII characters
from standard input and do the following:
1. discard white space (space, tab, newline, carriage-return, and form-feed characters),
2. discard comments (everything from a semicolon to the end of the line),
3. recognize quotes, dots, and open and closing parentheses,
4. recognize the boolean constants #t and #f,
5. recognize integer constants (for simplicity only decimal digits without a sign),
6. recognize string constants (anything between double quotes),
7. recognize identifiers.
Parser
The parser gets tokens from the scanner and analyzes the syntactic structure of the token stream. Since
Scheme has a very simple grammar, parsing using a recursive-descent parser is not difficult.
Pretty Printing
Print the output according to the following rules:
Explanation / Answer
Structure of the Pretty-Printer
Like any tool that processes text, the pretty-printer needs to parse the input ?rst and store it in an internal
data structure, before it can print it in some indented style. The pretty-printer consists of the following parts:
1. a lexical analyzer that splits the input text into tokens,
2. a recursive-descent parser that analyses the structure of the input program and builds a parse tree,
3. a parse-tree traversal that pretty-prints the input program.
Lexical Analysis
In lexical analysis, the input ?le is broken into individual words or symbols. These words or symbols are
then represented as tokens and passed to the parser. Your lexical analyzer needs to read ASCII characters
from standard input and do the following:
1. discard white space (space, tab, newline, carriage-return, and form-feed characters),
2. discard comments (everything from a semicolon to the end of the line),
3. recognize quotes, dots, and open and closing parentheses,
4. recognize the boolean constants #t and #f,
5. recognize integer constants (for simplicity only decimal digits without a sign),
6. recognize string constants (anything between double quotes),
7. recognize identi?ers
Scheme does not distinguish between uppercase and lowercase characters in identi?ers. It is, therefore,easiest for later parts of the pretty printer or the interpreter to convert all letters to lowercase.The typical structure of a lexical analyzer is to write a function getNextToken(),
enum TokenType {
QUOTE, DOT, LPAREN, RPAREN, TRUET, FALSET, INT, STRING, IDENT
};
class Token {
TokenType tt;
// ...
};
class IntToken : public Token {
int intVal;
// ...
};
class StrToken : public Token {
char * strVal;
// ...
};
class IdentToken : public Token {
char * name;
// ...
};
class Scanner {
public:
Token * getNextToken ();
// ...
parser
Related Questions
Navigate
Integrity-first tutoring: explanations and feedback only — we do not complete graded work. Learn more.