Create a tokenizer in Ruby. Your tokenizer will take two command line arguments:
Question
Create a tokenizer in Ruby.
Your tokenizer will take two command line arguments: The first will be a grammar specification and the second will be a file to tokenize. Tokenize the file and print the tokens (symbol, lexeme, line) to the screen. If the file cannot be tokenized, print an error message identifying the line with the error.
grammar specification file:
NUM -> \d+
ADDOP -> [-+]
MULOP -> [*/]
LP -> \(
RP -> \)
EQ -> =
ID -> [A-Z]\w*
comment -> {[^}]*}
S -> ID EQ expr
expr -> expr ADDOP term | term
term -> term MULOP factor | factor
factor -> ID | NUM | LP expr RP
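A minimal sketch of how the lexical rules could be separated from the grammar productions in a specification like the one above. The helper name `load_token_rules` and the constant `GRAMMAR_SPEC` are hypothetical, and the sketch assumes that the patterns are regular expressions with their backslashes intact (`\d+`, `\(`, `\)`, `[A-Z]\w*`). A rule is treated as a production when every word on its right-hand side names another rule; that heuristic is an assumption, not something the question states.

```ruby
# Hypothetical copy of the specification, with backslashes assumed.
GRAMMAR_SPEC = <<~'SPEC'
  NUM -> \d+
  ADDOP -> [-+]
  MULOP -> [*/]
  LP -> \(
  RP -> \)
  EQ -> =
  ID -> [A-Z]\w*
  comment -> {[^}]*}
  S -> ID EQ expr
  expr -> expr ADDOP term | term
  term -> term MULOP factor | factor
  factor -> ID | NUM | LP expr RP
SPEC

# Returns [name, regexp] pairs for the lexical rules, in file order.
def load_token_rules(spec_text)
  # Split each non-empty line into a rule name and its right-hand side.
  rules = spec_text.each_line.map(&:strip).reject(&:empty?).map do |line|
    line.split(/\s*->\s*/, 2)
  end
  names = rules.map(&:first)

  # Assumed heuristic: a rule whose right-hand side consists only of
  # rule names and "|" is a grammar production, not a token pattern.
  lexical = rules.reject do |_name, body|
    body.split.all? { |word| word == '|' || names.include?(word) }
  end

  # Anchor each pattern so it can only match at the start of the input.
  lexical.map { |name, body| [name, Regexp.new("\\A(?:#{body})")] }
end

load_token_rules(GRAMMAR_SPEC).each do |name, pattern|
  puts "#{name}: #{pattern.inspect}"
end
```

Only the eight lexical rules (NUM through comment) come back; S, expr, term and factor are dropped because a tokenizer does not need them.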
Example file 1 to tokenize:
4+2 { this is
a comment }
+ 6
Example file 2 to tokenize:
1
+
2 *
3
Code so far:
Explanation / Answer
tokenizer.rb
# Token is not defined in the original answer; a minimal Struct is assumed here.
Token = Struct.new(:type, :value)

class Tokenizer
  OPERATORS = ['+', '-', '*', '/'].freeze
  # Digits with an optional fractional part, e.g. "12", "12." or "12.5".
  # \A and \z anchor the whole string; ^ and $ would also match at newlines.
  NUMBER_REGEXP = /\A\d+(\.\d*)?\z/

  attr_reader :tokens, :cursor

  def initialize(expression)
    tokenize(expression)
    @cursor = -1
  end

  def current_token
    return if cursor == -1

    tokens[cursor]
  end

  def next_token
    # Assign the instance variable; a bare `cursor += 1` would create a local.
    @cursor += 1
    current_token
  end

  def look_next_token
    tokens[cursor + 1]
  end

  def operator_included?
    tokens.map(&:type).include?(:operator)
  end

  private

  def tokenize(expression)
    @tokens = []
    # strip returns a copy, so the destructive calls below leave the caller's string alone.
    expression_to_parse = expression.strip
    while expression_to_parse.size > 0
      tokens << read_next_token(expression_to_parse)
    end
  end

  def read_next_token(expression_to_parse)
    expression_to_parse.strip!
    next_char = expression_to_parse.slice(0)
    if OPERATORS.include?(next_char)
      read_next_operator_token(expression_to_parse)
    elsif next_char =~ NUMBER_REGEXP
      read_next_numeric_token(expression_to_parse)
    else
      raise "Unknown token starting with '#{next_char}'"
    end
  end

  def read_next_operator_token(expression_to_parse)
    Token.new(:operator, expression_to_parse.slice!(0))
  end

  def read_next_numeric_token(expression_to_parse)
    numeric_value = expression_to_parse.slice!(0)
    # The assignment is parenthesized because `=` binds more loosely than `&&`.
    while (next_char = expression_to_parse.slice(0)) &&
          "#{numeric_value}#{next_char}" =~ NUMBER_REGEXP
      numeric_value << expression_to_parse.slice!(0)
    end
    Token.new(:number, numeric_value.to_f)
  end
end
parser.rb
# Assumes tokenizer.rb sits in the same directory as this file.
require_relative 'tokenizer'

class Parser
  attr_reader :tokenizer

  # Parse a Polish (prefix) notation expression and return the result.
  def parse(expression)
    @tokenizer = Tokenizer.new(expression)
    raise 'Expression must start with an operator' if tokenizer.look_next_token.type == :number

    operate
  end

  def tokens
    tokenizer.tokens
  end

  def operate
    number_stack = []
    # Continue until the last operator has been popped from the tokens.
    while tokenizer.operator_included?
      current_token = tokens.pop
      if current_token.type == :number
        # Push the number onto the number stack.
        number_stack << current_token.value
      elsif current_token.type == :operator
        raise 'Not enough operands' if number_stack.size < 2

        # Tokens are read right to left, so the most recently pushed number
        # is the left operand of the prefix expression.
        second_value, first_value = number_stack.pop(2)
        # Add a new :number token holding the result of the operation.
        tokens << Token.new(:number, first_value.send(current_token.value, second_value))
      end
    end
    raise 'Not enough operators' if number_stack.size > 0

    tokens.first.value
  end
end
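The classes above hard-code their operators and number pattern and do not record line numbers, whereas the question asks for (symbol, lexeme, line) output and an error message that names the offending line. The following is a minimal sketch of that behaviour, not a complete solution. `TOKEN_RULES`, `TokenizeError` and `tokenize_with_lines` are hypothetical names, the rule table is written out by hand instead of being read from the grammar file, the backslashes in the patterns are assumed, and skipping comments in the output is also an assumption.

```ruby
require 'strscan'

# Hypothetical rule table mirroring the lexical rules of the specification.
# Rules are tried in order and the first one that matches wins.
TOKEN_RULES = [
  [:NUM,     /\d+/],
  [:ADDOP,   /[-+]/],
  [:MULOP,   %r{[*/]}],
  [:LP,      /\(/],
  [:RP,      /\)/],
  [:EQ,      /=/],
  [:ID,      /[A-Z]\w*/],
  [:comment, /\{[^}]*\}/]
].freeze

class TokenizeError < StandardError; end

# Returns an array of [symbol, lexeme, line] triples.
def tokenize_with_lines(source)
  scanner = StringScanner.new(source)
  line = 1
  tokens = []
  until scanner.eos?
    # Skip whitespace, counting the newlines it contains.
    if (blank = scanner.scan(/\s+/))
      line += blank.count("\n")
      next
    end

    rule = TOKEN_RULES.find { |_symbol, pattern| scanner.match?(pattern) }
    raise TokenizeError, "Cannot tokenize input on line #{line}" unless rule

    symbol, pattern = rule
    lexeme = scanner.scan(pattern)
    # Comments are left out of the output (an assumption), but a comment
    # that spans several lines still advances the line counter.
    tokens << [symbol, lexeme, line] unless symbol == :comment
    line += lexeme.count("\n")
  end
  tokens
end

EXAMPLE_1 = "4+2 { this is\na comment }\n+ 6\n"

begin
  tokenize_with_lines(EXAMPLE_1).each do |symbol, lexeme, line|
    puts "#{symbol} #{lexeme} #{line}"
  end
rescue TokenizeError => e
  warn e.message
end
```

For example file 1 the sketch reports the second `+` and the `6` on line 3, because the newline inside the comment is counted even though the comment itself is not printed. Reading the two file names from `ARGV` and building the rule table from the grammar file are left out here.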