PoPL/03_syntax_tree
Bananymous 7a306b7904 Add phase 3 text document 2024-04-13 23:26:13 +03:00
..
.gitkeep spring 2024 git template 2023-12-18 07:45:29 +02:00
README.md Add phase 3 text document 2024-04-13 23:26:13 +03:00
lexer.py Implement phase 3 semantic checking 2024-04-13 22:56:20 +03:00
main.py Implement phase 3 semantic checking 2024-04-13 22:56:20 +03:00
tree_print.py Implement phase 3 semantic checking 2024-04-13 22:56:20 +03:00

README.md

Phase 3, Syntax analysis

  1. Abstract syntax tree or AST is intermediate format of the program that is to be run. This is a tree structure defining the relation between every other node. A single node in the tree defines usually a single operation, like variable definition, binary operation, etc. Abstract syntax tree is built after the program has been tokenized by the lexer. You can either execute the AST straight away, perform some optimizations or compile it further to byte code or machine code.
  2. With PLY you can build AST using its implementation of yacc. You pass the yacc a byte stream and a lexer with which it confirms the syntax using your syntax definitions. Syntax definitions are in BNF format. In each BNF statement you define each node and how it relates to other nodes. For example when yacc matches variable definition you can store variable name and the initialization value as current nodes children.
    1. Variable definition creates a new AST node with type variable_definition. This node's value is set to the name of the variable. variable_definition node gets a child_expression field defining the initialization value.
    2. While loop creates a new AST node with type do_until. This node gets a child_condition that defines the expression for the whiles loops condition. do_until node also gets children_statements which is a list of statements containing the while loop's body.
    3. Procedure call creates a new AST node with type procedure_call. This node's value is set to the name of the procedure this calls to. procedure_call also contains children_arguments that is a list of arguments that are passed along in the call.
    1. When calling a function or procedure, argument list is None. When program doesn't have any definitions or body, its corresponding children are None. When function or procedure is defined without arguments or variable list their corresponding children are None. When while loop doesn't have otherwise case its otherwise body is None.
    2. Lists in my implementation don't have their own node. Parent of the list just has a list of the corresponding children. For example program has children_definitions instead of a child_defition_list with the actual children. This allows omitting some unnecessary nodes. Also BNF atom and factor are not actual nodes and they just pass on their data as its own node.
  3. The assignment was a fun little project. This assignment was not too difficult. Most of the implementation was super easy when building on top of the last phase. I learned to use more of PLY. I was already familiar with AST and how semantic checking is done. In general I liked the assigment and it was a fun little project.