## Phase 3, Syntax analysis
1. An abstract syntax tree (AST) is an intermediate representation of the program that is to be run. It is a tree structure that defines how each node relates to the others. A single node usually represents a single operation, such as a variable definition or a binary operation. The AST is built from the tokens that the lexer produces. You can then execute the AST directly, perform some optimizations on it, or compile it further to bytecode or machine code.
2. With PLY you can build an AST using its implementation of yacc. You pass the parser the input text and a lexer, and it checks the resulting tokens against your syntax definitions. The syntax definitions are written in BNF format. In each BNF rule you define a node and how it relates to other nodes. For example, when yacc matches a _variable definition_, you can store the variable name and the initialization value on the current node.
3.
   1. A variable definition creates a new AST node with type _variable\_definition_. The node's value is set to the name of the variable. The _variable\_definition_ node gets a _child\_expression_ field that defines the initialization value.
   2. A while loop creates a new AST node with type _do\_until_. The node gets a _child\_condition_ that holds the expression for the loop's condition. The _do\_until_ node also gets _children\_statements_, a list of the statements that make up the loop's body.
   3. A procedure call creates a new AST node with type _procedure\_call_. The node's value is set to the name of the procedure being called. _procedure\_call_ also contains _children\_arguments_, a list of the arguments that are passed along in the call.
4.
   1. When a function or procedure is called without arguments, its argument list is None. When the program doesn't have any definitions or a body, the corresponding children are None. When a function or procedure is defined without arguments or a variable list, the corresponding children are None. When a while loop doesn't have an otherwise case, its otherwise body is None.
   2. Lists in my implementation don't have their own node. The parent of the list just holds a list of the corresponding children. For example, the program node has _children\_definitions_ instead of a _child\_definition\_list_ that would hold the actual children. This allows omitting some unnecessary nodes. The BNF rules _atom_ and _factor_ are also not actual nodes; they just pass their data on as is.
5. The assignment was a fun little project and not too difficult. Most of the implementation was easy when building on top of the previous phase. I learned to use more of PLY. I was already familiar with ASTs and with how semantic checking is done.
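
The node shapes from answer 3 and the None and list conventions from answer 4 can be sketched in plain Python. This is a minimal illustration, not the actual implementation from this phase: the `ASTNode` class, the `dump` helper, the `binary_op` and `literal` node types, and the `child_left` and `child_right` fields are hypothetical names chosen for the example. Only _variable\_definition_, _do\_until_, _procedure\_call_, _child\_expression_, _child\_condition_, _children\_statements_ and _children\_arguments_ come from the text above.

```python
class ASTNode:
    """Hypothetical node: a type tag, an optional value, and child_*/children_* fields."""

    def __init__(self, nodetype, value=None, **children):
        self.nodetype = nodetype
        self.value = value
        for name, child in children.items():
            # Follow the naming convention described above:
            # child_* holds one node (or None), children_* holds a list (or None).
            if not name.startswith(("child_", "children_")):
                raise ValueError(f"unexpected field name: {name}")
            setattr(self, name, child)


def dump(node, indent=0):
    """Return the tree as a list of indented text lines."""
    pad = "  " * indent
    label = node.nodetype if node.value is None else f"{node.nodetype} ({node.value})"
    lines = [pad + label]
    for name, child in vars(node).items():
        if name.startswith("child_"):
            lines.append(f"{pad}  {name}:" + (" None" if child is None else ""))
            if child is not None:
                lines.extend(dump(child, indent + 2))
        elif name.startswith("children_"):
            lines.append(f"{pad}  {name}:" + (" None" if child is None else ""))
            # The list lives directly on the parent; there is no separate list node.
            for item in child or []:
                lines.extend(dump(item, indent + 2))
    return lines


# A variable definition: the name is the node's value, the initializer is child_expression.
var_def = ASTNode(
    "variable_definition",
    value="x",
    child_expression=ASTNode(
        "binary_op",
        value="+",
        child_left=ASTNode("literal", value=1),
        child_right=ASTNode("literal", value=2),
    ),
)

# A procedure call without arguments: children_arguments is None, not an empty node.
call = ASTNode("procedure_call", value="print_all", children_arguments=None)

# A loop node with a condition and a body that is a plain Python list of statements.
loop = ASTNode(
    "do_until",
    child_condition=ASTNode("literal", value=True),
    children_statements=[var_def, call],
)

print("\n".join(dump(loop)))
```

Keeping lists directly on the parent means a tree walker only has to distinguish two field kinds by prefix, which is the saving in "unnecessary nodes" mentioned in answer 4.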