4.7 KiB
Phase 2, Syntax analysis
- Syntax analysis is validation of tokens parsed in lexical analysis. Syntax analysis checks that programs syntax (i.e. function/variable definitions) are in the correct form. Syntax analysis checks that correct tokens precede and follow each token.
- In PLY syntactic structure of program is described in BNF like format as docstrings of functions. You define a symbol and its list of possible expressions formed from other symbols or tokens parsed by PLY lexer.
-
- The do-unless statement in EBNF format is
DO statement_list UNLESS expression [OTHERWISE statement_list] DONE
and statement_list isstatement { COMMA statement }
. This means that do-unless statement has a comma separated list of statements in between DO and UNLESS tokens. After the UNLESS token, you have to specify a expression and optionally OTHERWISE token and another comma separated list of statements. The do-unless statement ends to a DONE token. - Procedure call in EBNF format is defined as
PROC_IDENT LPAREN [arguments] RPAREN
. So a procedure call starts by a name of a procedure followed by arguments in parenthesis. Arguments can also be empty, leaving only parenthesis next to each other. Arguments are defined as a comma separated list of expressions.
- The do-unless statement in EBNF format is
- There are three BNF constructs beginning with DO, they are:
The first one is a unless_expression that can be used only as a rvalue. The unless_expression differs from the two statements as it takes expressions instead of statement_lists in its definition. The two statements starting with do differ as the former uses UNTIL token and latter UNLESS. Also the latter must be ended with DONE.DO expression UNLESS expression OTHERWISE expression DONE DO statement_list UNTIL expression DO statement_list UNLESS expression [OTHERWISE statement_list] DONE
-
-
You cannot create nested functions as only place where function_definition is allowed is in definitions which can only apper in the start of program, not in the function definition itself.
-
You could add SEMICOLON after each case of statement. Then statement_list could be defined as
statement { statement }
. This would force each statement to be ended with semicolon and remove COMMA requirement from statement_list. Other way of achieving this could be to only redefine statement_list asstatement SEMICOLON { statement SEMICOLON }
while keeping statement definition as is. -
It is not possible since only rule using APOSTROPHE is in atom (
IDENT APOSTROPHE IDENT
). This makes apostrophes usable only with (variable) identifiers. -
Syntax
3---2
is valid because first minus is part of simple_exprsimple_expr (PLUS|MINUS) term
, second minus is part of factor[MINUS|PLUS] atom
and last minus is part of INT_LITERAL which may begin with minus sign. Following parse tree is for3---2
- simple_expr
- simple_expr
- term
- factor
- atom
- INT_LITERAL (3)
- atom
- factor
- term
- MINUS
- term
- factor
- MINUS
- atom
- INT_LITERAL (-2)
- factor
- simple_expr
Second syntax
xx---yy
is not valid as we cannot get the third minus signed to be part of anything. IDENT (yy) cannot start and IDENT (xx) cannot end with minus sign. This would be valid with only two minus signs like the first case, exept IDENT would not contain minus. - simple_expr
-
Yes, procedure call can appear inside a function definition. Procedure call can be a atom through which it can become a expression. Expressions can be used inside rvalue or variable definitions, which both appear inside function definition. Following parse tree shows both simple cases where procedure can appear.
- function_definition
- FUNCTION
- FUNC_IDENT
- LCURLY
- RCURLY
- RETURN
- IDENT
- variable_definition
- VAR
- IDENT
- EQ
- expression
- simple_expr
- term
- factor
- atom
- procedure_call
- atom
- factor
- term
- simple_expr
- IS
- rvalue
- expression
- simple_expr
- term
- factor
- atom
- procedure_call
- atom
- factor
- term
- simple_expr
- expression
- END
- FUNCTION
- function_definition
-
unless_expression can appear only in rvalue which in turn can be either in assignment or function_defintion. function_definition is part of definition which can appear only in the start of program, before program's statement_list. assignment on the other hand is (only) a statement and can be found in other statements starting with DO, in the statement_list of program, or procedure_definition.
-
Thats the neat part, syntax does not know it. Syntax analysis just follows order and placement of tokens. It does not know what the token actually represents.
-