[feature] Add documentation for the parser module. · UnBCIC-TP2/r-python@c163baa
Commit c163baa

[feature] Add documentation for the parser module.
1 parent 2369626 commit c163baa

docs/PARSER.md

Lines changed: 189 additions & 0 deletions
@@ -0,0 +1,189 @@
# Parser Component Documentation

## Overview

The parser component is responsible for transforming source code text into an Abstract Syntax Tree (AST). It is implemented using the `nom` parser combinator library and follows a modular design pattern, breaking down the parsing logic into several specialized modules.

## Architecture

The parser is organized into the following modules:

- `parser.rs`: The main entry point that coordinates the parsing process
- `parser_common.rs`: Common parsing utilities and shared functions
- `parser_expr.rs`: Expression parsing functionality
- `parser_type.rs`: Type system parsing
- `parser_stmt.rs`: Statement and control flow parsing

### Module Responsibilities and Public Interface

#### 1. parser.rs
The main parser module, which provides the entry point for parsing complete programs. On success, the returned `IResult` pairs the unconsumed remainder of the input with the parsed statements:
```rust
pub fn parse(input: &str) -> IResult<&str, Vec<Statement>>
```

#### 2. parser_common.rs
Common parsing utilities used across other modules:
```rust
pub fn is_string_char(c: char) -> bool
pub fn separator<'a>(sep: &'static str) -> impl FnMut(&'a str) -> IResult<&'a str, &'a str>
pub fn keyword<'a>(kw: &'static str) -> impl FnMut(&'a str) -> IResult<&'a str, &'a str>
pub fn identifier(input: &str) -> IResult<&str, &str>
```

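To make the shape of these helpers concrete, here is a minimal, dependency-free sketch of what `identifier` and `keyword` could look like. It is not the project's code: `PResult` is a hypothetical stand-in for `nom::IResult`, and both the identifier rule (letter or `_`, then letters, digits, or `_`) and the word-boundary check in `keyword` are assumptions made for illustration.

```rust
/// Hypothetical std-only stand-in for `nom::IResult<&str, O>`:
/// on success, `(remaining input, output)`.
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Assumed identifier rule: a letter or `_`, then letters, digits, or `_`.
pub fn identifier(input: &str) -> PResult<'_, &str> {
    let mut chars = input.char_indices();
    match chars.next() {
        Some((_, c)) if c.is_alphabetic() || c == '_' => {}
        _ => return Err(format!("expected identifier at {:?}", input)),
    }
    // Byte offset of the first character that cannot continue the name.
    let end = chars
        .find(|&(_, c)| !(c.is_alphanumeric() || c == '_'))
        .map(|(i, _)| i)
        .unwrap_or(input.len());
    Ok((&input[end..], &input[..end]))
}

/// Assumed behaviour: match `kw` only when it is not immediately followed by
/// an identifier character, so `keyword("if")` rejects `iffy`.
pub fn keyword<'a>(kw: &'static str) -> impl FnMut(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(kw) {
        Some(rest) if !rest.starts_with(|c: char| c.is_alphanumeric() || c == '_') => {
            Ok((rest, &input[..kw.len()]))
        }
        _ => Err(format!("expected keyword {:?}", kw)),
    }
}
```

Returning a closure from `keyword` mirrors the documented signature: the keyword is fixed once, and the resulting parser can be reused or passed to other combinators.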
#### 3. parser_expr.rs
Expression parsing functionality:
```rust
pub fn parse_expression(input: &str) -> IResult<&str, Expression>
pub fn parse_actual_arguments(input: &str) -> IResult<&str, Vec<Expression>>
```

#### 4. parser_type.rs
Type system parsing:
```rust
pub fn parse_type(input: &str) -> IResult<&str, Type>
```

#### 5. parser_stmt.rs
Statement and control flow parsing:
```rust
pub fn parse_statement(input: &str) -> IResult<&str, Statement>
```

## Parser Features

### Statement Parsing
The parser supports various types of statements:
- Variable declarations and assignments
- Control flow (if-else, while, for)
- Function definitions
- Assert statements
- ADT (Algebraic Data Type) declarations

### Expression Parsing
Handles different types of expressions:
- Arithmetic expressions
- Boolean expressions
- Function calls
- Variables
- Literals (numbers, strings, booleans)
- ADT constructors and pattern matching

### Type System
Supports a rich type system including:
- Basic types (Int, Real, Boolean, String, Unit, Any)
- Complex types (List, Tuple, Maybe)
- ADT declarations
- Function types

## nom Parser Combinators

The parser relies heavily on the `nom` parser combinator library. The key combinators are:

### Basic Combinators
- `tag`: Matches an exact string
- `char`: Matches a single character
- `digit1`: Matches one or more ASCII digits
- `alpha1`: Matches one or more ASCII alphabetic characters
- `space0`/`space1`: Match zero or more / one or more spaces and tabs (line breaks are covered by `multispace0`/`multispace1`)

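
The behaviour of the basic combinators can be illustrated without pulling in `nom` itself. The following is a hand-written, std-only sketch: `PResult` is a hypothetical stand-in for `nom::IResult`, and the three functions only imitate `tag`, `digit1`, and `space0`; they are not the library's implementations.

```rust
/// Hypothetical std-only stand-in for `nom::IResult<&str, &str>`.
type PResult<'a> = Result<(&'a str, &'a str), String>;

/// Imitates `tag`: succeed only if the input starts with the exact string `t`.
fn tag<'a>(t: &'static str) -> impl Fn(&'a str) -> PResult<'a> {
    move |input: &'a str| match input.strip_prefix(t) {
        Some(rest) => Ok((rest, &input[..t.len()])),
        None => Err(format!("expected {:?}", t)),
    }
}

/// Imitates `digit1`: consume one or more ASCII digits, failing on zero.
fn digit1(input: &str) -> PResult<'_> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err("expected at least one digit".to_string());
    }
    Ok((&input[end..], &input[..end]))
}

/// Imitates `space0`: consume zero or more spaces or tabs; never fails.
fn space0(input: &str) -> PResult<'_> {
    let end = input
        .find(|c: char| c != ' ' && c != '\t')
        .unwrap_or(input.len());
    Ok((&input[end..], &input[..end]))
}
```

The `0`/`1` suffix convention is visible here: `space0` accepts an empty match, while `digit1` treats an empty match as an error.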
### Sequence Combinators
- `tuple`: Runs several parsers in sequence and returns their outputs as a tuple
- `preceded`: Matches a prefix, discards it, and returns the value that follows
- `terminated`: Matches a value followed by a suffix and discards the suffix
- `delimited`: Matches a value between two delimiters and returns only the value

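
As a rough illustration of how sequence combinators thread the remaining input from one parser to the next, here is a std-only sketch of `preceded` and `delimited`. `PResult`, `ch`, and `digits` are hypothetical helpers introduced for this example; the real `nom` versions are generic over input and error types.

```rust
/// Hypothetical std-only stand-in for `nom::IResult<&str, O>`.
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Imitates `preceded`: run `prefix`, discard its output, return `value`'s output.
fn preceded<'a, A, B>(
    prefix: impl Fn(&'a str) -> PResult<'a, A>,
    value: impl Fn(&'a str) -> PResult<'a, B>,
) -> impl Fn(&'a str) -> PResult<'a, B> {
    move |input: &'a str| {
        let (rest, _) = prefix(input)?;
        value(rest)
    }
}

/// Imitates `delimited`: run `open`, `value`, `close` in order, keep only `value`.
fn delimited<'a, A, B, C>(
    open: impl Fn(&'a str) -> PResult<'a, A>,
    value: impl Fn(&'a str) -> PResult<'a, B>,
    close: impl Fn(&'a str) -> PResult<'a, C>,
) -> impl Fn(&'a str) -> PResult<'a, B> {
    move |input: &'a str| {
        let (rest, _) = open(input)?;
        let (rest, out) = value(rest)?;
        let (rest, _) = close(rest)?;
        Ok((rest, out))
    }
}

/// Demo helper: match one exact character.
fn ch<'a>(expected: char) -> impl Fn(&'a str) -> PResult<'a, char> {
    move |input: &'a str| match input.chars().next() {
        Some(c) if c == expected => Ok((&input[c.len_utf8()..], c)),
        _ => Err(format!("expected {:?}", expected)),
    }
}

/// Demo helper: one or more ASCII digits.
fn digits(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err("expected digits".to_string());
    }
    Ok((&input[end..], &input[..end]))
}
```

Each step receives the `rest` left over by the previous one, and the `?` operator aborts the whole sequence at the first failure.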
### Branch and Transformation Combinators
- `alt`: Tries each parser in order and returns the first success
- `map`: Transforms the output of a parser
- `opt`: Makes a parser optional, yielding `None` when it does not match

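
The combinators in this group can be sketched in a few lines each. The code below is a std-only approximation, not `nom`'s implementation: `PResult`, `alt2` (a two-way version of `alt`), and `word` are hypothetical names, and the `True`/`False` spelling in `boolean` is assumed purely for the demo.

```rust
/// Hypothetical std-only stand-in for `nom::IResult<&str, O>`.
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Two-way imitation of `alt`: try `first`; on failure, try `second` on the same input.
fn alt2<'a, O>(
    first: impl Fn(&'a str) -> PResult<'a, O>,
    second: impl Fn(&'a str) -> PResult<'a, O>,
) -> impl Fn(&'a str) -> PResult<'a, O> {
    move |input: &'a str| first(input).or_else(|_| second(input))
}

/// Imitates `map`: run `parser` and transform its output with `f`.
fn map<'a, A, B>(
    parser: impl Fn(&'a str) -> PResult<'a, A>,
    f: impl Fn(A) -> B,
) -> impl Fn(&'a str) -> PResult<'a, B> {
    move |input: &'a str| parser(input).map(|(rest, out)| (rest, f(out)))
}

/// Imitates `opt`: turn a failure into `None` without consuming input.
fn opt<'a, O>(
    parser: impl Fn(&'a str) -> PResult<'a, O>,
) -> impl Fn(&'a str) -> PResult<'a, Option<O>> {
    move |input: &'a str| match parser(input) {
        Ok((rest, out)) => Ok((rest, Some(out))),
        Err(_) => Ok((input, None)),
    }
}

/// Demo helper: match an exact word.
fn word<'a>(w: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(w) {
        Some(rest) => Ok((rest, &input[..w.len()])),
        None => Err(format!("expected {:?}", w)),
    }
}

/// Demo: a boolean literal as "either word, mapped to a `bool`" (spelling assumed).
fn boolean(input: &str) -> PResult<'_, bool> {
    map(alt2(word("True"), word("False")), |s: &str| s == "True")(input)
}
```

Because `alt2` retries on the original input, order matters when one alternative is a prefix of another: the longer alternative should come first.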
### Multi Combinators
- `many0`/`many1`: Apply a parser zero or more / one or more times and collect the results
- `separated_list0`: Matches zero or more items separated by a delimiter

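
Repetition is where the "zero or more" semantics matter most, for example when parsing a comma-separated argument list. Below is a std-only sketch of `separated_list0`; `PResult`, `comma`, and `digits` are hypothetical helpers for the demo, and the loop omits the safeguards a production combinator needs (such as detecting parsers that consume no input).

```rust
/// Hypothetical std-only stand-in for `nom::IResult<&str, O>`.
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Imitates `separated_list0`: zero or more `item`s separated by `sep`.
/// An empty list is a success; a dangling separator is left unconsumed.
fn separated_list0<'a, S, O>(
    sep: impl Fn(&'a str) -> PResult<'a, S>,
    item: impl Fn(&'a str) -> PResult<'a, O>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<O>> {
    move |input: &'a str| {
        let mut items = Vec::new();
        // The first element is optional: failing here yields an empty list.
        let mut rest = match item(input) {
            Ok((rest, first)) => {
                items.push(first);
                rest
            }
            Err(_) => return Ok((input, items)),
        };
        loop {
            // Stop without error once `sep` followed by `item` no longer matches.
            let after_sep = match sep(rest) {
                Ok((r, _)) => r,
                Err(_) => return Ok((rest, items)),
            };
            match item(after_sep) {
                Ok((r, next)) => {
                    items.push(next);
                    rest = r;
                }
                Err(_) => return Ok((rest, items)),
            }
        }
    }
}

/// Demo helper: a comma with optional surrounding whitespace.
fn comma(input: &str) -> PResult<'_, ()> {
    match input.trim_start().strip_prefix(',') {
        Some(rest) => Ok((rest.trim_start(), ())),
        None => Err("expected ','".to_string()),
    }
}

/// Demo helper: one or more ASCII digits.
fn digits(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        return Err("expected digits".to_string());
    }
    Ok((&input[end..], &input[..end]))
}
```

Note that the list parser itself never fails; rejecting a trailing comma is left to whatever parser runs next on the remaining input.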
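## Example Usage

Here's an example of how the parser handles a simple assignment statement:

```python
x = 42
```

This is parsed using the following combinators:
```rust
// nom imports (function-style combinators, as in nom 7); `identifier`,
// `parse_expression`, and `Statement` come from the project's own modules.
use nom::{
    bytes::complete::tag, character::complete::multispace0, combinator::map,
    sequence::{preceded, tuple}, IResult,
};

fn parse_assignment_statement(input: &str) -> IResult<&str, Statement> {
    map(
        // Each step skips leading whitespace, then matches its token.
        tuple((
            preceded(multispace0, identifier),
            preceded(multispace0, tag("=")),
            preceded(multispace0, parse_expression),
        )),
        // Discard the "=" and build the AST node from the name and expression.
        |(var, _, expr)| Statement::Assignment(var.to_string(), Box::new(expr)),
    )(input)
}
```
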
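To show what this pipeline does step by step, here is a hand-written, std-only equivalent restricted to integer right-hand sides. It is a sketch under stated assumptions: the trimmed-down `Statement` and `Expression` enums are local stand-ins, `Expression::CInt` is a hypothetical variant name, and `take_while` is a helper invented for this example.

```rust
#[derive(Debug, PartialEq)]
enum Expression {
    CInt(i32), // hypothetical variant name for an integer literal
}

#[derive(Debug, PartialEq)]
enum Statement {
    Assignment(String, Box<Expression>),
}

/// Splits `input` at the first character for which `keep` is false.
fn take_while(input: &str, keep: impl Fn(char) -> bool) -> (&str, &str) {
    let end = input.find(|c: char| !keep(c)).unwrap_or(input.len());
    (&input[..end], &input[end..])
}

fn parse_assignment(input: &str) -> Result<(&str, Statement), String> {
    // Step 1: skip whitespace, then read the variable name.
    let (name, rest) = take_while(input.trim_start(), |c| c.is_alphanumeric() || c == '_');
    if !name.starts_with(|c: char| c.is_alphabetic() || c == '_') {
        return Err("expected identifier".to_string());
    }
    // Step 2: skip whitespace, then require "=".
    let rest = rest
        .trim_start()
        .strip_prefix('=')
        .ok_or_else(|| "expected '='".to_string())?;
    // Step 3: skip whitespace, then read the expression (integers only here).
    let (digits, rest) = take_while(rest.trim_start(), |c| c.is_ascii_digit());
    let value: i32 = digits
        .parse()
        .map_err(|e| format!("invalid integer: {e}"))?;
    // Build the AST node and hand back whatever input is left.
    let stmt = Statement::Assignment(name.to_string(), Box::new(Expression::CInt(value)));
    Ok((rest, stmt))
}
```

Under these assumptions, `parse_assignment("x = 42")` yields an `Assignment` node for `x` holding `CInt(42)`, with an empty remainder. The combinator version expresses the same three steps declaratively, which is what keeps the real parser modules short.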
## AST Structure

The parser produces an Abstract Syntax Tree (AST) with the following main types:

### Statements
```rust
pub enum Statement {
    VarDeclaration(Name),
    ValDeclaration(Name),
    Assignment(Name, Box<Expression>),
    IfThenElse(Box<Expression>, Box<Statement>, Option<Box<Statement>>),
    While(Box<Expression>, Box<Statement>),
    For(Name, Box<Expression>, Box<Statement>),
    Block(Vec<Statement>),
    Assert(Box<Expression>, Box<Expression>),
    FuncDef(Function),
    Return(Box<Expression>),
    ADTDeclaration(Name, Vec<ValueConstructor>),
    // ... other variants
}
```

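Because statements nest through `Box` and `Vec`, most consumers of this AST are recursive functions that `match` on the variant. The sketch below walks a trimmed-down copy of the enum and collects every assigned name. It assumes `Name` is an alias for `String`, and the `Expression` variants shown are hypothetical placeholders.

```rust
type Name = String; // assumed alias; the real definition is not shown here

#[allow(dead_code)]
#[derive(Debug)]
enum Expression {
    CInt(i32), // hypothetical variants, only needed to build the sample tree
    Var(Name),
}

/// Trimmed-down copy of the documented enum (four variants only).
#[derive(Debug)]
enum Statement {
    Assignment(Name, Box<Expression>),
    IfThenElse(Box<Expression>, Box<Statement>, Option<Box<Statement>>),
    While(Box<Expression>, Box<Statement>),
    Block(Vec<Statement>),
}

/// Appends the name of every assignment reachable from `stmt`, in source order.
fn assigned_names(stmt: &Statement, out: &mut Vec<Name>) {
    match stmt {
        Statement::Assignment(name, _) => out.push(name.clone()),
        Statement::IfThenElse(_, then_branch, else_branch) => {
            assigned_names(then_branch, out);
            if let Some(else_branch) = else_branch {
                assigned_names(else_branch, out);
            }
        }
        Statement::While(_, body) => assigned_names(body, out),
        Statement::Block(stmts) => stmts.iter().for_each(|s| assigned_names(s, out)),
    }
}

/// Hand-built tree: a `while` whose body is an if/else assigning `y` or `z`.
fn sample() -> Statement {
    let cond = || Box::new(Expression::Var("x".to_string()));
    let assign = |name: &str, n: i32| {
        Box::new(Statement::Assignment(name.to_string(), Box::new(Expression::CInt(n))))
    };
    Statement::While(
        cond(),
        Box::new(Statement::Block(vec![Statement::IfThenElse(
            cond(),
            assign("y", 1),
            Some(assign("z", 2)),
        )])),
    )
}
```

The `Option<Box<Statement>>` in `IfThenElse` is what makes the else branch optional, so the walker has to handle the `None` case explicitly.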
### Types
```rust
pub enum Type {
    TInteger,
    TReal,
    TBool,
    TString,
    TList(Box<Type>),
    TTuple(Vec<Type>),
    TMaybe(Box<Type>),
    TResult(Box<Type>, Box<Type>),
    TFunction(Box<Option<Type>>, Vec<Type>),
    // ... other variants
}
```

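Compound types are built by nesting these variants, which a small pretty-printer makes easy to see. The sketch below works on a local copy of the listed variants. Two things are assumptions rather than documented facts: that `TFunction` holds the return type first and the parameter types second (with `None` meaning no return value), and the output notation, which is invented for this example and is not the language's surface syntax.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Type {
    TInteger,
    TReal,
    TBool,
    TString,
    TList(Box<Type>),
    TTuple(Vec<Type>),
    TMaybe(Box<Type>),
    TResult(Box<Type>, Box<Type>),
    TFunction(Box<Option<Type>>, Vec<Type>),
}

/// Renders a type in an illustrative notation (not the language's own syntax).
fn describe(ty: &Type) -> String {
    match ty {
        Type::TInteger => "Int".to_string(),
        Type::TReal => "Real".to_string(),
        Type::TBool => "Boolean".to_string(),
        Type::TString => "String".to_string(),
        Type::TList(item) => format!("List[{}]", describe(item)),
        Type::TTuple(items) => format!("({})", join(items)),
        Type::TMaybe(inner) => format!("Maybe[{}]", describe(inner)),
        Type::TResult(ok, err) => format!("Result[{}, {}]", describe(ok), describe(err)),
        // Assumption: first field is the return type, second the parameters.
        Type::TFunction(ret, params) => {
            let ret = match &**ret {
                Some(t) => describe(t),
                None => "Unit".to_string(),
            };
            format!("({}) -> {}", join(params), ret)
        }
    }
}

/// Comma-separates the descriptions of several types.
fn join(types: &[Type]) -> String {
    types.iter().map(describe).collect::<Vec<_>>().join(", ")
}
```

The `Box` around recursive fields is required because an enum cannot contain itself by value; `Vec` already provides that indirection for `TTuple` and the parameter list.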
## Error Handling

The parser implements error handling through the `nom` error system:
```rust
pub enum ParseError {
    IndentationError(usize),
    UnexpectedToken(String),
    InvalidExpression(String),
}
```

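One common way to make an error enum like this pleasant to report is to implement `Display` and `std::error::Error` for it. The sketch below does that on a local copy of the enum. The message wording is invented, the meaning of the `usize` in `IndentationError` is not stated in this document (so it is printed as-is), and `report` is a hypothetical helper.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
pub enum ParseError {
    IndentationError(usize),
    UnexpectedToken(String),
    InvalidExpression(String),
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The payload's meaning is an open question here, so print it verbatim.
            ParseError::IndentationError(n) => write!(f, "indentation error ({n})"),
            ParseError::UnexpectedToken(tok) => write!(f, "unexpected token: {tok}"),
            ParseError::InvalidExpression(msg) => write!(f, "invalid expression: {msg}"),
        }
    }
}

// With `Debug` and `Display` in place, the enum can be used as a standard error.
impl std::error::Error for ParseError {}

/// Hypothetical helper: format any error for display to the user.
pub fn report(err: &dyn std::error::Error) -> String {
    format!("parse failed: {err}")
}
```

Implementing the standard traits lets callers propagate `ParseError` with `?` into `Box<dyn std::error::Error>` alongside other error types.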
## Testing

The parser includes a comprehensive test suite in `tests/parser_tests.rs` that verifies:
- Simple assignments
- Complex expressions
- Control flow structures
- Type annotations
- Complete programs
- Error handling
- Whitespace handling

> **Documentation Generation Note**
> This documentation was automatically generated by Claude (Anthropic), an AI assistant, through analysis of the codebase. While the content accurately reflects the implementation, it should be reviewed and maintained by the development team. Last generated: June 2025.
