A simple C compiler written in C that generates x86-64 assembly code. This compiler supports a subset of the C language including functions, variables, control flow, and basic expressions.
- Data Types: `int`, `char`, `void`, pointers, and arrays
- Operators: Arithmetic (`+`, `-`, `*`, `/`, `%`), comparison (`==`, `!=`, `<`, `>`, `<=`, `>=`), logical (`&&`, `||`, `!`), and bitwise operators
- Control Flow: `if`/`else`, `while`, `for`, `break`, `continue`
- Functions: Function definitions and calls with parameters
- Variables: Local and global variable declarations with initializers
- Arrays: Array declarations and access
- Pointers: Basic pointer operations (address-of `&`, dereference `*`)
- String Literals: Basic string literal support
├── src/
│ ├── main.c # Main compiler entry point
│ ├── lexer.c # Lexical analyzer (tokenizer)
│ ├── parser.c # Parser (creates AST)
│ ├── codegen.c # Code generation (x86-64 assembly)
│ ├── symbol.c # Symbol table management
│ └── types.c # Type system utilities
├── include/
│ ├── compiler.h # Common definitions and structures
│ ├── lexer.h # Lexer interface
│ ├── parser.h # Parser interface
│ ├── codegen.h # Code generation interface
│ ├── symbol.h # Symbol table interface
│ └── types.h # Type system interface
├── examples/
│ ├── factorial.c # Factorial function example
└── README.md # This file
Prerequisites:
- GCC compiler
- Make utility
Build the compiler (using GCC, from the repository root):
gcc -std=c99 -Wall -Wextra -Iinclude \
    src/main.c src/lexer.c src/parser.c src/symbol.c src/types.c src/codegen.c \
    -o compiler
Compile the factorial.c example to assembly, then assemble and link it with GCC:
./compiler examples/factorial.c factorial.s && gcc -o factorial factorial.s
program = function*
function = type identifier '(' parameter_list? ')' block
parameter_list = type identifier (',' type identifier)*
statement = block
          | declaration
          | expression_statement
          | if_statement
          | while_statement
          | for_statement
          | return_statement
          | break_statement
          | continue_statement
block = '{' statement* '}'
declaration = type identifier ('[' number ']')? ('=' expression)? ';'
expression_statement = expression ';'
if_statement = 'if' '(' expression ')' statement ('else' statement)?
while_statement = 'while' '(' expression ')' statement
for_statement = 'for' '(' expression? ';' expression? ';' expression? ')' statement
return_statement = 'return' expression? ';'
break_statement = 'break' ';'
continue_statement = 'continue' ';'
expression = assignment
assignment = logical_or ('=' assignment)?
logical_or = logical_and ('||' logical_and)*
logical_and = equality ('&&' equality)*
equality = comparison (('==' | '!=') comparison)*
comparison = term (('<' | '>' | '<=' | '>=') term)*
term = factor (('+' | '-') factor)*
factor = unary (('*' | '/' | '%') unary)*
unary = ('!' | '-' | '+' | '~' | '*' | '&')? primary
primary = number | string | identifier | function_call | array_access | '(' expression ')'
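To make the precedence encoded in the expression rules concrete, here is a minimal recursive-descent sketch in C with one function per rule (`term`, `factor`, `unary`, `primary`). It is an illustration under stated assumptions, not the code in parser.c: it evaluates integer expressions directly instead of building AST nodes, reduces `expression` to `term`, accepts only numbers and parentheses in `primary`, and the names `eval_expression` and `cur` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical cursor into the source text; not part of the project's API. */
static const char *cur;

static long parse_term(void);

static void skip_spaces(void) {
    while (isspace((unsigned char)*cur)) cur++;
}

/* primary = number | '(' expression ')'   (other alternatives omitted) */
static long parse_primary(void) {
    skip_spaces();
    if (*cur == '(') {
        cur++;                        /* consume '(' */
        long value = parse_term();    /* 'expression' is reduced to 'term' here */
        skip_spaces();
        assert(*cur == ')');          /* a real parser would report an error */
        cur++;                        /* consume ')' */
        return value;
    }
    assert(isdigit((unsigned char)*cur));
    char *end;
    long value = strtol(cur, &end, 10);
    cur = end;
    return value;
}

/* unary = ('!' | '-' | '+' | '~')? primary   ('*' and '&' need memory, omitted) */
static long parse_unary(void) {
    skip_spaces();
    char op = *cur;
    if (op == '-' || op == '+' || op == '!' || op == '~') {
        cur++;
        long value = parse_primary();
        switch (op) {
        case '-': return -value;
        case '!': return !value;
        case '~': return ~value;
        default:  return value;       /* unary plus */
        }
    }
    return parse_primary();
}

/* factor = unary (('*' | '/' | '%') unary)* */
static long parse_factor(void) {
    long value = parse_unary();
    for (;;) {
        skip_spaces();
        char op = *cur;
        if (op != '*' && op != '/' && op != '%') return value;
        cur++;
        long rhs = parse_unary();     /* division by zero is not checked here */
        if (op == '*') value *= rhs;
        else if (op == '/') value /= rhs;
        else value %= rhs;
    }
}

/* term = factor (('+' | '-') factor)* */
static long parse_term(void) {
    long value = parse_factor();
    for (;;) {
        skip_spaces();
        char op = *cur;
        if (op != '+' && op != '-') return value;
        cur++;
        long rhs = parse_factor();
        value = (op == '+') ? value + rhs : value - rhs;
    }
}

/* Evaluate an arithmetic expression given as text. */
long eval_expression(const char *text) {
    cur = text;
    return parse_term();
}
```

Because `term` calls `factor` and `factor` calls `unary`, multiplication binds tighter than addition without any explicit precedence table. A parser that builds an AST would keep the same call structure and return nodes instead of values.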
Hello, World:
int main() {
    printf("Hello, World!\n");
    return 0;
}
Factorial:
int factorial(int n) {
    if (n <= 1) {
        return 1;
    }
    return n * factorial(n - 1);
}

int main() {
    int result = factorial(5);
    printf("5! = %d\n", result);
    return 0;
}
Arrays:
int main() {
    int arr[5];
    int i;
    for (i = 0; i < 5; i = i + 1) {
        arr[i] = i * i;
    }
    for (i = 0; i < 5; i = i + 1) {
        printf("%d ", arr[i]);
    }
    printf("\n");
    return 0;
}
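The examples above do not exercise pointers or string literals. The following sketch stays within the listed features (address-of, dereference, string literals, `if`) and is only an assumption about what the compiler accepts. The function names `swap`, `first_char`, and `pointer_demo` are made up for this illustration, and the value 65 assumes ASCII.

```c
/* Exchange two ints through pointers (dereference on both sides of '='). */
void swap(int *a, int *b) {
    int tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Return the first character of a string as an int. */
int first_char(char *s) {
    return *s;
}

/* Combine both helpers; yields 0 if the string literal check fails. */
int pointer_demo() {
    int x = 3;
    int y = 7;
    swap(&x, &y);
    if (first_char("A") != 65) {
        return 0;
    }
    return x * 10 + y;
}
```

Calling `pointer_demo()` from `main` and returning its result would expose the value as the process exit status, which avoids relying on `printf`.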
Limitations:
- No preprocessor support
- No `struct` or `union` types
- No `float` or `double` types
- No standard library functions (except basic ones)
- No dynamic memory allocation
- Limited error recovery
- No optimization passes
- No debugging information generation
The compiler follows a traditional multi-pass design:
- Lexical Analysis: Converts source code into tokens
- Syntax Analysis: Builds an Abstract Syntax Tree (AST)
- Semantic Analysis: Type checking and symbol resolution
- Code Generation: Generates x86-64 assembly code
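As a rough illustration of the last phase, the sketch below walks a small expression tree and emits AT&T-syntax x86-64 assembly using a simple stack-based scheme. The `Node` and `AsmBuffer` types and the functions `emit`, `gen_expr`, `gen_program`, and `asm_contains` are assumptions made for this example. The real codegen.c may use a different AST layout, register strategy, and output mechanism.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical AST for integer expressions; the project's ASTNode may differ. */
typedef struct Node {
    char op;                  /* '+', '-', '*', '/', or 'n' for a number literal */
    long value;               /* used only when op == 'n' */
    const struct Node *lhs;
    const struct Node *rhs;
} Node;

/* Fixed-size output buffer so the sketch needs no file handling.
   Must be zero-initialized before use. */
typedef struct {
    char text[2048];
    size_t len;
} AsmBuffer;

/* Append formatted text, truncating silently if the buffer is full. */
static void emit(AsmBuffer *out, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    size_t room = sizeof out->text - out->len;
    int n = vsnprintf(out->text + out->len, room, fmt, args);
    va_end(args);
    if (n > 0) out->len += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Emit code that leaves the value of `node` in %rax. */
static void gen_expr(AsmBuffer *out, const Node *node) {
    if (node->op == 'n') {
        emit(out, "    movq $%ld, %%rax\n", node->value);
        return;
    }
    assert(node->lhs != NULL && node->rhs != NULL);
    gen_expr(out, node->lhs);
    emit(out, "    pushq %%rax\n");         /* save the left operand */
    gen_expr(out, node->rhs);
    emit(out, "    movq %%rax, %%rdi\n");   /* right operand -> %rdi */
    emit(out, "    popq %%rax\n");          /* left operand  -> %rax */
    switch (node->op) {
    case '+': emit(out, "    addq %%rdi, %%rax\n"); break;
    case '-': emit(out, "    subq %%rdi, %%rax\n"); break;
    case '*': emit(out, "    imulq %%rdi, %%rax\n"); break;
    case '/': emit(out, "    cqto\n    idivq %%rdi\n"); break;
    }
}

/* Wrap the expression in a `main` whose return value is the result. */
void gen_program(AsmBuffer *out, const Node *expr) {
    emit(out, "    .globl main\n");
    emit(out, "main:\n");
    gen_expr(out, expr);
    emit(out, "    ret\n");
}

/* Convenience check for inspecting the generated text. */
int asm_contains(const AsmBuffer *out, const char *needle) {
    return strstr(out->text, needle) != NULL;
}
```

Each subexpression leaves its result in `%rax`, and the left operand is parked on the stack while the right one is computed. This is easy to get right but produces more pushes and pops than a register allocator would. The symbol name `main` assumes Linux naming conventions.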
Key data structures:
- Token: Represents lexical elements (keywords, operators, literals)
- ASTNode: Represents syntax tree nodes
- Type: Represents data types in the type system
- Symbol: Represents identifiers in the symbol table
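One possible shape for these four structures is sketched below. Every field, enum value, and helper here is hypothetical: the actual definitions live in the project's headers and may differ. The sizes in `type_size` assume the usual x86-64 System V values (1-byte `char`, 4-byte `int`, 8-byte pointers), which this README does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOK_KEYWORD, TOK_IDENT, TOK_NUMBER, TOK_STRING,
               TOK_OPERATOR, TOK_EOF } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;        /* first character of the lexeme in the source */
    size_t length;            /* lexeme length in bytes */
    int line;                 /* line number for error messages */
} Token;

typedef enum { TYPE_VOID, TYPE_CHAR, TYPE_INT, TYPE_POINTER, TYPE_ARRAY } TypeKind;

typedef struct Type {
    TypeKind kind;
    const struct Type *base;  /* pointee or element type; NULL for scalars */
    size_t array_length;      /* used only for TYPE_ARRAY */
} Type;

typedef enum { NODE_NUMBER, NODE_IDENT, NODE_BINARY, NODE_CALL,
               NODE_IF, NODE_WHILE, NODE_RETURN } NodeKind;

typedef struct ASTNode {
    NodeKind kind;
    Token token;              /* token that introduced this node */
    const Type *type;         /* filled in during semantic analysis */
    struct ASTNode *lhs;      /* operands or child statements */
    struct ASTNode *rhs;
    struct ASTNode *next;     /* next statement or argument in a list */
} ASTNode;

typedef struct Symbol {
    const char *name;
    const Type *type;
    int stack_offset;         /* offset from the frame pointer for locals */
    struct Symbol *next;      /* next symbol in the same scope */
} Symbol;

/* Size of a type in bytes under the assumed sizes described above. */
size_t type_size(const Type *type) {
    assert(type != NULL);
    switch (type->kind) {
    case TYPE_CHAR:    return 1;
    case TYPE_INT:     return 4;
    case TYPE_POINTER: return 8;
    case TYPE_ARRAY:   return type->array_length * type_size(type->base);
    default:           return 0;   /* void has no size */
    }
}

/* Linear search through one scope's symbol list; NULL if not found. */
const Symbol *symbol_lookup(const Symbol *scope, const char *name) {
    for (const Symbol *s = scope; s != NULL; s = s->next) {
        if (strcmp(s->name, name) == 0) return s;
    }
    return NULL;
}
```

A linked list keeps the symbol table simple at the cost of linear lookups. A hash table would be a natural replacement if lookup time ever mattered.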
Contributing:
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests for new features
- Submit a pull request
Future improvements:
- Add more comprehensive error messages
- Implement `struct` and `union` types
- Add floating-point support
- Implement preprocessor
- Add optimization passes
- Improve debugging information
- Add more built-in functions
- Implement static analysis warnings