Skip to content

Yonas650/decaf-bytecode

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Decaf: a tiny bytecode-compiled language

Decaf is a lean project that walks from source code to a working bytecode VM:

  • hand-written lexer, Pratt-style parser, and AST with source spans
  • semantic resolver with lexical scopes, mutability checks, and call validation
  • bytecode compiler targeting a compact stack machine
  • virtual machine with globals, call frames, tracing, and disassembly tooling
  • CLI for compile, run, and disasm, plus JSON serialization for bytecode artifacts

Quick start

# clone, then install in editable mode with dev tooling
python -m pip install --upgrade pip
pip install -e .[dev]

# run the acceptance programs
decaf run examples/sum_loop.decaf

# inspect the generated bytecode
decaf disasm examples/sum_loop.decaf

# enable VM tracing while executing
decaf run examples/sum_loop.decaf --trace

Working locally without installation? Prefix commands with PYTHONPATH=src python -m decaf.cli ... instead.

Language snapshot

  • Type system: 32-bit signed integers only; truthiness follows C rules (0 is false, non-zero true).
  • Declarations: immutable let and mutable var at global or block scope; immutables reject reassignment.
  • Expressions: integer literals, identifiers, + - * /, parentheses, call expressions.
  • Statements: expression statements, print, blocks, if/else, while, and return.
  • Functions: first-order, positional parameters, lexical scoping, required return on every path; main() is the entry point.

Grammar sketch

program   := (fnDecl | varDecl)* EOF
fnDecl    := "fn" IDENT "(" params? ")" block
varDecl   := ("let" | "var") IDENT "=" expr ";"
stmt      := block | "print" expr ";" | ifStmt | whileStmt | returnStmt | expr ";"
expr      := assignment
assignment:= IDENT "=" assignment | term
term      := factor (("+" | "-") factor)*
factor    := unary (("*" | "/") unary)*
unary     := "-" unary | call

Bytecode & VM

The compiler lowers functions into isolated chunks that share a constant pool. Values live on a stack; locals are addressed by slot, globals by index.

Opcode Stack effect Notes
PUSH_CONST c +1 push constant index c
LOAD_LOCAL i / STORE_LOCAL i ±1 locals live in a contiguous frame
LOAD_GLOBAL g / STORE_GLOBAL g ±1 globals stored in a module-wide array
ADD, SUB, MUL, DIV -1 arithmetic pops two values, pushes result (DIV truncates toward zero)
JMP addr 0 unconditional branch
JMP_IF_FALSE addr -1 pop condition, jump when zero
CALL f argc 1-argc push args left→right, call function f
RET -1 pop return value, restore caller
PRINT -1 pop value, print decimal
POP -1 discard top of stack
HALT 0 stop execution (entry chunk only)

Disassembly snapshot

$ decaf disasm examples/sum_loop.decaf
== fn 0 main ==
0000 line   2 PUSH_CONST    #0 (0)
0003 line   2 STORE_LOCAL   0
0006 line   3 PUSH_CONST    #1 (0)
0009 line   3 STORE_LOCAL   1
...
0053 line   8 LOAD_LOCAL    1
0056 line   8 PRINT
0057 line   9 LOAD_LOCAL    1
0060 line   9 RET
== fn 1 <entry> ==
0000 line   1 CALL           0 main argc=0
0004 line   1 POP
0005 line   1 HALT

Trace mode

$ decaf run examples/sum_loop.decaf --trace
[trace] ip=0 fn=<entry> op=CALL stack=[<empty>]
[trace] ip=0 fn=main op=PUSH_CONST stack=[0,0]
[trace] ip=3 fn=main op=STORE_LOCAL stack=[0,0,0]
...
[trace] ip=53 fn=main op=LOAD_LOCAL stack=[...,-1,0,5,10]
[trace] ip=56 fn=main op=PRINT stack=[...,-1,0,5,10,10]
10

Trace output shows the instruction pointer, function, opcode, and the tail of the operand stack.

Examples

Program Description Output
examples/arithmetic.decaf expression precedence and return 14
examples/variables.decaf global mutation vs. local let 13
examples/if_else.decaf conditional execution 5
examples/sum_loop.decaf accumulation in a while loop 10
examples/factorial.decaf recursive function + multiplication 120

Development

  • Tests: pytest exercises the lexer, parser, resolver, compiler, VM, and serializer. A GitHub Actions workflow runs them on every push/PR targeting main.
  • Formatting: the project sticks to concise, intentional comments (e.g. #describes...) when needed; otherwise the code aims for readability without extra tooling.
  • Bytecode snapshots: disassembler output doubles as golden coverage—update the README samples if you tweak the instruction set.

Suggested workflow

# run unit tests
pytest

# regenerate bytecode JSON for shipping artifacts
decaf compile examples/sum_loop.decaf -o out/sum_loop.bc

# execute the saved bytecode
decaf run --bytecode out/sum_loop.bc

Future ideas

  • boolean literals and comparison opcodes that return 0/1
  • simple CFG-aware optimizer (constant folding, dead-branch pruning)
  • bytecode verifier and max-stack precomputation per function
  • structured error diagnostics (source snippets, suggestions)
  • richer value types (strings) once the VM has a heap strategy

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages