Zero Dependency Rust Workspace

This workspace showcases what modern AI infrastructure looks like when it is written entirely on top of the Rust standard library. Every crate in the repository is dependency-free and built to be instructive: you can inspect, tweak, and extend the full stack without pulling code from crates.io. The primary purpose is educational: learning how the pieces of a transformer-based language model, autograd engine, tensor library, tokenizer, and support tooling fit together. The code is not optimized for performance or production use; it is considerably slower than optimized libraries, but it is simple and easy to follow.

Many modules were written with help from large language models (LLMs), which has a noticeable impact on the code. The overall architecture, however, was designed without LLM assistance.

Unsafe code is used liberally inside the kernels to gain some runtime performance (e.g. in matrix multiplication); in production code it is better to avoid this.
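
To illustrate the kind of trick this refers to, here is a minimal standalone sketch of a row-major matrix multiplication whose inner loops skip bounds checks via get_unchecked. It is an assumed example for illustration, not the actual numeric kernel:

/// Multiply an `m x k` matrix `a` by a `k x n` matrix `b` (both row-major),
/// accumulating into `out` (`m x n`). The asserts hoist the length checks out
/// of the hot loops; inside them, indexing is unchecked.
fn matmul_unchecked(a: &[f32], b: &[f32], out: &mut [f32], m: usize, k: usize, n: usize) {
    assert_eq!(a.len(), m * k);
    assert_eq!(b.len(), k * n);
    assert_eq!(out.len(), m * n);
    for i in 0..m {
        for p in 0..k {
            // SAFETY: i < m and p < k, so i * k + p < m * k == a.len().
            let a_ip = unsafe { *a.get_unchecked(i * k + p) };
            for j in 0..n {
                // SAFETY: p * n + j < k * n and i * n + j < m * n by the asserts above.
                unsafe {
                    *out.get_unchecked_mut(i * n + j) += a_ip * *b.get_unchecked(p * n + j);
                }
            }
        }
    }
}

fn main() {
    let a: [f32; 6] = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]; // 2 x 3
    let b: [f32; 6] = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]; // 3 x 2
    let mut out = [0.0_f32; 4];                       // 2 x 2
    matmul_unchecked(&a, &b, &mut out, 2, 3, 2);
    println!("{out:?}"); // [1.0, 2.0, 4.0, 5.0]
}

The asserts keep the unchecked indexing sound, but code in this style is easy to break silently, which is exactly why it is discouraged outside of an educational playground.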

Idea

I built this project while reading the book Build a Large Language Model from Scratch by Sebastian Raschka, which uses PyTorch to replicate a simple LLM (GPT-2). To be a bit picky, I would say that is not really "from scratch": PyTorch is a huge library with many dependencies, and it hides a lot of important details about training itself, such as autograd and tensor operations.

So I decided to implement the same thing without any dependencies, using only the Rust standard library. It is ultimately a fun exercise: an LLM can help a lot, but it is not magic; you still need to understand what is going on and guide it to produce the code you want.

Highlights

  • GPT-2 style transformer implemented from scratch in neural-net, including embeddings, multi-head attention, feed-forward blocks, layer norms, dropout, and a text generation API.
  • Autograd and training utilities in zero-grad, supporting forward/backward passes, state dicts, and simple optimization loops.
  • Tokenization, dataset, and generation tooling for loading GPT-2 BPE vocabularies (texten), turning raw text into training batches, and streaming generated tokens.
  • Reusable building blocks like numeric (matrix ops), par-iter (Rayon-inspired parallel iterators), small-vec, rand, json, json-derive, and zerr, all written without external crates.

Workspace Layout

  • neural-net: End-to-end neural network stack with GPT-2 model, dataset loaders, text generation utilities, and SafeTensors/Hugging Face weight import helpers.
  • zero-grad: Reverse-mode autograd engine, tensor runtime, and training scaffolding used by neural-net.
  • numeric: Core tensor math and linear algebra primitives (views, matmul, reductions, etc.).
  • par-iter: Parallel iterator combinators (map, zip, reduce, cartesian product) built with std threads/atomics (see the sketch below).
  • small-vec: Small vector optimized for stack storage with transparent heap promotion.
  • rand: Minimal random number utilities (e.g., SimpleRng) for sampling, dropout, and initialization.
  • texten: Zero-dependency BPE tokenizer compatible with GPT-2 vocabularies.
  • json, json-derive: JSON parser/serializer plus a derive macro crate, used for configuration and tooling.
  • zerr: Ergonomic error-handling helpers (context chaining, threading support).
  • json-derive-test: Integration tests and examples for the derive macros.

In addition, the notebooks/ folder contains reference datasets (notebooks/data) and Jupyter notebooks that showcase training and inference workflows.
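
As a taste of how the std-only crates are put together, the sketch below shows the scoped-thread pattern that a crate like par-iter can be built on. It is illustrative only: the function name and signature are invented for this example and say nothing about the real par-iter API.

use std::thread;

/// Apply `f` to every element of `items` in parallel, one chunk per worker
/// thread, and return the results in the original order.
fn par_map<T, U, F>(items: &[T], threads: usize, f: F) -> Vec<U>
where
    T: Sync,
    U: Send,
    F: Fn(&T) -> U + Sync,
{
    let threads = threads.max(1);
    let chunk = ((items.len() + threads - 1) / threads).max(1);
    let f = &f; // share the closure by reference across worker threads
    thread::scope(|s| {
        // Scoped threads may borrow `items` and `f` without Arc or cloning.
        let handles: Vec<_> = items
            .chunks(chunk)
            .map(|part| s.spawn(move || part.iter().map(f).collect::<Vec<U>>()))
            .collect();
        // Join in spawn order so the output stays in input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().unwrap())
            .collect()
    })
}

fn main() {
    let squares = par_map(&[1, 2, 3, 4, 5, 6, 7, 8], 4, |x: &i32| x * x);
    println!("{squares:?}"); // [1, 4, 9, 16, 25, 36, 49, 64]
}

Everything here is standard library: std::thread::scope lets the workers borrow the input slice directly, which is the same property that keeps a crate like par-iter dependency-free.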

GPT-2 from Scratch

The neural-net crate assembles a complete GPT-2 style decoder-only transformer:

  • modules::gpt: model builder with token/position embeddings, transformer stacks, and language-model head.
  • modules::transformer_block & modules::multi_head_attention: full attention pipeline with dropout, causal masking, and QKV projections (a simplified illustration of causal attention follows this list).
  • generation::text_simple: iterative text generation with greedy or top-k sampling, streaming support, and performance metrics.
  • generation::data_loader: slice datasets, batching, and shuffling utilities for training.
  • state::{binary, safetensors, huggingface}: load/save checkpoints, including conversion from Hugging Face naming.
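
To make the causal-masking bullet above concrete, here is a minimal single-head attention sketch in plain std Rust. It is a simplified illustration rather than the neural-net implementation: it omits the QKV projection matrices, multi-head splitting, and dropout, and the function signature is invented for this example.

/// Single-head causal self-attention over `seq_len` vectors of dimension
/// `dim`, with queries `q`, keys `k`, and values `v` stored row-major.
fn causal_attention(q: &[f32], k: &[f32], v: &[f32], seq_len: usize, dim: usize) -> Vec<f32> {
    assert_eq!(q.len(), seq_len * dim);
    assert_eq!(k.len(), seq_len * dim);
    assert_eq!(v.len(), seq_len * dim);
    let scale = 1.0 / (dim as f32).sqrt();
    let mut out = vec![0.0_f32; seq_len * dim];
    for i in 0..seq_len {
        // Causal mask: position i only scores against positions j <= i.
        let mut scores: Vec<f32> = (0..=i)
            .map(|j| {
                let dot: f32 = (0..dim).map(|d| q[i * dim + d] * k[j * dim + d]).sum();
                dot * scale
            })
            .collect();
        // Numerically stable softmax over the unmasked scores.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0;
        for s in scores.iter_mut() {
            *s = (*s - max).exp();
            sum += *s;
        }
        // Output row i is the attention-weighted sum of the visible value rows.
        for (j, s) in scores.iter().enumerate() {
            let w = *s / sum;
            for d in 0..dim {
                out[i * dim + d] += w * v[j * dim + d];
            }
        }
    }
    out
}

fn main() {
    // Two tokens with 2-dimensional vectors; token 0 sees only itself,
    // token 1 attends over both positions.
    let q: [f32; 4] = [1.0, 0.0, 0.0, 1.0];
    let k: [f32; 4] = [1.0, 0.0, 0.0, 1.0];
    let v: [f32; 4] = [0.5, 0.5, 1.0, 2.0];
    println!("{:?}", causal_attention(&q, &k, &v, 2, 2));
}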

Because everything lives in this workspace, you can inspect the entire inference/training path: tensors come from numeric, gradients from zero-grad, scheduling from par-iter, tokenization from texten, and errors bubble through zerr.

Sample Inference

use neural_net::generation::TextGenerator;
use neural_net::modules::GPTModelBuilder;
use neural_net::TensorResult;
use zero_grad::Tensor;

fn main() -> TensorResult<()> {
    // 1. Build a small GPT model (adjust hyperparameters as needed)
    let model = GPTModelBuilder::new()
        .vocab_size(50257)
        .context_length(128)
        .emb_dim(768)
        .n_heads(12)
        .n_layers(12)
        .drop_rate(0.0)
        .build::<f32>()?;

    // 2. Prepare an input prompt (batch size = 1)
    let prompt_ids = Tensor::from(vec![[50256_f32, 318., 257., 2211.]])?; // ~ "hello gpt2"

    // 3. Generate additional tokens with greedy decoding
    let generated = TextGenerator::new()
        .max_new_tokens(32)
        .temperature(1.0)
        .greedy()
        .generate(&model, &prompt_ids, 128)?;

    println!("Generated token ids: {:?}", generated.elements().collect::<Vec<_>>());
    Ok(())
}

Sample Training Step

See neural-net/examples/step.rs for a runnable training demo that:

  1. Loads the GPT-2 BPE vocabulary from notebooks/data/gpt2 using the texten tokenizer.
  2. Creates a SliceDataset from notebooks/data/dataset/verdict.txt with stride-based sampling.
  3. Builds the GPT model and runs a single optimizer step with zero_grad::train.

Run the example:

cargo run -p neural-net --release --example step -- ./notebooks/data
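
For intuition about what a single optimizer step involves, the sketch below spells out the forward → loss → backward → update cycle on a toy linear model in plain Rust. It is a conceptual illustration only; the real example drives the same cycle through the GPT model and the zero-grad autograd engine instead of hand-derived gradients.

fn main() {
    // Toy dataset roughly following y = 2x + 1.
    let xs = [0.0_f32, 1.0, 2.0, 3.0];
    let ys = [1.1_f32, 2.9, 5.2, 6.8];

    // Model parameters (weight, bias) and a learning rate.
    let (mut w, mut b) = (0.0_f32, 0.0_f32);
    let lr = 0.05_f32;

    // One training step: forward pass, mean squared error, gradients, update.
    let n = xs.len() as f32;
    let (mut loss, mut grad_w, mut grad_b) = (0.0_f32, 0.0_f32, 0.0_f32);
    for (&x, &y) in xs.iter().zip(&ys) {
        let pred = w * x + b;          // forward
        let err = pred - y;
        loss += err * err / n;         // loss
        grad_w += 2.0 * err * x / n;   // d(loss)/dw (backward)
        grad_b += 2.0 * err / n;       // d(loss)/db
    }
    w -= lr * grad_w;                  // gradient descent update
    b -= lr * grad_b;
    println!("loss = {loss:.4}, w = {w:.4}, b = {b:.4}");
}

In the real step.rs, the forward pass goes through the GPT model and the gradients come from zero-grad's backward pass rather than being written out by hand.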

Getting Started

  1. Install Rust (stable toolchain is sufficient).
  2. Clone the repository and navigate into it.
  3. Verify everything builds and the tests pass:
cargo test

Running Text Generation with a GPT-2 Model

Download a safetensors GPT-2 model (e.g., from Hugging Face) and place it in .data/model.safetensors. Then run text generation:

cargo run --release --example safetensors -- .data/model.safetensors 

It will be pretty slow, but you should see generated tokens printed to the console in a streaming fashion.

Jupyter Notebooks

The notebooks in notebooks/ mirror some of these flows and include exploratory analysis, GPT fine-tuning experiments, and profiling notes.

Why Zero Dependencies?

  • Transparency & auditability – every line is here, so you can review and modify the full stack.
  • Educational value – learn how tensors, autograd, attention, tokenization, and training loops are implemented under the hood.
  • Portability – minimal compile times and no external supply-chain risk.
  • Experimentation playground – swap components (e.g., attention variants, optimizers) without fighting opaque upstream implementations.
