Building a transformer from scratch
A complete transformer neural network implemented from scratch in Rust with Candle, featuring rotary position embeddings (RoPE), byte-level BPE (BBPE) tokenization, and full multi-head attention.
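For a flavor of what RoPE does, here is a minimal, dependency-free sketch of the rotation it applies to each query/key vector; the function name `apply_rope`, its signature, and the conventional base of 10,000 are illustrative assumptions, not the project's actual Candle-based code.

```rust
// Hypothetical standalone sketch of RoPE on a plain f32 slice; the real
// project would apply the same rotation to Candle tensors instead.
//
// For head dimension d, each pair (x[2i], x[2i+1]) is rotated by the
// angle theta_i = pos * base^(-2i/d), encoding the token's position.
fn apply_rope(x: &mut [f32], pos: usize, base: f32) {
    let d = x.len();
    assert!(d % 2 == 0, "head dimension must be even");
    for i in 0..d / 2 {
        // Per-pair rotation frequency, decreasing with dimension index.
        let theta = pos as f32 * base.powf(-2.0 * i as f32 / d as f32);
        let (sin, cos) = theta.sin_cos();
        let (a, b) = (x[2 * i], x[2 * i + 1]);
        // Standard 2D rotation of the pair by theta.
        x[2 * i] = a * cos - b * sin;
        x[2 * i + 1] = a * sin + b * cos;
    }
}

fn main() {
    // Rotate a toy 8-dimensional query vector at position 3.
    let mut q = [0.1_f32, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8];
    apply_rope(&mut q, 3, 10_000.0);
    println!("{q:?}");
}
```

Because the rotation angle depends only on position and dimension index, the dot product between a rotated query and key depends on their relative offset, which is what makes RoPE attractive for attention.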

There's no rational reason for building a transformer in Rust—except that it's fun.

View on GitHub →