@softmax1

softmax1

Popular repositories

  1. Flash-Attention-Softmax-N Public

    CUDA and Triton implementations of Flash Attention with SoftmaxN.

    Python 70 5

  2. quietGPT Public

    A scaled-down empirical study of "Attention Is Off By One" on nanoGPT

    Python 3

  3. nanoGPT_softmax1 Public

    An experiment comparing nanoGPT with nanoGPT (softmax1) to see how softmax1 affects the perplexity score

    Python 1

  4. EsperBERTo Public

    A test of the "Attention Is Off By One" hypothesis

    Python

  5. nanoGPT_softmax1_reddit Public

    Forked from karpathy/nanoGPT

    The simplest, fastest repository for training/finetuning medium-sized GPTs.

    Python

  6. MosaicBERT-Softmax1 Public

    Python

Repositories

Showing 7 of 7 repositories
