Hyperfitting is a counterintuitive phenomenon in which LLMs pre-trained via next-token prediction become better at open-ended sequence generation on validation data after being aggressively overfitted on a small, fixed set of training samples. The effect is particularly noticeable with greedy decoding, as visible in the examples below.
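In practice, hyperfitting amounts to continuing standard next-token-prediction fine-tuning on the same small sample set until training loss approaches zero. Below is a minimal sketch with Hugging Face transformers; the model name, sample data, learning rate, and stopping threshold are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal hyperfitting sketch: keep fine-tuning a causal LM on the SAME
# small set of samples until training loss approaches zero.
model_name = "gpt2"  # stand-in; the paper hyperfits much larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

samples = [
    "A short passage of natural text.",
    "Another short passage of natural text.",
]  # hyperfitting uses a small, fixed sample set

enc = tokenizer(samples, return_tensors="pt", padding=True)
labels = enc["input_ids"].clone()
labels[enc["attention_mask"] == 0] = -100  # ignore padding in the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for epoch in range(100):  # deliberately far past the usual stopping point
    optimizer.zero_grad()
    out = model(
        input_ids=enc["input_ids"],
        attention_mask=enc["attention_mask"],
        labels=labels,  # standard next-token prediction loss
    )
    out.loss.backward()
    optimizer.step()
    if out.loss.item() < 1e-3:  # stop once the model has (hyper)fitted
        break
```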
- Improved open-ended sequence generation: Via human evaluation, we find that hyperfitting significantly improves the greedy-decoding generations of various models trained via next-token prediction. This held even when compared against larger models and more sophisticated sampling techniques (see the decoding sketch after this list).
- Repetition blocking: Blocking repeated subsequences during generation had minimal impact on output quality (also illustrated in the sketch after this list).
- Sharpened predictions: Hyperfitting reduces entropy, collapsing predictions to strongly favor top-ranked tokens (see the entropy sketch after this list).
- Data shuffling: Training on shuffled versions of the same dataset changed roughly 30% of top-1 predictions, highlighting the stochasticity of the process.
- Training data quantity: Reducing the number of training samples still produced good results with as few as 8 samples (a single batch).
- Instruct models and benchmarks: Hyperfitted models were evaluated on GLUE and MMLU, with hyperfitting only marginally decreasing performance.
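As referenced in the first two findings above, the generation comparisons use plain greedy decoding, optionally with repeated subsequences blocked. The sketch below uses transformers' built-in n-gram blocking as one plausible way to realize that ablation; the paper's exact blocking mechanism and model checkpoints may differ.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # stand-in for a hyperfitted model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Once upon a time"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding: always take the argmax token, no sampling.
greedy = model.generate(
    **inputs, do_sample=False, max_new_tokens=100,
    pad_token_id=tokenizer.eos_token_id,
)

# Same, but forbid any 4-gram from repeating within the output,
# approximating the repeated-subsequence blocking ablation.
blocked = model.generate(
    **inputs, do_sample=False, max_new_tokens=100,
    no_repeat_ngram_size=4, pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(greedy[0], skip_special_tokens=True))
print(tokenizer.decode(blocked[0], skip_special_tokens=True))
```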
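The "Sharpened predictions" finding can be checked directly by measuring the entropy of a model's next-token distribution before and after hyperfitting. A minimal sketch, assuming you have a base and a hyperfitted checkpoint to compare (the model path and prompt are placeholders):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def next_token_entropy(model, tokenizer, prompt):
    """Entropy (in nats) and top-1 probability of the next-token distribution."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # logits for the next token
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs.clamp_min(1e-12))).sum()
    return entropy.item(), probs.max().item()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
base = AutoModelForCausalLM.from_pretrained("gpt2")
# hyperfitted = AutoModelForCausalLM.from_pretrained("path/to/hyperfitted")  # hypothetical

h, p1 = next_token_entropy(base, tokenizer, "The capital of France is")
print(f"entropy={h:.3f} nats, top-1 prob={p1:.3f}")
# A hyperfitted model should show lower entropy and a higher top-1 probability.
```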
We do not encourage blindly adopting hyperfitting into training pipelines. Rather, we encourage further investigation into this counterintuitive phenomenon and what it may entail.
```bibtex
@inproceedings{carlsson2025hyperfitting,
  title={The Hyperfitting Phenomenon: Sharpening and Stabilizing {LLM}s for Open-Ended Text Generation},
  author={Fredrik Carlsson and Fangyu Liu and Daniel Ward and Murathan Kurfali and Joakim Nivre},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025},
  url={https://openreview.net/forum?id=Ij9ilPh36h}
}
```