# NanoGPT

NanoGPT is a simple character-level transformer model built from scratch to generate Shakespearean-style text. It uses a custom tokenizer and trains on a dataset extracted from `shakes.txt`. The model is implemented in PyTorch and supports GPU acceleration.
The NanoGPT project is a lightweight implementation of GPT-style language models. It processes text data and can generate coherent text sequences from a trained model. The project is designed to be simple and efficient while remaining flexible for experimentation.
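Because the tokenizer is character-level, it can be summarized in a few lines. The sketch below is a minimal illustration; variable names such as `stoi` and `itos` are assumptions, not necessarily what `trainer.py` uses:

```python
# Minimal character-level tokenizer sketch.
with open("shakes.txt", "r", encoding="utf-8") as f:
    text = f.read()

chars = sorted(set(text))                     # vocabulary: every distinct character
stoi = {ch: i for i, ch in enumerate(chars)}  # character -> integer id
itos = {i: ch for i, ch in enumerate(chars)}  # integer id -> character

encode = lambda s: [stoi[c] for c in s]             # string -> list of ids
decode = lambda ids: "".join(itos[i] for i in ids)  # list of ids -> string

print(decode(encode("To be, or not to be")))  # round-trips to the input string
```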
## Features

- Lightweight GPT Model: Efficient architecture for text generation.
- Pretrained Model Support: Load an existing `model.pth` for inference.
- Customizable Training: Train on different text datasets.
- Evaluation & Testing: Evaluate performance using test scripts.
- Minimal Dependencies: Simple setup without heavy frameworks.
## Dependencies

- Python
- PyTorch (for model training)
- NumPy (for data processing)
- Torchvision (for potential dataset handling)
- Matplotlib (for visualization)
## Project Structure

```
NANOGPT/
│── __pycache__/         # Cached Python files
│── main.py              # Loads the trained model and generates text
│── model.pth            # Pretrained model checkpoint
│── Nature_of_Code.pdf   # Reference material for training data
│── shakes.txt           # Shakespeare dataset used for training
│── test.py              # Testing script for evaluation
│── trainer.py           # Model training script
│── README.md            # Project documentation
```
## Installation

- Prerequisites: Ensure you have Python 3.8+ and PyTorch installed. If not, install PyTorch using:

  ```bash
  pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
  ```

- Clone the repository:

  ```bash
  git clone https://github.com/Uni-Creator/NanoGPT.git
  cd NanoGPT
  ```

- Install dependencies:

  ```bash
  pip install torch numpy matplotlib
  ```

- Train the model (if needed):

  ```bash
  python trainer.py
  ```
## Usage

- Train the Model: To train the model from scratch, run:

  ```bash
  python trainer.py
  ```

  This will generate a `model.pth` file containing the trained weights (a simplified sketch of such a training step is shown below).
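Under the hood, a character-level training step follows the standard PyTorch pattern. The sketch below is illustrative only: `get_batch` and the bigram stand-in model are hypothetical simplifications, and the real GPT-style model, hyperparameters, and loop live in `trainer.py`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

block_size, batch_size = 64, 32  # illustrative values only

def get_batch(data, block_size, batch_size):
    # Sample random chunks; targets are the inputs shifted right by one character.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x, y

class BigramLM(nn.Module):
    # Stand-in model: a lookup table of next-character logits.
    # The model in this repo is a GPT-style transformer instead.
    def __init__(self, vocab_size):
        super().__init__()
        self.table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        logits = self.table(idx)  # (batch, time, vocab_size)
        loss = None
        if targets is not None:
            loss = F.cross_entropy(logits.view(-1, logits.size(-1)),
                                   targets.view(-1))
        return logits, loss

vocab_size = 65                              # placeholder vocabulary size
data = torch.randint(vocab_size, (10_000,))  # placeholder token stream
model = BigramLM(vocab_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):  # a few steps for illustration
    xb, yb = get_batch(data, block_size, batch_size)
    _, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()

torch.save(model.state_dict(), "model.pth")  # how a checkpoint like model.pth is written
```

Swapping the stand-in for the transformer defined in `trainer.py` changes the model, not the loop: sample a batch, compute the loss, backpropagate, and step the optimizer.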
- Generate Text: To generate text using the trained model, run:

  ```bash
  python main.py
  ```

  You will be prompted to enter a starting text, and the model will generate Shakespearean-style text based on your input. Example:

  ```
  Enter text: Enter BERTRAM, the COUNTESS of Rousillon, HELENA, and LAFEU, all in black.

  Generated text:
  Helena. And you, my lord, sir, captains again.
  First Lord. None you shall healt make royal he did
  Of daughter! Be thither was which
  now wars; it in fither no fetters, or poor him appr.
  ```
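Generation itself is ordinary autoregressive sampling. Here is a minimal sketch, assuming the model returns `(logits, loss)` with logits shaped `(batch, time, vocab_size)`; the exact sampling code in `main.py` may differ:

```python
import torch

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size):
    # idx: (B, T) tensor of token ids used as the starting context.
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]        # crop to the context window
        logits, _ = model(idx_cond)            # model returns (logits, loss)
        logits = logits[:, -1, :]              # keep only the last time step
        probs = torch.softmax(logits, dim=-1)  # convert logits to probabilities
        next_id = torch.multinomial(probs, num_samples=1)  # sample one token
        idx = torch.cat((idx, next_id), dim=1) # append and continue
    return idx
```

Decoding the returned ids with the tokenizer's `decode` function yields the text that gets printed.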
## Customization

- Modify `trainer.py` to change the model architecture, training hyperparameters, or dataset (see the illustrative config sketch after this list).
- Adjust `main.py` to refine text generation.
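As a rough guide, these are the kinds of knobs to look for; the names and defaults below are illustrative assumptions, not the actual variables in `trainer.py`:

```python
import torch

# Illustrative hyperparameters; check trainer.py for the real names and values.
config = dict(
    block_size=64,       # context length in characters
    n_layer=4,           # number of transformer blocks
    n_head=4,            # attention heads per block
    n_embd=128,          # embedding width
    learning_rate=3e-4,  # AdamW step size
    max_iters=5000,      # total training iterations
    device="cuda" if torch.cuda.is_available() else "cpu",
)
```

Smaller values of `n_layer`, `n_embd`, and `block_size` train faster on CPU; larger ones benefit from the GPU path.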
## How It Works

- The model loads a pretrained `model.pth` or trains from scratch (a hedged sketch of this step follows the list).
- It processes an input text prompt.
- The model generates a sequence of text based on learned patterns.
- The output text is displayed and can be saved.
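The load-or-train decision can be sketched as follows; `load_or_init` is a hypothetical helper, and the actual logic in `main.py` may differ:

```python
import os
import torch

def load_or_init(model, path="model.pth"):
    # Hypothetical helper: load pretrained weights if a checkpoint exists;
    # otherwise return the freshly initialized model (train it via trainer.py).
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    if os.path.exists(path):
        model.load_state_dict(torch.load(path, map_location=device))
        model.eval()  # switch to inference mode for generation
    return model
```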
## Future Improvements

- Implement a more efficient Transformer-based architecture.
- Expand the dataset for broader language capabilities.
- Create an interactive web-based demo.
## Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request.
## License

This project is licensed under the MIT License.