This project implements a Variational Autoencoder (VAE)
trained to generate hand-drawn doodles. It learns a compressed latent representation of doodles from the Quick, Draw! dataset
and uses it to generate new, human-like sketches.
- Convolutional encoder that compresses input images into a latent vector
- Reparameterization layer to sample from the latent space (see the sketch below)
- Convolutional decoder that reconstructs images from latent vectors
- Latent space exploration, saved as an animation
- Loss plotting
The training pipeline supports configurable hyperparameters (e.g. latent dimension, beta, batch size, epochs) through a configuration file or command-line arguments.
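For orientation, the encoder, reparameterization step, and decoder fit together roughly as in the sketch below. This is a minimal illustration with assumed names and layer sizes, not the repository's actual code; the real model is convolutional, so the linear layers here are stand-ins only.

```python
import torch
import torch.nn as nn

class DoodleVAE(nn.Module):
    """Minimal sketch of the encode -> sample -> decode flow.

    All names and sizes are illustrative assumptions; the actual
    encoder/decoder in this project are convolutional.
    """

    def __init__(self, latent_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 400), nn.ReLU())
        self.fc_mu = nn.Linear(400, latent_dim)      # mean of q(z|x)
        self.fc_logvar = nn.Linear(400, latent_dim)  # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 400), nn.ReLU(),
            nn.Linear(400, 784), nn.Sigmoid(),       # pixel values in [0, 1]
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I), so sampling stays
        # differentiable with respect to mu and logvar.
        std = torch.exp(0.5 * logvar)
        return mu + torch.randn_like(std) * std

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar
```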
Follow these steps to set up and run the project locally.
Ensure you have Python installed (>= 3.8 recommended).
You can install it from python.org.
- Clone the repository

  ```bash
  git clone https://github.com/yassa9/doodleVAE.git
  cd doodleVAE
  ```
- Install dependencies

  You can install everything with pip:

  ```bash
  pip install torch torchvision matplotlib numpy
  ```
- Prepare your dataset

  - Provide a `<file>.npy` path using `--data-path`.
  - You can get the data from the Quick, Draw! dataset.
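For context, the Quick, Draw! numpy-bitmap files can be inspected as below. Whether `train.py` normalizes the data exactly this way is an assumption; this only shows the expected file layout.

```python
import numpy as np

# Quick, Draw! "numpy bitmap" files store one flattened 28x28 grayscale
# sketch per row (uint8, values 0-255).
data = np.load("cat.npy")  # shape: (num_sketches, 784)

# Reshape to image form and scale to [0, 1] before feeding the model.
images = data.reshape(-1, 28, 28).astype(np.float32) / 255.0
```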
To train the model, run:

```bash
python train.py --data-path path/to/<file>.npy
```
You can customize training with command-line arguments:

```bash
python train.py --data-path cat.npy --epochs 50 --latent-dim 20 --beta 4
```
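Here, `--beta` scales the KL term of the usual β-VAE objective. A rough sketch of that loss, assuming a binary-cross-entropy reconstruction term (the function name and exact formulation in `train.py` may differ):

```python
import torch
import torch.nn.functional as F

def vae_loss(recon_x, x, mu, logvar, beta=1.0):
    # Reconstruction term: how closely the decoded image matches the input.
    recon = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # Closed-form KL divergence between q(z|x) = N(mu, sigma^2) and the
    # standard normal prior N(0, I).
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # beta > 1 (e.g. --beta 4) puts more pressure on the KL term, trading
    # reconstruction quality for a better-structured latent space.
    return recon + beta * kld
```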
| Argument | Description |
| --- | --- |
| `--epochs` | Number of training epochs |
| `--batch-size` | Training batch size |
| `--latent-dim` | Dimensionality of the latent space |
| `--beta` | Beta weight for the KL divergence term |
| `--lr` | Learning rate |
| `--save-dir` | Directory to save the model and plots |
| `--no-explore` | Skip the final latent interpolation animation |
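The exploration animation that `--no-explore` skips is, conceptually, latent interpolation: decode evenly spaced points along a path between two latent vectors and play the frames in sequence. A minimal sketch (the `decoder` argument and its call signature are assumptions, not the project's API):

```python
import torch

@torch.no_grad()
def interpolate(decoder, z_start, z_end, steps=30):
    # Decode points on the straight line between two latent vectors;
    # played back in order, the frames morph one doodle into another.
    frames = []
    for t in torch.linspace(0.0, 1.0, steps):
        z = (1 - t) * z_start + t * z_end
        frames.append(decoder(z.unsqueeze(0)).squeeze(0))
    return torch.stack(frames)
```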