SketchAgent: Language-Driven Sequential Sketch Generation


Yael Vinker, Tamar Rott Shaham, Kristine Zheng, Alex Zhao, Judith E Fan, Antonio Torralba


SketchAgent leverages an off-the-shelf multimodal LLM to facilitate language-driven, sequential sketch generation through an intuitive sketching language. It can sketch diverse concepts, engage in interactive sketching with humans, and edit content via chat.

Setup

Clone the repository and navigate to the project folder:

git clone https://github.com/yael-vinker/SketchAgent.git
cd SketchAgent

Set up the environment:

conda env create -f environment.yml
conda activate sketch_agent

For Mac users, use the following environment file instead:

conda env create -f mac_environment.yml
conda activate sketch_agent

If Python prints a warning, try reinstalling cairosvg:

conda uninstall cairosvg && conda install cairosvg

API Key

This repository requires an Anthropic API key. If you don't have one, create an Anthropic account and follow the instructions to obtain a key.

Once you have the key, save it in the .env file:

ANTHROPIC_API_KEY=<your_key>
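Before running any of the scripts, it can help to confirm that the key is actually readable. The following is a minimal sketch, assuming the .env file sits in the current working directory and uses plain KEY=value lines. The helper name read_env_key is hypothetical and not part of the repository, which may load the key differently.

```python
from pathlib import Path
from typing import Optional


def read_env_key(name: str, env_path: str = ".env") -> Optional[str]:
    """Return the value for `name` from a simple KEY=value file, or None.

    Hypothetical helper for checking the setup; not part of SketchAgent.
    """
    path = Path(env_path)
    if not path.is_file():
        return None
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes;
            # treat an empty value as missing.
            return value.strip().strip("'\"") or None
    return None


# Example: report whether the key is present without printing its value.
# print("found" if read_env_key("ANTHROPIC_API_KEY") else "missing")
```

The check only reports presence, so the key itself never ends up in terminal output or logs.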

Start Sketching! 👩‍🎨 🎨

Text-to-Sketch

Generate a single sketch by running:

python gen_sketch.py --concept_to_draw "<your_concept_here>" 

For example:

python gen_sketch.py --concept_to_draw "sailboat" 

Optional arguments:

  • --seed_mode: Defaults to "deterministic" for reproducible results. Set to "stochastic" for increased variability.
  • --path2save: Output directory. By default, results are saved to results/test/.
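To sketch several concepts in one go, the command above can be wrapped in a small driver. This is only a sketch under stated assumptions: it is meant to be run from the repository root, it uses just the three flags documented above, and the helper names build_command and sketch_all are hypothetical (they do not exist in the repository).

```python
import subprocess
import sys
from typing import List


def build_command(concept: str, seed_mode: str = "deterministic",
                  path2save: str = "results/test/") -> List[str]:
    """Build the argument list for one gen_sketch.py run (hypothetical helper)."""
    # Only the two seed modes documented in this README are accepted.
    if seed_mode not in ("deterministic", "stochastic"):
        raise ValueError(f"unknown seed_mode: {seed_mode!r}")
    return [
        sys.executable, "gen_sketch.py",
        "--concept_to_draw", concept,
        "--seed_mode", seed_mode,
        "--path2save", path2save,
    ]


def sketch_all(concepts: List[str], **kwargs) -> None:
    """Run gen_sketch.py once per concept, stopping at the first failure."""
    for concept in concepts:
        # check=True raises CalledProcessError if a run exits with an error.
        subprocess.run(build_command(concept, **kwargs), check=True)


# Example (illustrative concepts; run from the repository root):
# sketch_all(["sailboat", "cat", "bicycle"], seed_mode="stochastic")
```

Passing the arguments as a list avoids shell quoting issues for multi-word concepts such as "a cat on a bicycle".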

Collaborative Sketching

Collaborate with SketchAgent by alternating strokes! To use the interactive interface:

python collab_sketch.py

This will launch a Flask-based web application. Once running, look for the following output in the terminal:

Server running at: http://<your-ip-address>:5000

Open the provided URL in your web browser to interact with the application. Results are saved to results/collab_sketching/. Use the text box to change the concept to be drawn.

Tips:

  • The gen_sketch.py script can produce varied sketches. Try running it multiple times (for example, with --seed_mode set to "stochastic") to explore different outcomes.
  • Prompts are available in the prompts.py file. For unique concepts, ensure that your input prompt is clear and meaningful.

TODOs

  • Add support for chat-based editing.
  • Add SVG drawing process animations in HTML.
  • Add support for other backbone models (GPT-4o, Llama 3).

Citation

If you find this useful for your research, please cite the following:

@misc{vinker2024sketchagent,
      title={SketchAgent: Language-Driven Sequential Sketch Generation}, 
      author={Yael Vinker and Tamar Rott Shaham and Kristine Zheng and Alex Zhao and Judith E Fan and Antonio Torralba},
      year={2024},
      eprint={2411.17673},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2411.17673}, 
}
