Data Genie: Synthetic Data Generator

This project generates synthetic data for testing, prototyping, and analytics. It provides a web interface (frontend in React, backend in Flask) and supports custom data schemas, LLM-powered data generation (Ollama, OpenAI, Azure OpenAI), and export to CSV/JSON.

Features

  • Web frontend built with React (see frontend/)
  • Flask backend for API and LLM integration
  • Customizable data schemas
  • LLM-powered synthetic data generation (Ollama, OpenAI, Azure OpenAI)
  • Example report output (report.html)

Quick Start

  1. Clone this repo and enter the folder:

    git clone https://github.com/simagix/data-genie.git
    cd data-genie
  2. Create and activate a virtual environment:

    python3 -m venv venv
    source venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. (Optional) Install Ollama and pull a model for LLM-powered generation (see below).

  5. Run the backend:

    python app.py
  6. Run the frontend:

    cd frontend
    npm install
    npm start

    If you see dependency errors (ERESOLVE), run:

    npm install --legacy-peer-deps

    This skips npm's strict peer-dependency checks and works around most install failures caused by older or conflicting packages.

LLM Backends

  • Ollama (local):
    • Install via Homebrew (macOS) and pull a model:
      brew install ollama
      ollama pull mistral:7b-instruct
    • Set environment variables (optional):
      LLM_BACKEND=ollama
      OLLAMA_URL=http://localhost:11434/api/generate
      OLLAMA_MODEL=mistral:7b-instruct
  • OpenAI (cloud):
    • Set environment variables:
      OPENAI_API_KEY=sk-...
      OPENAI_MODEL=gpt-4o
  • Azure OpenAI (cloud):
    • Set environment variables:
      LLM_BACKEND=azure
      AZURE_OPENAI_ENDPOINT=https://<resource>.openai.azure.com/
      AZURE_OPENAI_API_KEY=...
      AZURE_OPENAI_DEPLOYMENT=gpt-4
      AZURE_OPENAI_API_VERSION=2024-12-01-preview
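
The sketch below shows how a client can call Ollama's local /api/generate endpoint using the same environment variables listed above. It is illustrative only, assuming the requests package is installed; it is not Data Genie's internal implementation.

    # Minimal sketch: call Ollama's local generate API with the env vars above.
    import os
    import requests

    OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434/api/generate")
    OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "mistral:7b-instruct")

    def generate(prompt: str) -> str:
        payload = {"model": OLLAMA_MODEL, "prompt": prompt, "stream": False}
        resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
        resp.raise_for_status()
        # With "stream": False, Ollama returns one JSON object whose
        # "response" field holds the generated text.
        return resp.json()["response"]

    if __name__ == "__main__":
        print(generate("Generate 3 realistic first names as a JSON array."))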

Run the App

Start Flask backend:

python app.py

Start React frontend:

cd frontend
npm start

By default, the frontend runs at http://localhost:3000 and the backend at http://localhost:5000.
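
If you want to confirm both servers are up before using the UI, a quick sanity check like the following works; it only checks that each port answers HTTP and does not assume any specific route exposed by the app.

    # Sanity check: any HTTP response means the server is running.
    import requests

    for name, url in [("backend", "http://localhost:5000"), ("frontend", "http://localhost:3000")]:
        try:
            resp = requests.get(url, timeout=5)
            print(f"{name}: reachable (HTTP {resp.status_code})")
        except requests.RequestException:
            print(f"{name}: not reachable at {url}")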

Usage

  1. Open the frontend in your browser.
  2. Define a data schema or use a preset.
  3. Generate synthetic data (optionally using an LLM).
  4. Export the results to CSV or JSON.
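
To make the schema-to-export idea concrete, here is a toy, self-contained sketch of the workflow. The schema format and field names below are hypothetical and do not reflect Data Genie's actual schema definition; the app handles this for you through the web UI.

    # Illustrative only: a toy schema -> rows -> CSV/JSON example.
    import csv
    import json
    import random

    # Hypothetical schema: each field maps to a generator function.
    schema = {
        "name": lambda: random.choice(["Ada", "Grace", "Alan", "Edsger"]),
        "age": lambda: random.randint(18, 90),
        "active": lambda: random.choice([True, False]),
    }

    rows = [{field: gen() for field, gen in schema.items()} for _ in range(5)]

    with open("sample.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(schema))
        writer.writeheader()
        writer.writerows(rows)

    with open("sample.json", "w") as f:
        json.dump(rows, f, indent=2)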

Notes

  • LLM features require Ollama, OpenAI, or Azure OpenAI and a supported model.
