Visionary.AI is a next-gen visual assistant powered by Gemini Vision API. Upload an image, ask questions, and receive deep contextual insights in real-time. From document interpretation to object recognition, Visionary.AI helps you see, know, and act—smarter.
🔗 Click to Try Visionary.AI
🎥 Watch Demo Video
- 🧠 Multimodal Intelligence: Understands images and responds to custom queries using Gemini's Vision API.
- 📄 Context Extraction: Summarizes documents, posters, charts, or scenes.
- 🎤 Voice Input & Output: Ask questions via mic and get AI answers in speech.
- 🖼️ Smart Upload UI: Drag and drop images, annotate, or use webcam.
- 🧾 Response Types: Summaries, bullet points, key details, and follow-ups.
- 🌙 Dark Mode & Theming: Switch between light/dark gradients dynamically.
- 🕓 History Log: See past queries and revisit old results (local cache).
- ⚡ Responsive & Mobile-Ready: Works seamlessly across devices.
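The voice feature above builds on the browser's Web Speech API. A minimal sketch of the wiring (helper names are illustrative, not taken from the repo); both helpers degrade gracefully when the API is unavailable:

```javascript
// Illustrative Web Speech API helpers (names are not from the actual codebase).
// Returns a configured SpeechRecognition instance, or null when the browser
// (or a non-browser runtime) does not support speech recognition.
function createRecognizer(onTranscript) {
  const SR = globalThis.SpeechRecognition || globalThis.webkitSpeechRecognition;
  if (!SR) return null;
  const recognizer = new SR();
  recognizer.lang = "en-US";
  recognizer.interimResults = false;
  recognizer.onresult = (event) => {
    // Forward the final transcript of the first result to the caller.
    onTranscript(event.results[0][0].transcript);
  };
  return recognizer;
}

// Speaks the AI answer aloud; returns false when speech synthesis is missing.
function speak(text) {
  if (!globalThis.speechSynthesis) return false;
  globalThis.speechSynthesis.speak(new SpeechSynthesisUtterance(text));
  return true;
}
```

Calling `recognizer.start()` then prompts the user for microphone access; the transcript can be fed straight into the query box.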
| Layer | Tools |
| --- | --- |
| Frontend | React.js, Tailwind CSS, Redux Toolkit |
| AI APIs | Gemini Vision (Google AI Studio), Web Speech API |
| Animations | Framer Motion, Lottie |
| Deployment | Azure Static Web Apps / Firebase Hosting |
| Voice | Web Speech API (Speech-to-Text + TTS) |
| State | Redux or Recoil |
```
/src
  /components
    UploadImage.jsx
    OutputPanel.jsx
    VoiceInput.jsx
    Loader.jsx
  /api
    gemini.js
  /redux
    store.js
  /utils
    promptBuilder.js
  App.jsx
  index.js
```
- 📄 User uploads an image or uses webcam input
- ✍️ Optional: User types a query (e.g. “What’s in this prescription image?”)
- ⚙️ Frontend builds prompt and sends to Gemini Vision API
- 🧠 Gemini returns contextual result → displayed in animated output panel
- 🔀 User can ask follow-up or switch to voice interaction
```javascript
// Example API call
const response = await geminiVision.generate({
  image: uploadedImage,
  prompt: "Summarize the content of this image",
});
```
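Step 3 of the workflow ("frontend builds prompt") corresponds to `/utils/promptBuilder.js`. A possible sketch, assuming a default instruction is used whenever the user types no query (the repo's actual logic may differ):

```javascript
// utils/promptBuilder.js (sketch; actual repo logic may differ)
const DEFAULT_PROMPT = "Describe this image and summarize its key details.";

// Combine an optional user query with a default instruction so Gemini
// always receives a non-empty, trimmed prompt.
function buildPrompt(userQuery) {
  const query = (userQuery || "").trim();
  return query.length > 0 ? query : DEFAULT_PROMPT;
}
```

Keeping this in a small pure helper makes it easy to unit-test and to extend later (e.g. appending formatting hints like "answer in bullet points").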
Use an environment variable in `.env` for the API key:

```
VITE_GEMINI_API_KEY=your_key_here
```
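Vite exposes variables prefixed with `VITE_` to client code as `import.meta.env.VITE_GEMINI_API_KEY`. One way `/api/gemini.js` could assemble the JSON body for a Gemini Vision `generateContent` request (a sketch; the field names follow the public Generative Language REST API at time of writing, so verify against the current docs):

```javascript
// api/gemini.js (sketch): build the request body for a Gemini Vision call.
// The image is passed as a base64 string plus its MIME type.
function buildGeminiRequest(prompt, base64Image, mimeType) {
  return {
    contents: [
      {
        parts: [
          { text: prompt },
          { inline_data: { mime_type: mimeType, data: base64Image } },
        ],
      },
    ],
  };
}
```

The body is then POSTed with `fetch` to the `generateContent` endpoint, passing the key from `import.meta.env`. Note that shipping the key to the browser exposes it; a thin server-side proxy is safer for production.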
| Upload Image | Gemini Response |
| --- | --- |
| ![]() | ![]() |
```bash
git clone https://github.com/your-username/visionary-ai.git
cd visionary-ai
npm install
npm run dev
```
- ✏️ Students analyzing handwritten notes
- 🧾 Professionals summarizing documents/posters
- 👃 Accessibility for visually impaired users via spoken responses
- 🍭 Image-based product search & insight
- 🔍 OCR overlay highlights on image
- 🌍 Multilingual voice translation
- 🧠 Model comparison (Gemini vs GPT-4 Vision)
- 🧾 Export response to PDF/Notion
- 💡 Devanshi Awasthi – Project Lead, Frontend Dev, AI Integrator
Want to contribute? Feel free to fork, star, and PR! ⭐
MIT License © 2025 Devanshi Awasthi
If you like this project, leave a ⭐ on GitHub! For feature requests or bugs, open an issue!