This project is a web-based application that integrates Three.js for 3D rendering, WebXR for VR hand input, and Flask for serving the application. It also uses YOLO for object detection in video feeds. The application has been modularized for better maintainability, scalability, and development experience.
- 3D Environment: Built with Three.js, allowing for interactive 3D scenes.
- VR Hand Input: Utilizes WebXR for hand tracking and interaction.
- Voice Chat with AI: Integrated speech recognition with AI responses via the Hugging Face API (Google Gemma 2-9B model).
- Object Detection: Integrates YOLO for real-time object detection in video feeds.
- Flask Backend: Serves the application and handles video processing.
- Interactive UI: Includes a map and chat system embedded in iframes.
- Modular Architecture: Clean, maintainable code structure with separated components.
This project follows a modular architecture with clear separation of concerns:
- Component-based design with ES6 modules
- Entity Component System (ECS) for 3D object management
- Manager pattern for different subsystems
- Utility modules for shared functionality
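As a rough illustration of the ECS and manager patterns listed above, the sketch below shows an ECSY-style component and system. The `Spin` component, `SpinSystem`, and their fields are hypothetical examples, not the project's actual code in `ECSComponents.js` / `ECSSystems.js`:

```js
import { World, Component, System, Types } from 'ecsy';

// Illustrative component; the project's real components live in ECSComponents.js.
class Spin extends Component {}
Spin.schema = {
  object3D: { type: Types.Ref, default: null }, // the Three.js object to rotate
  speed: { type: Types.Number, default: 1 },    // radians per second
};

// Illustrative system; the project's real systems live in ECSSystems.js.
class SpinSystem extends System {
  execute(delta /* seconds since last frame */) {
    for (const entity of this.queries.spinners.results) {
      const { object3D, speed } = entity.getComponent(Spin);
      if (object3D) object3D.rotation.y += speed * delta;
    }
  }
}
SpinSystem.queries = { spinners: { components: [Spin] } };

// Registration and per-frame update, typically wired up once in main.js.
const world = new World();
world.registerComponent(Spin).registerSystem(SpinSystem);
// ...inside the render loop: world.execute(delta, time);
```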
```
static/
├── css/
│   └── main.css                 # All styles and animations
├── js/
│   ├── main.js                  # Main application entry point
│   ├── components/              # Reusable components
│   │   ├── ECSComponents.js     # Entity Component System components
│   │   ├── SceneManager.js      # Three.js scene setup
│   │   ├── WebXRManager.js      # VR/AR and hand tracking
│   │   ├── UIManager.js         # UI elements and menus
│   │   └── VoiceChatSystem.js   # Voice chat with AI
│   ├── systems/
│   │   └── ECSSystems.js        # ECS processing systems
│   └── utils/
│       ├── controls.js          # Input handling
│       └── datetime.js          # DateTime utilities
```
For detailed information about the modular architecture, see MODULARIZATION_GUIDE.md.
- Python 3.x
- Node.js and npm (for Three.js and other frontend dependencies)
- Flask
- OpenCV
- Ultralytics YOLO
- Modern web browser with WebXR support
- Clone the repository:

  ```bash
  git clone https://github.com/NafisRayan/Thesis
  cd Thesis
  ```

- Set up the Python environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows use `venv\Scripts\activate`
  ```

- Install Python dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Download the YOLO model:
  - The YOLO model file (`yolo11n.pt`) should be placed in the project directory. You can download it from [link to model].

- Run the Flask application:

  ```bash
  python app.py
  ```

- Update `app.py` to use the modular template (if not already done):

  ```python
  @app.route('/')
  def index():
      return render_template('index_modular.html')  # Use modular version
  ```

- Access the application:
  - Open your web browser and go to `http://localhost:5000`.
- WASD: Move camera in 3D space
- Q/E: Move camera up/down
- Mouse: Look around (click and drag)
- VR Mode: Use the VR button for immersive experience
- Voice Chat: Click the voice button or use menu toggle
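As a rough sketch of how `utils/controls.js` might translate these bindings into camera movement (the key map, `moveSpeed`, and function names below are assumptions, not the project's exact implementation):

```js
// Illustrative keyboard handling for the WASD / Q / E bindings above.
const keys = new Set();
window.addEventListener('keydown', (e) => keys.add(e.code));
window.addEventListener('keyup', (e) => keys.delete(e.code));

const moveSpeed = 5; // units per second (assumed value)

function updateCamera(camera, delta) {
  const step = moveSpeed * delta;
  if (keys.has('KeyW')) camera.translateZ(-step);  // forward
  if (keys.has('KeyS')) camera.translateZ(step);   // backward
  if (keys.has('KeyA')) camera.translateX(-step);  // left
  if (keys.has('KeyD')) camera.translateX(step);   // right
  if (keys.has('KeyQ')) camera.position.y += step; // up
  if (keys.has('KeyE')) camera.position.y -= step; // down
}
```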
- `app.py`: The main Flask application file
- `templates/index_modular.html`: Clean, modular HTML template
- `templates/index.html`: Original monolithic file (legacy)
- `static/`: Modularized static files (CSS, JavaScript)
- `assets/`: 3D models, fonts, and other assets
- `MODULARIZATION_GUIDE.md`: Detailed guide on the modular architecture
- `VOICE_CHAT_SETUP.md`: Voice chat system documentation
- `Readme.md`: This file
- SceneManager: Handles Three.js scene, lighting, and 3D models
- WebXRManager: Manages VR/AR functionality and hand tracking
- UIManager: Controls 3D menus, buttons, and interactions
- VoiceChatSystem: Handles speech recognition and AI responses
- ControlsManager: Manages keyboard/mouse input
- ECS System: Entity-Component architecture for 3D objects
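A simplified, hypothetical view of how `main.js` could wire these components together; the constructor arguments and method names here are assumptions, not the project's exact API:

```js
// main.js (illustrative wiring of the managers listed above)
import { SceneManager } from './components/SceneManager.js';
import { WebXRManager } from './components/WebXRManager.js';
import { UIManager } from './components/UIManager.js';
import { VoiceChatSystem } from './components/VoiceChatSystem.js';
import { ControlsManager } from './utils/controls.js';

const sceneManager = new SceneManager();           // scene, camera, lights, 3D models
const xrManager = new WebXRManager(sceneManager);  // VR/AR session + hand tracking
const uiManager = new UIManager(sceneManager);     // 3D menus and buttons
const voiceChat = new VoiceChatSystem();           // speech recognition + AI replies
const controls = new ControlsManager(sceneManager.camera);

function animate() {
  controls.update();
  uiManager.update();
  sceneManager.render();
  requestAnimationFrame(animate);
}
requestAnimationFrame(animate);
```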
- Video Source: The video source is set to `video.mp4` in `app.py`. Change this to `0` for the default camera or to another video file.
- Model Path: Update the YOLO model path in `app.py` if necessary.
- AI API Key: Update the Hugging Face API key in `templates/index.html` as described in `VOICE_CHAT_SETUP.md` for voice chat functionality (an illustrative API call is sketched after this list).
- Template Selection: Choose between `index.html` (original) or `index_modular.html` (recommended) in `app.py`.
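For context on the AI API Key setting, a browser-side call to the Hugging Face Inference API typically looks like the sketch below. The endpoint pattern is standard, but the model ID and key handling shown here are illustrative rather than the project's exact code (see `VOICE_CHAT_SETUP.md`):

```js
// Illustrative call to the Hugging Face Inference API; not the project's exact code.
const HF_API_KEY = 'hf_xxx'; // placeholder — configure as described in VOICE_CHAT_SETUP.md

async function askModel(prompt) {
  const response = await fetch(
    'https://api-inference.huggingface.co/models/google/gemma-2-9b-it', // assumed model ID
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${HF_API_KEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ inputs: prompt }),
    }
  );
  return response.json(); // response shape depends on the model/task
}
```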
- Create a new component in the appropriate directory (`components/`, `systems/`, or `utils/`) — see the example after this list
- Import and initialize the component in `main.js`
- Add required styles to `main.css`
- Update documentation as needed
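For example, a new component might look like the following; the `FPSCounter` name and its API are purely illustrative:

```js
// static/js/components/FPSCounter.js — hypothetical example of a new component
export class FPSCounter {
  constructor() {
    this.el = document.createElement('div');
    this.el.className = 'fps-counter'; // style this class in main.css
    document.body.appendChild(this.el);
    this.last = performance.now();
  }

  update() {
    const now = performance.now();
    this.el.textContent = `${Math.round(1000 / (now - this.last))} FPS`;
    this.last = now;
  }
}

// Then, in main.js:
// import { FPSCounter } from './components/FPSCounter.js';
// const fps = new FPSCounter(); // call fps.update() in the render loop
```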
- Maintainability: Easy to locate and fix bugs
- Reusability: Components can be reused across projects
- Scalability: Easy to add features without affecting existing code
- Testing: Each component can be tested independently
- Performance: Better caching and optimization possibilities
- Ensure all dependencies are installed correctly.
- Check the browser console for any JavaScript errors.
- Verify the paths to assets and models are correct.
- For WebXR issues, ensure you're using a compatible browser and device.
- For voice chat issues:
  - Check microphone permissions and browser compatibility.
  - Ensure the Hugging Face API key is correctly configured in `templates/index.html`.
  - Verify that the necessary audio permissions are granted in the browser.
- Modules not loading: Ensure you're serving the app via HTTP/HTTPS, not file://
- VR not working: Check WebXR browser support and device compatibility
- Voice chat not responding: Verify microphone permissions and API key configuration
- 3D models not loading: Check GLTF file paths in `SceneManager.js` (see the loader sketch below)
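For the last point, GLTF loading in `SceneManager.js` most likely follows the standard Three.js `GLTFLoader` pattern, so a bad path surfaces in the error callback; the asset path and variable names below are assumptions:

```js
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

// `scene` is the Three.js scene managed by SceneManager (assumed in scope here).
const loader = new GLTFLoader();
loader.load(
  'assets/models/scene.gltf',                       // assumed path — must match your assets/ layout
  (gltf) => scene.add(gltf.scene),                  // success: add the loaded model to the scene
  undefined,                                        // progress callback (unused here)
  (err) => console.error('GLTF load failed:', err)  // wrong paths show up here
);
```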
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature`).
- Follow the modular architecture when adding new features.
- Update documentation for any new components.
- Commit your changes (`git commit -am 'Add new feature'`).
- Push to the branch (`git push origin feature/your-feature`).
- Create a new Pull Request.
- Follow the established modular pattern
- Keep components focused on single responsibilities
- Document new components in `MODULARIZATION_GUIDE.md`
- Maintain backward compatibility when possible
- Add appropriate error handling and logging
- Three.js - 3D graphics library
- Flask - Python web framework
- Ultralytics YOLO - Object detection
- WebXR - VR/AR web standards
- ECSY - Entity Component System
- Hugging Face - AI model inference
This project is licensed under the MIT License - see the LICENSE file for details.
Note: This project has been modularized for better maintainability and development experience. See `MODULARIZATION_GUIDE.md` for detailed information about the architecture and how to work with the modular components.