TEAM PIXELS #75

96 changes: 96 additions & 0 deletions READ ME TEAM PIXELS
@@ -0,0 +1,96 @@
# team-pixels
Submission folder

Team name: Pixels

Problem statement: "Empowering the visually impaired through 'Trinetra: an eye for the blind' – a groundbreaking assistive vision technology for enhanced independence and inclusivity."

Team leader: [email protected]



A Brief of the Prototype:

**Prototype Name:** "Trinetra: An Eye for the Blind"

**Description:**
"Trinetra" is a Python-based technological marvel, meticulously crafted to serve as an indispensable aid to visually impaired individuals. It harnesses the power of real-time object detection using OpenCV and seamlessly integrates with Google Maps, delivering a profound solution: voice-guided navigation. This innovation empowers users with independence and heightened safety, significantly improving their ability to navigate the world.

**Key Features:**
1. **Object Detection:** "Trinetra" utilizes OpenCV for real-time object detection, providing users with vital information about their surroundings, including obstacles, landmarks, and objects in their path.

2. **Google Maps Integration:** Seamlessly integrating with Google Maps, "Trinetra" offers turn-by-turn directions, real-time location tracking, and invaluable insights about nearby landmarks and obstacles. Users can effortlessly input their destination and receive essential voice-guided navigation instructions.

3. **Voice Guidance:** Unlike conventional visual directions, "Trinetra" provides clear, real-time voice-guided navigation, enabling users to navigate confidently and enhancing both safety and independence (a minimal detection-to-speech sketch follows this list).
4. **Compactness:** Because the code is so compact, it can be developed into a cloud-based system, making it even more accessible.
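
Below is a minimal sketch of the detection-to-speech idea behind features 1 and 3, essentially a condensed, single-frame view of what the attached speakingtri.py does:

```python
# Minimal sketch: detect objects in one camera frame and speak them aloud.
import cv2
import cvlib
from cvlib.object_detection import draw_bbox
import pyttsx3

engine = pyttsx3.init()
capture = cv2.VideoCapture(0)

ret, frame = capture.read()
if ret:
    # YOLOv3-based detection of common objects (person, chair, car, ...)
    bbox, labels, conf = cvlib.detect_common_objects(frame, model='yolov3', confidence=0.5)
    frame = draw_bbox(frame, bbox, labels, conf)
    if labels:
        engine.say("I see the following objects: " + ", ".join(labels))
        engine.runAndWait()

capture.release()
```
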
**Benefits:**
- **Independence:** "Trinetra" bestows visually impaired individuals with newfound independence, reducing their reliance on external assistance.

- **Safety and Confidence:** Through real-time object detection and voice-guided navigation, "Trinetra" significantly enhances safety and instills users with the confidence to navigate their surroundings securely.

- **Inclusivity:** Guided by the principles of universal design, "Trinetra" ensures that advanced technology serves the needs of all, regardless of their visual abilities.

- **Database:** A special database can be created for each user. If a person meets the user frequently, that person's identity can be stored in the database, so that the next time they are recognised they are announced by name, which makes the user experience more personal and comfortable (see the sketch below).
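
A possible sketch of this familiar-person database idea, assuming the face_recognition package (this is future scope and not part of the current prototype):

```python
# Hypothetical sketch of the proposed familiar-person database (future idea only).
# Assumes the face_recognition package is installed.
import face_recognition

known_encodings = []   # 128-dimensional face encodings of people the user meets often
known_names = []       # matching names, e.g. ["Asha", "Ravi"] (example values)

def remember_person(image_path, name):
    """Store one face encoding under the given name."""
    image = face_recognition.load_image_file(image_path)
    encodings = face_recognition.face_encodings(image)
    if encodings:
        known_encodings.append(encodings[0])
        known_names.append(name)

def identify_person(rgb_frame):
    """Return the stored name if a known face appears in the frame, else None.

    OpenCV frames are BGR; convert with cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) first.
    """
    for encoding in face_recognition.face_encodings(rgb_frame):
        matches = face_recognition.compare_faces(known_encodings, encoding)
        if True in matches:
            return known_names[matches.index(True)]
    return None
```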

"Trinetra: An Eye for the Blind" is an essential and transformative tool, providing invaluable support to visually impaired individuals, ultimately enriching their lives by granting them the freedom to explore their world with confidence and independence.


Tech stack:

1. We have used object detection technology so that blind users can navigate their day-to-day lives.
2. Because it is built in Python, it is extremely versatile, allowing great scope and scale for future development.
3. Thanks to Google Maps, we can even include a route-reciting system that tells you the route the way another person would (as one commonly asks for directions in India); since that is not really an option for the blind, this feature becomes a great help (a condensed sketch follows this list).
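
A condensed sketch of that route-reciting flow (the key and locations are placeholders, and stripping the HTML tags from Google's html_instructions before speaking is our assumption about how the text should be cleaned):

```python
# Minimal sketch: fetch a driving route from Google Maps and speak each step.
import re
import googlemaps
import pyttsx3

gmaps = googlemaps.Client(key="YOUR_API_KEY")  # placeholder key
engine = pyttsx3.init()

route = gmaps.directions("Connaught Place, New Delhi", "India Gate, New Delhi", mode="driving")
if route:
    for step in route[0]["legs"][0]["steps"]:
        # The Directions API returns HTML-formatted instructions; strip the tags before speaking
        text = re.sub(r"<[^>]+>", " ", step["html_instructions"])
        engine.say(text)
    engine.runAndWait()
```
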
CODE EXECUTION:
> To execute our code, you will first have to download the attached code file (please excuse any mistakes, since this is our first time using Git).
> Once the code is downloaded to your machine, install the required libraries with the pip commands given below:
* pip install opencv-python
* pip install cvlib
* pip install pyttsx3
* pip install googlemaps
> After installing the required libraries, you are all set to go!
> Now just run the code and give it some time to initialize.
> After the code starts, enter the starting and destination locations; it will recite the route, and object recognition will start afterwards (see the example run below).
> Every major function is explained inside the code itself in the form of comments.
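
An example run might look like this (the prompts match the script; the locations typed here are just placeholders):

```text
python speakingtri.py
Please specify your current location: Connaught Place, New Delhi
Now, specify your destination: India Gate, New Delhi
```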

FUTURE SCOPE:

For a lightweight and compact codebase like "Trinetra: An Eye for the Blind," the future scope focuses on enhancements and broader applications within the constraints of its compact design:

**Future Scope for "Trinetra: An Eye for the Blind":**

1. **Optimization for Resource-Constrained Devices:** Further optimize the code to make it even more lightweight and efficient, ensuring compatibility with a wider range of low-power and resource-constrained devices. This expansion will make "Trinetra" accessible to a broader user base.

2. **Localization and Multilingual Support:** Introduce localization features to support users in various regions and languages. Expanding language options will enhance the inclusivity of the application.

3. **Enhanced Object Recognition:** Continuously improve object detection algorithms to provide more precise and detailed descriptions of the user's environment, thereby increasing the utility of the application.

4. **Voice Assistant Integration:** Explore integration with voice assistants like Siri, Google Assistant, or Alexa to make "Trinetra" even more user-friendly and hands-free, allowing users to interact with the application through voice commands.

5. **Community-Driven Development:** Encourage an active developer community to contribute to the project's development. Collaborative efforts can lead to innovative features and improvements.

6. **Navigation Beyond Streets:** Expand the scope of navigation beyond streets and sidewalks. Consider adding support for indoor navigation, public transportation, and other contexts that visually impaired individuals encounter in their daily lives.

7. **Education and Training:** Develop educational resources and training materials to help users maximize the benefits of "Trinetra." This includes video tutorials, user guides, and community support forums.

8. **Integration with Wearable Devices:** Explore integration with compact and lightweight wearable devices, such as smart glasses, to offer users a more seamless and hands-free navigation experience.

9. **Accessibility Standards Compliance:** Ensure that "Trinetra" adheres to the latest accessibility standards and regulations to guarantee that it remains a reliable and accessible tool for the visually impaired.

10. **User Customization:** Allow users to personalize settings, voice guidance, and preferences to tailor "Trinetra" to their individual needs and comfort.

11. **SOS & tracking:** We plan to develop an SOS and tracking feature that allows the user's family members to track the user in an emergency, ensuring the user's safety and reassuring the family. An SOS system will also be developed in which, in case of emergency, nearby police stations and hospitals are informed and a recorded call is sent to the family members (this can be achieved using the Google Maps API; see the sketch below).
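
A rough sketch of how the nearby-services part of this SOS idea could be looked up with the Google Maps Places API (future scope only; the function name and parameters here are our own, and the recorded-call part would need a separate telephony service):

```python
# Hypothetical sketch of the planned SOS nearby-services lookup (not implemented).
import googlemaps

gmaps = googlemaps.Client(key="YOUR_API_KEY")  # placeholder key

def find_emergency_services(address, radius_m=2000):
    """Return (type, name, address) tuples for hospitals and police stations near the user."""
    geocode = gmaps.geocode(address)
    if not geocode:
        return []
    location = geocode[0]["geometry"]["location"]
    results = []
    for place_type in ("hospital", "police"):
        response = gmaps.places_nearby(
            location=(location["lat"], location["lng"]),
            radius=radius_m,
            type=place_type,
        )
        for place in response.get("results", []):
            results.append((place_type, place["name"], place.get("vicinity", "")))
    return results
```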

The future scope for "Trinetra: An Eye for the Blind" centers on making the lightweight and compact code even more versatile and user-friendly, expanding its functionality while maintaining its efficiency.
*We would rather concentrate on making it more useful than more impressive, because at the end of the day the code has to be useful for the people it is made for.*
**POINTS TO BE NOTED DURING EVALUATION**
1. Since this is a prototype model, there are some limitations; please understand them and keep them in mind.
   => Because of Google's privacy policy, we are not currently allowed to track location, so the live-location feature is not available. It was originally planned that the destination route would be computed directly from the user's live location.
2. Another major limitation is that the locations must be entered by typing rather than by speaking, because this is a prototype version and a prototype is meant to be the most basic version.
3. While the code recites the route to you, it cannot do object detection at the same time.
4. Object detection on some objects is not very accurate, but that is because this is a very basic model of our idea, as well as of a vision that many have imagined.
WE SINCERELY THANK THE EVALUATORS FOR ALL THEIR CONSIDERATION.
84 changes: 84 additions & 0 deletions speakingtri.py
@@ -0,0 +1,84 @@
import re  # used to strip HTML tags from the spoken directions

import cv2
import cvlib
from cvlib.object_detection import draw_bbox
import pyttsx3
import googlemaps

# Initialize video capture, text-to-speech engine, and Google Maps client
video_capture = cv2.VideoCapture(0)
engine = pyttsx3.init()
voices = engine.getProperty('voices')
# Prefer a female voice for the speech output, if one is available
for voice in voices:
    if "female" in voice.name.lower():
        engine.setProperty('voice', voice.id)
        break
# This is our Google Maps API key
gmaps = googlemaps.Client(key='AIzaSyDh3r-9NYFpvZDJ6H-W2W0jRmayAOtmnQI')

# Function for object detection
def detect_objects(frame):
    bbox, label, conf = cvlib.detect_common_objects(frame, model='yolov3', confidence=0.5)
    output_frame = draw_bbox(frame, bbox, label, conf)
    return label, output_frame

# Function to speak detected objects
def speak_detected_objects(detected_objects):
    engine.say("I see the following objects: " + ", ".join(detected_objects))
    engine.runAndWait()

# Function to get directions
def get_directions(start, destination):
    try:
        directions = gmaps.directions(start, destination, mode="driving")
        if directions:
            return directions[0]['legs'][0]['steps']
    except Exception as e:
        print(f"Error getting directions: {e}")
    # Fall through to an empty list if no directions were found or an error occurred
    return []

# Ask the user for starting and destination locations
current_location = input("Please specify your current location: ")
destination = input("Now, specify your destination: ")

# Get directions and speak the route
directions = get_directions(current_location, destination)
if directions:
    for step in directions:
        # Strip the HTML tags that the Directions API embeds in its instructions
        instruction = re.sub(r'<[^>]+>', ' ', step['html_instructions'])
        engine.say(instruction)
        engine.runAndWait()
else:
    engine.say("Sorry, I couldn't find directions.")
    engine.runAndWait()

# Maintain a list of currently detected objects
current_objects = []

while True:
    ret, frame = video_capture.read()

    if ret:
        frame = cv2.resize(frame, (440, 480))
        detected_objects, output_frame = detect_objects(frame)

        # Compare currently detected objects with the new objects
        new_objects = [obj for obj in detected_objects if obj not in current_objects]
        objects_left = [obj for obj in current_objects if obj not in detected_objects]

        if new_objects:
            speak_detected_objects(new_objects)
        elif objects_left:
            engine.say("The following objects have left: " + ", ".join(objects_left))
            engine.runAndWait()

        current_objects = detected_objects

        cv2.imshow('Object Detection', output_frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release resources and clean up when done
video_capture.release()
cv2.destroyAllWindows()
engine.stop()