Google Announces the Open Source Release of Project Guideline: Revolutionizing Accessibility with On-Device Machine Learning for Independent Mobility

Researchers have taken on the challenge of enhancing the independence of people with visual impairments through Project Guideline. The initiative aims to help people who are blind or have low vision walk or run independently by using on-device machine learning (ML) on Google Pixel phones. The system relies on a waist-mounted phone, a designated guideline on a pedestrian pathway, and a combination of audio cues and obstacle detection to guide users safely through the physical world.

Project Guideline applies computer vision to accessibility technology. Departing from conventional approaches that often involve human guides or guide animals, the project uses on-device ML tailored for Google Pixel phones. The researchers behind Project Guideline have devised a method that employs ARCore to track the user’s position and orientation, a segmentation model based on DeepLabV3+ to detect the guideline, and a monocular depth ML model to identify obstacles. This approach allows users to independently navigate outdoor paths marked with a painted line, a significant advancement in assistive technology.

A closer look at Project Guideline’s technology reveals a sophisticated system at work. The core platform is written in C++ and integrates essential libraries such as MediaPipe. ARCore, a fundamental component, estimates the user’s position and orientation as they traverse the designated path. Concurrently, a segmentation model processes each frame, generating a binary mask that outlines the guideline. Points aggregated across frames form a 2D map of the guideline’s trajectory, giving the system a stateful representation of the user’s environment.
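The aggregation step can be illustrated with a minimal Python sketch. It is not the project's C++ implementation. It assumes the guideline points from each mask have already been projected onto the ground plane as (forward, left) offsets in metres relative to the user, and that the tracked pose can be reduced to a 2D position plus a heading angle. The names `local_to_world` and `GuidelineMap`, and the 10 cm grid cell size, are hypothetical choices made for illustration.

```python
import math


def local_to_world(points_local, pose):
    """Rotate and translate (forward, left) offsets into world (x, y) coordinates.

    pose is (x, y, heading_rad); a 2D simplification assumed for this sketch.
    """
    px, py, heading = pose
    c, s = math.cos(heading), math.sin(heading)
    return [(px + c * f - s * l, py + s * f + c * l) for f, l in points_local]


class GuidelineMap:
    """Accumulates guideline observations from many frames on a coarse 2D grid."""

    def __init__(self, cell_size=0.1):
        self.cell_size = cell_size  # grid resolution in metres (assumed value)
        self.cells = {}             # (ix, iy) -> number of frames that hit the cell

    def add_frame(self, points_local, pose):
        """Merge one frame's guideline points into the map."""
        for x, y in local_to_world(points_local, pose):
            key = (round(x / self.cell_size), round(y / self.cell_size))
            self.cells[key] = self.cells.get(key, 0) + 1

    def points(self, min_hits=1):
        """Return cell centres seen at least min_hits times, sorted by x then y."""
        return sorted(
            (ix * self.cell_size, iy * self.cell_size)
            for (ix, iy), hits in self.cells.items()
            if hits >= min_hits
        )
```

Because the map is keyed by world coordinates, a stretch of line seen from two different poses lands in the same cells, which is what makes the representation stateful across frames. Requiring more than one hit per cell is one simple way to suppress isolated segmentation errors.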

The control system dynamically selects goal points on the line ahead of the user, producing a navigation signal that takes into account the user’s current position, velocity, and direction. This look-ahead approach eliminates noise caused by irregular camera movements during activities like running, offering a more reliable user experience. Obstacle detection, provided by a depth model trained on a diverse dataset known as SANPO, adds another layer of safety. The model can estimate the depth of various obstacles, including people, vehicles, posts, and more. As with the line segmentation output, the depth maps are converted into 3D point clouds, forming a comprehensive picture of the user’s surroundings. The whole system is complemented by a low-latency audio system that delivers audio cues in real time to guide the user effectively.
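The goal-point idea can be sketched in a few lines of Python. This is an illustrative sketch and not the project's actual controller. It assumes the mapped guideline is available as an ordered list of (x, y) points in metres, that the look-ahead distance grows linearly with speed, and that the navigation signal is the signed angle between the user's direction of travel and the bearing to the goal. The function names and the parameters `base_lookahead` and `seconds_ahead` are hypothetical.

```python
import math


def select_goal(path, position, lookahead):
    """Pick the first path point, at or after the nearest one, that is at least
    `lookahead` metres from the user. Falls back to the end of the path."""
    nearest = min(range(len(path)), key=lambda i: math.dist(path[i], position))
    for point in path[nearest:]:
        if math.dist(point, position) >= lookahead:
            return point
    return path[-1]


def steering_signal(path, position, velocity, base_lookahead=2.0, seconds_ahead=1.0):
    """Return (goal, error): error is the signed heading error in radians.

    A positive error means the goal lies counterclockwise (to the left) of the
    direction of travel; a negative error means it lies to the right.
    """
    speed = math.hypot(velocity[0], velocity[1])
    # Faster movement looks further ahead (assumed linear rule).
    lookahead = base_lookahead + seconds_ahead * speed
    goal = select_goal(path, position, lookahead)
    # Heading comes from the velocity vector, not from the camera orientation.
    # With zero velocity, atan2(0, 0) returns 0, so the heading defaults to +x.
    heading = math.atan2(velocity[1], velocity[0])
    bearing = math.atan2(goal[1] - position[1], goal[0] - position[0])
    # Wrap the difference into [-pi, pi).
    error = (bearing - heading + math.pi) % (2 * math.pi) - math.pi
    return goal, error
```

For a straight path along the x axis and a user half a metre to its left moving parallel to it, the returned error is negative, which an audio layer could map to a "steer right" cue. Deriving the heading from velocity means that camera shake changes the input image but not the direction used for steering.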

https://blog.research.google/2023/11/open-sourcing-project-guideline.html

In conclusion, Project Guideline represents a significant stride in computer vision accessibility. The researchers’ approach addresses challenges faced by individuals with visual impairments, offering a solution that combines machine learning, augmented reality technology, and audio feedback. The decision to open-source Project Guideline further emphasizes the commitment to inclusivity and innovation. The initiative not only enhances users’ autonomy but also sets a precedent for future advancements in assistive technology. As technology evolves, Project Guideline points the way toward a more accessible and inclusive future.


Check out the GitHub and Blog. All credit for this research goes to the researchers of this project.

Madhur Garg is a consulting intern at MarktechPost. He is currently pursuing his B.Tech in Civil and Environmental Engineering at the Indian Institute of Technology (IIT), Patna. He has a strong passion for Machine Learning and enjoys exploring the latest advancements in technologies and their practical applications. With a keen interest in artificial intelligence and its diverse applications, Madhur is determined to contribute to the field of Data Science and leverage its potential impact in various industries.

