Augmented reality deaf assistance with real-time speech-to-text and text-to-speech translation using raspberry pi-based smart glasses and a mobile application/

Cabello, Marvin T.

Augmented reality deaf assistance with real-time speech-to-text and text-to-speech translation using raspberry pi-based smart glasses and a mobile application / Marvin T. Cabello, Aeron Joshua F. Dela Cruz, Jhon Jaree F. Genavia, Nathan Miguel G. Icaro, and Dale Joshua S. Ricardo. -- Manila: Technological University of the Philippines, 2025. -- viii, 124 pages; 29 cm.

Bachelor's thesis

College of Engineering.

Includes bibliographic references and index.

Humans are socially dependent beings. They rely on each other for survival, emotional
support, and personal as well as collective growth. To form and nurture these connections,
humans developed communication, particularly through verbal sounds. The modern world is
flooded with different voices from radios, televisions, mobile phones, and other individuals.
These voices are necessary tools for learning. However, it is important to acknowledge that
not everyone experiences sound in the same manner. Hard of hearing or Deaf individuals
communicate differently, relying on sign language and lip reading. With just under 20% of
the world population experiencing hearing loss, a communication gap between hard of
hearing individuals and individuals with typical hearing is inevitable.
This study aims to bridge that communication gap by developing Augmented Reality
Glasses with Real-Time Speech-to-Text Translation. The wearable glasses contain a Mini
USB Microphone that captures vocal input, a Raspberry Pi Zero 2 W microcomputer that
uses Google's Speech-to-Text API to process the input, an FLCOS micro display that outputs
the text, and a magnifying glass that enlarges the text to a readable size without obstructing
vision. In testing so far, the device has enabled Deaf or hard of hearing individuals to take
part in live conversations with an average transcription delay of under 2 seconds per word
in mostly controlled environments. The Word Error Rate (WER) stayed below 20%, as
expected of Google's own API, but the system relied on clear and controlled speech to
transcribe properly. Limitations were primarily due to hardware constraints: the Raspberry
Pi Zero 2 W's 512MB RAM was a significant performance bottleneck, and the lack of direct
support for higher-end input and output devices lowered the performance ceiling. These
constraints point to pathways for future improvement as technology advances.
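
The record does not state how the Word Error Rate was computed. As a minimal sketch, assuming the conventional definition WER = (substitutions + deletions + insertions) / number of reference words, the metric can be obtained from a word-level edit distance. The function name word_error_rate and the sample sentences are hypothetical and are not taken from the thesis.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumes the conventional WER definition; the thesis may have
    normalized text differently (punctuation, casing, numerals).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one dropped word and one substituted word
# against a five-word reference.
sample_wer = word_error_rate("please turn on the lights", "please turn the light")
```

Under this definition, a WER below 20% means fewer than one word-level error for every five reference words.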


Augmented reality
Deaf assistance
Speech translation

BTH TK 870 / C33 2025



© 2025 Technological University of the Philippines.
All Rights Reserved.
