
Augmented reality deaf assistance with real-time speech-to-text and text-to-speech translation using raspberry pi-based smart glasses and a mobile application/ Marvin T. Cabello, Aeron Joshua F. Dela Cruz, Jhon Jaree F. Genavia, Nathan Miguel G. Icaro, and Dale Joshua S. Ricardo.--

Material type: Text
Publication details: Manila: Technological University of the Philippines, 2025.
Description: viii, 124 pages; 29 cm.
LOC classification:
  • BTH TK 870 C33 2025
Dissertation note: College of Engineering.-- Bachelor of Science in Electronics Engineering: Technological University of the Philippines, 2025.

Summary: Humans are socially dependent beings. They rely on each other for survival, emotional support, and personal as well as collective growth. To form and nurture these connections, we developed communication, particularly through verbal sounds. The modern world is flooded with voices from radios, televisions, mobile phones, and other individuals, and these voices are essential tools for learning. However, not everyone experiences sound in the same manner. Deaf and hard-of-hearing individuals perceive communication differently, through sign language and lip reading, and with less than 20% of the world population experiencing hearing loss, a communication gap between hard-of-hearing individuals and individuals with typical hearing is inevitable. This study aims to bridge that gap by developing Augmented Reality Glasses with Real-Time Speech-to-Text Translation. The wearable glasses contain a mini USB microphone that captures vocal input, a Raspberry Pi Zero 2 W microcomputer that processes the input using Google's Speech-to-Text API, an FLCOS micro display that outputs the text, and a magnifying glass that enlarges the text to a readable size without obstructing vision. In testing, the device enabled Deaf and hard-of-hearing individuals to follow live conversations with an average transcription delay of under 2 seconds per word in mostly controlled environments. With a Word Error Rate (WER) below 20%, consistent with the error rates expected from Google's API, the system relied on clear, controlled speech to transcribe properly. Limitations were primarily due to hardware constraints: the Raspberry Pi Zero 2 W's 512 MB of RAM was a significant performance bottleneck, and the lack of direct support for higher-end input and output devices lowered the performance ceiling. However, these constraints also highlight the device's pathways for future improvement as technology advances.
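The Word Error Rate cited in the summary is the standard edit-distance metric over word sequences: (substitutions + deletions + insertions) divided by the number of reference words. A minimal sketch of how it is computed (illustrative only, not code from the thesis):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: (subs + dels + ins) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming table of edit distances between word prefixes
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `wer("hello world today", "hello word today")` has one substitution over three reference words, giving a WER of about 0.33; a transcript meeting the summary's target would stay below 0.20.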
Holdings
Item type: Bachelor's Thesis COE
Current library: TUP Manila Library
Call number: BTH TK 870 C33 2025
Copy number: c.1
Status: Not for loan
Barcode: BTH0006464

Bachelor's thesis

Includes bibliographic references and index.

© 2025 Technological University of the Philippines.
All Rights Reserved.
