The University of Illinois Urbana-Champaign (UIUC) is partnering with Amazon, Apple, Google, Meta, Microsoft, and nonprofit organizations on the Speech Accessibility Project. The project aims to improve speech recognition so that everyone, including people with disabilities and people with diverse speech patterns, can take advantage of speech recognition systems.
"The option to communicate and operate devices with speech is crucial for anyone interacting with technology or the digital economy today," said Mark Hasegawa-Johnson, the UIUC professor of electrical and computer engineering who's leading the project. "Speech interfaces should be available to everybody, and that includes people with disabilities."
Currently, many speech recognition systems, such as voice assistants and translation tools, fail to recognize a wide range of speech patterns, including those often associated with conditions such as Lou Gehrig's disease (amyotrophic lateral sclerosis, or ALS), Parkinson's disease, cerebral palsy, and Down syndrome.
As a result, people in these communities often cannot take advantage of the benefits speech recognition software offers. Through artificial intelligence and machine learning, however, technology companies can make their speech recognition software more inclusive, and that is the goal of the Speech Accessibility Project.
"This task has been difficult because it requires a lot of infrastructure, ideally the kind that can be supported by leading technology companies, so we've created a uniquely interdisciplinary team with expertise in linguistics, speech, AI, security, and privacy to help us meet this important challenge," said Hasegawa-Johnson.
The Speech Accessibility Project will recruit paid volunteers representing a diversity of speech patterns to contribute recorded speech samples. From those samples, the UIUC researchers will create a private, de-identified dataset, which will be used to train machine-learning models to better understand a wider range of speech patterns. The project will focus on American English.
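The project has not published the details of its data pipeline, but the general pattern behind a "private, de-identified dataset" is familiar: strip personally identifying metadata from each contribution and replace it with a pseudonymous ID before the sample enters the training set. The sketch below is a minimal illustration of that idea in Python; the field names, sample schema, and deidentify function are all hypothetical, invented for this example, and are not the project's actual design.

```python
import uuid

# Hypothetical illustration of metadata de-identification; the schema
# and field names below are invented for this example and are not the
# Speech Accessibility Project's actual design.
raw_sample = {
    "contributor_name": "Jane Doe",           # PII: must not reach the dataset
    "contributor_email": "jane@example.com",  # PII: must not reach the dataset
    "speech_pattern": "dysarthric",           # useful, non-identifying metadata
    "audio_path": "recordings/0001.wav",
    "transcript": "turn on the kitchen lights",
}

PII_FIELDS = {"contributor_name", "contributor_email"}

def deidentify(sample: dict) -> dict:
    """Return a copy of the sample with PII removed and a random ID added."""
    clean = {key: value for key, value in sample.items() if key not in PII_FIELDS}
    # A real pipeline would keep a stable per-speaker pseudonym in a
    # separately secured mapping; a fresh random ID is used here for brevity.
    clean["speaker_id"] = uuid.uuid4().hex
    return clean

print(deidentify(raw_sample))
```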