Researchers propose framework to make automatic speech recognition more robust to noise

Popular voice assistants like Siri and Amazon Alexa have brought automatic speech recognition (ASR) to the masses. Yet despite decades of development, ASR models still struggle to be consistent and reliable, especially in noisy environments.

Now, researchers in China have developed a framework that effectively improves ASR performance amid the chaos of everyday acoustic environments.

Researchers from Hong Kong University of Science and Technology and WeBank have proposed a new phonetic-semantic pre-training (PSP) framework and demonstrated the robustness of their new model on very noisy synthetic speech datasets.

Their study was published in CAAI Artificial Intelligence Research on August 28.

“Robustness has been a long-standing challenge for ASR,” said Xueyang Wu from the Department of Computer Science and Engineering at Hong Kong University of Science and Technology. “We want to increase the robustness of the Chinese ASR system at a lower cost.”

ASR uses machine learning and other artificial intelligence techniques to automatically convert speech into text for applications such as voice-activated systems and transcription software.
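The core idea behind ASR, mapping acoustic features to text, can be illustrated with a deliberately tiny sketch. Real systems learn acoustic and language models from hours of labeled speech; the templates, feature vectors, and function names below are invented purely for intuition and are not part of the researchers' PSP framework.

```python
import math

# Invented "acoustic feature" templates: in a real ASR system these would be
# learned from large amounts of labeled speech, not hand-written vectors.
TEMPLATES = {
    "yes": [0.9, 0.1, 0.8],
    "no":  [0.1, 0.9, 0.2],
}

def transcribe(features):
    """Toy recognizer: return the word whose stored template is nearest
    (by Euclidean distance) to the input feature vector."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(TEMPLATES, key=lambda word: dist(features, TEMPLATES[word]))

# A slightly perturbed input still matches the "yes" template; heavier noise
# would push it toward the wrong word, which is why robustness is the hard part.
print(transcribe([0.85, 0.2, 0.75]))  # prints "yes"
```

Background noise distorts exactly these feature vectors, which is why robustness, rather than the basic feature-to-text mapping, is the long-standing challenge the study targets.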
