Meta doubles down on direct speech-to-speech translation

On June 13, 2022, Meta (formerly Facebook) published an article about a “direct speech-to-speech (S2ST) approach”. Direct S2ST eliminates the text generation step in spoken language conversion, which means it can also cover languages without a writing system.

Typically, S2ST requires three steps: speech recognition, followed by text-to-text translation and, finally, text-to-speech conversion.
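To make the contrast concrete, here is a minimal Python sketch of the two designs. It is an illustration only, not Meta's implementation: the class names, the `Audio` type, and the toy stage functions are all hypothetical stand-ins for real models.

```python
from dataclasses import dataclass
from typing import Callable, List

Audio = List[float]  # hypothetical placeholder for a waveform


@dataclass
class CascadedS2ST:
    """Conventional pipeline: speech -> text -> translated text -> speech."""
    recognize: Callable[[Audio], str]    # speech recognition
    translate: Callable[[str], str]      # text-to-text translation
    synthesize: Callable[[str], Audio]   # text-to-speech conversion

    def __call__(self, audio: Audio) -> Audio:
        source_text = self.recognize(audio)
        target_text = self.translate(source_text)
        return self.synthesize(target_text)


@dataclass
class DirectS2ST:
    """Direct approach: one model maps speech to speech, with no text step."""
    model: Callable[[Audio], Audio]

    def __call__(self, audio: Audio) -> Audio:
        return self.model(audio)


# Toy stand-ins so the sketch runs; real systems would plug in trained models.
def toy_recognize(audio: Audio) -> str:
    return "hola"


def toy_translate(text: str) -> str:
    return {"hola": "hello"}.get(text, text)


def toy_synthesize(text: str) -> Audio:
    return [float(ord(ch)) for ch in text]


cascaded = CascadedS2ST(toy_recognize, toy_translate, toy_synthesize)
direct = DirectS2ST(model=lambda audio: [x * 2 for x in audio])
```

The cascaded version cannot work unless both languages can be written down, since text is the hand-off between stages. The direct version has no such hand-off, which is why it can in principle serve unwritten languages.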

Meta’s multilingual textless S2ST methodology uses audio samples that are systematically processed in a type of training system that the company describes as “speech-to-speech extracted data.” It draws on very large speech collections, including Meta AI’s own multilingual FAIR S2ST dataset and the VoxPopuli audio dataset.

The social media giant described the approach as the first S2ST framework “trained on real-world open-source audio data”. It is currently being tested using the Fisher Spanish-English Speech Translation Corpus from the University of Pennsylvania, an audio database of 139,000 phrases from Spanish telephone conversations.

Scientists involved in this and similar Meta projects say that, until now, S2ST systems have not been successfully trained with “real-world, publicly available data across multiple languages.”

The implications of this breakthrough are many. They include language-independent connectivity between live-action platforms for business or leisure, and a transformation of the interpretation landscape much sooner than expected.

Meta researchers expect their new work on speech-to-speech translation to improve translation quality and language conversion speed, and to make communication easier for users.

In a sort of quiet crowdsourcing, Meta has made all related papers and code freely available on its blog, stating its “hope to enable future direct advances in speech-to-speech translation in the research community.”

Whether in the hands of a lone developer, tech entrepreneur, or academic researcher, a scientific breakthrough of this nature has the potential to shorten the path to multilingual implementations within the “metaverse” and beyond.

