Stanford alumni build app to change accent
A startup called Sanas is developing technology that aims to reduce communication problems by changing people’s accents in real time.
Stanford students heard the sadness in their friend’s voice when he broke the news.
“Guys, I had to quit my job.”
To them, it made no sense. He was fluent in English and Spanish, was extremely friendly and was an expert in systems engineering. Why couldn’t he get a job in a call center?
His accent, the friend said, made it difficult for many customers to understand him; some even hurled insults because of the way he spoke.
The three students realized that the problem was even greater than their friend’s experience. So they founded a startup to solve it.
Now their company, Sanas, is testing artificial intelligence-based software that aims to eliminate communication problems by changing people’s accents in real time. An employee at a call center in the Philippines, for example, might speak normally into the microphone and end up sounding more like someone from Kansas to a customer on the other end of the phone.
Call centers, say the startup’s founders, are just the start. The company’s website lists its plans as “Speech, Reimagined”.
Ultimately, they hope that the application they develop will be used by a variety of industries and individuals. It could help doctors understand patients better, they say, or help grandchildren understand their grandparents better.
“We have a very grand vision for Sanas,” said CEO Maxim Serebryakov.
And for Serebryakov and his co-founders, the project is personal.
“People’s voices are not heard as much as their accents”
The trio that founded Sanas met at Stanford University, but they are all from different countries – Serebryakov, now CEO, is from Russia; Andrés Pérez Soderi, now CFO, is originally from Venezuela; and Shawn Zhang, now CTO, is from China.
They are no longer Stanford students. Serebryakov and Pérez have graduated; Zhang dropped out to focus on bringing Sanas to life.
They launched the company last year and gave it a name that can be easily pronounced in multiple languages “to highlight our global mission and desire to bring people together,” Pérez explains.
Over the years, all three say they have seen how much accents can get in the way.
“We all come from international backgrounds. We have seen with our own eyes how people treat you differently just because of the way you talk,” Serebryakov says. “It’s heartbreaking sometimes.”
Zhang says his mother, who came to the United States from China over 20 years ago, always makes him talk to the cashier when they go shopping together because she is embarrassed.
“This is one of the reasons I joined Max and Andrés to start this business, trying to help people who think their voice isn’t heard as much as their accent,” he says.
Serebryakov says he has seen how his parents are treated in hotels when they come to visit him in the United States – how people make assumptions when they hear their accents.
“They speak a little louder. They change their behavior,” he says.
Pérez says that after attending a British school, he first struggled with American accents when he arrived in the United States.
And don’t get him started on what happens when his dad tries to use the Amazon Alexa his family gave him for Christmas.
“We quickly found out, when Alexa turned on the lights in random places in the house and turned them pink, that Alexa didn’t understand my father’s accent at all,” Pérez explains.
Call centers test the technology
English is the most widely used language in the world. An estimated 1.5 billion people speak it – and most of them are not native speakers. In the United States alone, millions of people speak English as a second language.
This has created a growing market for apps that help users practice their English pronunciation. But Sanas is using AI to take a different approach.
The principle: rather than learning to pronounce words differently, technology could do it for you. There would no longer be a need for expensive or time-consuming accent reduction training. And the understanding would be almost instantaneous.
Serebryakov says he knows that people’s accents and identities can be intertwined, and he stresses that the company is not trying to erase accents or suggest that one way of speaking is better than another.
“We make it possible for people not to have to change the way they speak to get a job. Identity and accents are essential. They are linked,” he says. “You never want someone to change their accent just to please someone.”
Currently, Sanas’s algorithm can convert English to and from American, Australian, British, Filipino, Indian and Spanish accents, and the team plans to add more. They can add a new accent to the system by training a neural network with audio recordings of professional actors and other data, a process that takes several weeks.
The Sanas team performed two demos for CNN. In one, a man with an Indian accent is heard reading a series of literary phrases. Then these same sentences are converted to an American accent.
Another example features phrases that might be more common in a call center, such as “if you give me your full name and order number, we can fix it for you.”
The American-accented results sound somewhat artificial and stilted, like the voices of virtual assistants such as Siri and Alexa, but Pérez says the team is working on improving the technology.
“The accent changes, but the intonation is maintained,” he says. “We continue to work on how to make the result as natural, emotional and exciting as possible.”
Early feedback from call centers that have tried the technology has been positive, Pérez says. So have comments submitted on their website as word spreads about their business.
And they say their plans for the company earned $5.5 million in seed funding from investors earlier this year.
How the founders of the startup see its future
This has enabled Sanas to expand its staff. Most of the employees at the Palo Alto, California-based company come from international backgrounds. And this is no coincidence, Serebryakov says.
“What we are building has resonated with so many people, even the people we hire. … It’s really exciting to watch,” he said.
As the business grows, it may still take some time for Sanas to appear in an app store or on a cell phone near you.
The team says they are currently working with larger call center outsourcing companies and opting for a slower rollout for individual users so they can fine-tune the technology and ensure security.
But ultimately, they hope that Sanas will be used by all who need it – in other areas as well.
Pérez sees it playing an important role in helping people communicate with their doctors.
“Any second wasted because of a misunderstanding or wasted time or a wrong message is potentially very, very impactful,” he says. “We really want to make sure that there is nothing lost in the translation.”
Someday, he says, it could also help people learn languages, improve voice acting in movies, and help smart speakers in homes and voice assistants in cars understand different accents.
And not just in English – the Sanas team also hopes to add other languages to the algorithm.
The three co-founders are still working on the details. But how this technology could improve communication in the future, they say, is easy to understand.