
Students from HIT created a robot that imitates the voice of Shaul Amsterdamski

The text-to-speech system developed for the podcast “Pocket Animals” earned third-year Computer Science undergraduates Maxim Malikov and Tony Hasson a prize from the Public Broadcasting Corporation.


 Maxim Malkov and Tony Hasson


It all started three months ago, when the popular podcast published an ad challenging anyone to simulate the voice of its presenter, journalist Shaul Amsterdamski.

Computer Science student Tony Hasson, who is also taking the "Voice Processing for Intelligent Systems" course, saw the ad on Instagram and decided to contact his classmate, Maxim Malkov, who is equally interested in artificial intelligence. Together, they entered the Public Broadcasting Corporation competition.

"Voice and speech processing is a sub-field of artificial intelligence. We offer an advanced course that provides a theoretical understanding and practical application of computer processing of human speech signals. This includes automatic speech recognition (ASR), text-to-speech (TTS) synthesis, biometric speaker identification, speech emotion recognition, speech-language analysis, and more," says Dr. Nava Shaked, Head of the School of Multidisciplinary Studies. She adds, "In recent years, the field has become integral to every human-machine interface. It includes applications such as IoT products, wearable computing and social robots, and it is ultimately morphing into a new and developing specialization called Conversational Interaction. The course lecturer, Anje Yuri Yevechenko, and I both have a lot of experience in implementing industry projects of this type, and we pass this knowledge on to the students."


As part of the "Voice Processing" course, Hasson and Malikov had already created the McOrder bot, a user-friendly web application that uses speech recognition technology to place orders at a McDonald's Drive-Thru in a fully automated manner. They even conducted real-world trials to meet their activation and response speed goals.

Tony and Maxim say that the road to "Robo-Shaul" was not easy at all, because the podcast is in Hebrew, a language that poses a challenge for such advanced technologies.

"A system that converts English text into voice (a speech signal) is nothing new. We all remember the fake videos in the voices of former US presidents Barack Obama and Donald Trump," Hasson says. "The technology did not exist in Hebrew, however, because each language needs its own model. In English, for example, it is more difficult to pronounce the sound “H”, and in Hebrew there is vowel pointing (nikud) and other differences that pose a challenge."

The two identified two options for the work process. One was to write a completely new system; the other was to take a similar model that works in English and adapt it to Hebrew. "In the end, we found a similar English model, but it was very outdated, and we had to modernize it and make it work not only in Hebrew, but also in Shaul's specific voice," says Malikov. "This is actually the first time that either of us did something like this. We worked on it for about a month and spent many hours researching and testing the issue and models related to 'deepfakes'."

Finally, they managed to build a system that converts English text into Hebrew text and then turns it into audio in the voice of Amsterdamski, the presenter.

The two were invited to showcase the end results as guests on an episode of the podcast "Pocket Animals", this time with the presenter himself. "I didn't believe it would finally happen," says Tony enthusiastically. "I was elated and surprised by the results of the first experiment we did with Shaul, but then I'm a big believer that technology will improve people's lives, so I guess it can get even better."

Malikov and Hasson are in the last year of their studies, and it already seems that a bright future awaits them in the field of AI. Hasson is already working at NCR in Israel, a branch of the American hardware and software giant that produces self-service cash registers. Malikov is still looking for the right job, but the company that recruits him will benefit from the extensive experience he gained through this successful experiment.