Network-based Multilingual Speech-to-Speech Translation Covering 21 Languages, Ready for Use on Smartphones

June 29, 2010

The National Institute of Information and Communications Technology (President: Dr. Hideo Miyahara; hereinafter “NICT”) has been developing multilingual speech and language processing technology within the MASTAR Project, led by Dr. Satoshi Nakamura. We have now reached the stage of realizing large-scale multilingual translation. You can speak in any of 6 languages (Japanese, English, Chinese, Vietnamese, Indonesian and Malay) and obtain translation results in 21 languages. Speech output is available in the same 6 languages, while the other languages are shown as text. We are ready to provide this technology as a network-based service for use on smartphones.

Background

Even in today's borderless society, the language barrier remains a major obstacle to communication. To overcome this barrier, NICT has been conducting research and development on speech-to-speech translation technology that enables real-time translation of spoken dialog from one language to another.

Details

Fig. 1 Speech-to-speech translation with smartphones

NICT has so far developed interactive, real-time speech-to-speech translation technology for travel conversation in Japanese, English and Chinese. Using multilingual speech and text corpora together with our improved technology, we have now expanded the speech processing portion to cover 6 languages and the translation portion to cover 21 languages. Speech output is now available in 6 languages, 3 more (Vietnamese, Indonesian and Malay) than before, while text output is available in 21 languages (see Appendix). Furthermore, this technology is ready to be provided as a network-based service for use on smartphones. You can use it anywhere in the world over a 3G or WiFi network on the iPhone™, a popular device with more than 50 million users.

Future aspects

Building on our speech-to-speech translation technology, which combines speech input (recognition) and speech output (synthesis) and whose 21 languages cover 80% of the world’s population, we will continue working to support more languages. In the meantime, we expect to open this technology to the public within this fiscal year for use on smartphones. To that end, we will carry out feasibility studies in the field, analyze the problems found, and make technical improvements, aiming at full-scale practical use.

Appendix

Fig. 2 Speech-to-speech translation from Japanese into English (with back translation)
Left: Translation result    Right: Language selection menu

Glossary

Corpus

A database containing a large number of sentences, for example a collection of a year’s worth of newspaper articles. Some corpora are collected in a single language and others in multiple languages.

List of covered languages

S: Speech output is available T: Text output only

	Speaker's language (Speech input)
Translated language	Japanese	English	Chinese (Simplified)	Indonesian	Vietnamese	Malay
Japanese		S	S	S	S	S
English	S		S	S	S	S
Chinese (Simplified)	S	S		S	S	S
Indonesian	S	S	S		S	S
Vietnamese	S	S	S	S		S
Malay	S	S	S	S	S
Korean	T	T	T	T	T	T
Chinese (Traditional)	T	T	T	T	T	T
Thai	T	T	T	T	T	T
Hindi	T	T	T	T	T	T
Arabic	T	T	T	T	T	T
Danish	T	T	T	T	T	T
German	T	T	T	T	T	T
Spanish	T	T	T	T	T	T
French	T	T	T	T	T	T
Italian	T	T	T	T	T	T
Dutch	T	T	T	T	T	T
Portuguese	T	T	T	T	T	T
Russian	T	T	T	T	T	T
Tagalog	T	T	T	T	T	T
Brazilian Portuguese	T	T	T	T	T	T
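
As an illustration only, the coverage table above can be expressed as a small lookup. The following Python sketch is not part of the NICT service. The names `SPEECH_LANGUAGES`, `TEXT_ONLY_LANGUAGES`, `output_mode` and `supported_pairs` are hypothetical, and the language lists are simply copied from the table.

```python
from typing import List, Optional, Tuple

# Languages with both speech input and speech output (rows marked "S" in the table).
SPEECH_LANGUAGES: List[str] = [
    "Japanese", "English", "Chinese (Simplified)",
    "Indonesian", "Vietnamese", "Malay",
]

# Languages available as text output only (rows marked "T" in the table).
TEXT_ONLY_LANGUAGES: List[str] = [
    "Korean", "Chinese (Traditional)", "Thai", "Hindi", "Arabic",
    "Danish", "German", "Spanish", "French", "Italian", "Dutch",
    "Portuguese", "Russian", "Tagalog", "Brazilian Portuguese",
]


def output_mode(source: str, target: str) -> Optional[str]:
    """Return "S" (speech output), "T" (text output only), or None if the
    pair does not appear in the table."""
    if source not in SPEECH_LANGUAGES or source == target:
        return None  # no speech input for this language, or same language
    if target in SPEECH_LANGUAGES:
        return "S"
    if target in TEXT_ONLY_LANGUAGES:
        return "T"
    return None


def supported_pairs() -> List[Tuple[str, str, str]]:
    """List every (source, target, mode) combination covered by the table."""
    targets = SPEECH_LANGUAGES + TEXT_ONLY_LANGUAGES
    pairs = []
    for source in SPEECH_LANGUAGES:
        for target in targets:
            mode = output_mode(source, target)
            if mode is not None:
                pairs.append((source, target, mode))
    return pairs
```

For example, `output_mode("Japanese", "English")` gives `"S"`, while `output_mode("Malay", "Korean")` gives `"T"`. Keeping the two language lists separate mirrors the S/T legend of the table, so a language that later gains speech output would only need to move from one list to the other.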