This post was written by Runa Bhattacharjee and Pau Giner. It originally appeared on the Wikimedia Blog and is republished under a CC BY-SA 3.0 license.
Studies estimate that there are more than 7,000 languages spoken around the world. Wikipedia exists in about 300 of them. That’s about 4 percent of the world’s languages documenting some of the world’s knowledge.
Consider the Arabic language. With more than 420 million speakers, it’s one of the most widely spoken languages in the world. Yet, only 3 percent of internet content today is available in Arabic. Or consider Zulu, with more than 12 million speakers—but only about 1,100 Wikipedia articles.
In the Wikimedia vision lies a core promise to everyone who uses our sites: all the world’s knowledge, for free, and in your own language. We have a long way to go to achieve that vision, but we’re excited about the expansion of a tool we already know has been successful in helping us get there.
Our content translation tool has been used to translate nearly 400,000 articles on Wikipedia. The tool uses machine translation to produce an initial translation of an article, which editors can then review, edit, and improve. On January 9, 2019, we’re excited to announce that Google Translate, one of the most advanced machine translation systems available today, is now available for editors to use when translating articles through the content translation tool.
How it works
Integrating Google Translate into the content translation tool on Wikipedia has long been requested by volunteer editor communities. Editors can select from several machine translation systems to support an initial article translation, and Google Translate is now one of these options. With Google Translate added, the content translation tool can now support an additional 15 languages, including Hausa, Kurdish (Kurmanji), Yoruba, and Zulu. Today, the content translation tool can facilitate translations in a total of 121 languages.
We’re excited to collaborate with Google on this newly added functionality in the content translation tool. Translations will be published under a free license that allows content to be integrated back into Wikipedia in line with our own licensing policies. No personal data will be shared with Google or Wikimedia as part of Google Translate’s integration into the content translation tool.
Stay tuned for more updates on the content translation tool and the Wikimedia Foundation’s Language team’s work to expand language support for all the world’s knowledge in all the world’s languages.