Text-to-Speech Software Teaches How to Pronounce Words in Quechua on Social Media


The logo for Hinantin, a project developing software to promote Quechua, an indigenous language whose use is declining. Used with permission.

A Peruvian research group has developed a text-to-speech application that turns Quechua phrases into audio, which the group then shares through social networks such as YouTube, Twitter, and Facebook.

Hinantin, based in Cusco, Peru, specializes in research and software development in computational linguistics, and has been working to promote the use of Peruvian indigenous languages, as the number of people who actively speak them is decreasing.

Global Voices spoke with Richard Castro, one of Hinantin's founding members, who explained that this is the first of several software projects related to the Quechua language. These include the Cusco Quechua text-to-speech converter, an online spelling checker, and a plug-in for the LibreOffice office suite.

The text-to-speech converter project in particular takes a Quechua phrase or word of the day and, using an audio player, transforms text into sound and broadcasts the audio through social networks. The project's Instagram account also posts images with the word of the day and its meaning in different languages. The audio clips are also archived on SoundCloud, as seen here with the voice version of the word wayra, the Quechua for “wind”:

In this video on YouTube, Hinantin teaches its viewers how to pronounce the word maywa, the Quechua for “purple”:

The software was developed from a corpus (a collection) of text and audio samples in Quechua. The project focuses on Cusco, Puno, and Lima Quechua, as the translators in charge of this stage of the process come from those regions. The people who lend their voices to the recordings each use their own way of speaking, although the writing system used to present the audio in videos and podcasts is that of Southern Quechua.

Each translator is able to use a writing system of their choice, as the texts introduced to the system are automatically adapted to Southern Quechua. Members of the project compare the translations with available automatic translators for Quechua to check for flaws and advantages.
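To give a rough idea of what automatic adaptation between writing systems can involve, here is a minimal Python sketch. It is an assumption for illustration, not Hinantin's actual method, which the project does not detail here. It handles only one well-known difference between Quechua spelling conventions: some writers use five vowels (a, e, i, o, u), while the standardized Southern Quechua orthography uses three (a, i, u). The function name `to_three_vowel` is hypothetical.

```python
# Illustrative sketch only: not Hinantin's real adaptation rules.
# Assumption: the input uses a five-vowel spelling and the target
# spelling uses three vowels (a, i, u), so e -> i and o -> u.

_FIVE_TO_THREE = str.maketrans("eoEO", "iuIU")


def to_three_vowel(text: str) -> str:
    """Rewrite e as i and o as u; leave every other character unchanged."""
    return text.translate(_FIVE_TO_THREE)


if __name__ == "__main__":
    for word in ["Qosqo", "wayra", "maywa"]:
        print(word, "->", to_three_vowel(word))
```

With this rule, the five-vowel spelling "Qosqo" becomes "Qusqu", while words such as "wayra" and "maywa" pass through unchanged. A real converter would need more than this, for example an exception list so that Spanish loanwords and proper names are not rewritten.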

One of the purposes of the corpus of text and audio samples is to use it in teaching the Quechua language through an electronic platform called RunaSimi. For this purpose, project contributors have created a special subgroup of the text and audio corpus.

The group behind Hinantin also contributes to the development of other indigenous language support programs related to Quechua, Asháninka, and Aymara languages.

Since indigenous languages are part of Peruvian cultural heritage, their assessment and use are inherent in the policies promoting diversity and intercultural exchange. According to current estimates, the country has over 50 living native languages, as well as many others that are already considered extinct. The Peruvian Ministry of Culture has compiled the Indigenous Languages National Document, with official figures about each language and its speakers. The document also contains information on where to learn these indigenous languages and a statistical map featuring audio of indigenous or native tongues.
