An Internship To Help Save Endangered Languages

It’s common to hear about species becoming extinct or endangered. Advocacy organizations protecting nature frequently appear on television promoting their cause. But how often do you hear about endangered languages?

It is estimated that one language goes extinct every three months. The problem of language preservation is very urgent.

That is a focus of PanLex, the nonprofit organization where I’m interning this summer as part of the Edmund S. Muskie Internship Program.

Alina poses at her desk at PanLex, where she’s responsible for adding new translations to their database.

What does PanLex do?

PanLex supports multilingual translation. The project has built the world’s largest translation database. Currently it covers 2,500 dictionaries, 5,700 languages and 25,000,000 words. By transforming thousands of dictionaries into a single common structure, the PanLex database makes it possible to derive billions of lexical, or word-to-word, translations that are not found in any single dictionary.
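
To give a rough idea of how merging dictionaries can yield translations that no single dictionary contains, here is a minimal Python sketch. The two toy dictionaries and the names `build_graph` and `derive_translations` are made up for illustration; this is an assumption about the general idea, not PanLex's actual code or data model.

```python
from collections import defaultdict

# Hypothetical toy dictionaries: each entry pairs two (language, word) items.
ENGLISH_SPANISH = [(("eng", "kidney"), ("spa", "riñón"))]
ENGLISH_URDU = [(("eng", "kidney"), ("urd", "گردہ"))]


def build_graph(*dictionaries):
    """Merge several dictionaries into one structure of translation links."""
    graph = defaultdict(set)
    for dictionary in dictionaries:
        for a, b in dictionary:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def derive_translations(graph, source, target_lang):
    """Collect direct translations plus those reached via one intermediate word."""
    found = set()
    for neighbor in graph.get(source, ()):
        if neighbor[0] == target_lang:
            found.add(neighbor[1])  # listed directly in some dictionary
        for second in graph.get(neighbor, ()):
            if second[0] == target_lang and second != source:
                found.add(second[1])  # derived through the intermediate word
    return found


graph = build_graph(ENGLISH_SPANISH, ENGLISH_URDU)
# Neither toy dictionary pairs Spanish with Urdu, yet a translation is derived.
print(derive_translations(graph, ("spa", "riñón"), "urd"))
```

Here the Spanish word reaches the Urdu word through English, even though no Spanish-Urdu dictionary was supplied. Translations derived this way can be wrong when the intermediate word has several meanings, so a real system would need some way to score or filter them.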

Unlike machine translation apps such as Google Translate, which translate whole sentences and texts in up to a hundred major world languages, PanLex translates individual words, but in thousands of languages.

PanLex is the world’s largest translation database. It provides word-to-word translations in thousands of languages.

What have I been doing for the project?

My main focus is adding new lexical translations to the database. The process of “adding” data from a dictionary to our database might seem simple. Say there’s a dictionary in paper format. Our partners digitize it and return it to us. Now we have to write code to extract particular information from the dictionary and incorporate it into the database. Seems easy, right? Well, for me this is the most challenging part, since I don’t have a strong computer science background, and this has been a steep learning curve!

Alina with other language technology professionals who work in Silicon Valley.

Another focus of my internship is creating transliterators. Transliteration lets a reader sound out a word from an unfamiliar language or script by seeing it transformed into a familiar script. My part in this process is to convert some languages’ scripts from Cyrillic or Arabic into the IPA (International Phonetic Alphabet). These transliterations will support a new feature of the PanLex Translator app. Say you need to know the word for “kidney” in Urdu. You type “kidney” into the translator and get what looks like abracadabra: “گردہ”. All you know is that these characters mean “kidney” in Urdu. This is where the transliterator comes in: it tells you that the word is pronounced “gurda”.
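
As a rough illustration of the simplest kind of transliterator, here is a minimal Python sketch that maps characters one by one. The tiny Cyrillic-to-IPA table and the function name `transliterate` are hypothetical and cover only a handful of letters; this is not the PanLex implementation. Real transliterators usually need context-dependent rules as well, and scripts that leave out short vowels cannot be handled by a plain character table.

```python
# Hypothetical, deliberately tiny table: a broad approximation for a few letters.
CYRILLIC_TO_IPA = {"к": "k", "о": "o", "т": "t", "м": "m", "а": "a", "н": "n"}


def transliterate(text, table=CYRILLIC_TO_IPA):
    """Replace each character found in the table; leave all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text.lower())


print(transliterate("кот"))  # prints "kot"
```

Characters missing from the table pass through untouched, which makes gaps in the table easy to spot in the output.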

“Don’t forget, you’re in the Bay Area”

One fun part of my internship is the location! Our office is in Berkeley, California, which means that there’s always a lot of cool stuff going on. I was lucky to be introduced to IMUG (the International Multilingual User Group), a forum for language technology professionals that meets monthly in Silicon Valley. The meetings I attended were at Netflix and Facebook. They are a good opportunity to learn about trends in the industry and, of course, to meet like-minded people.

Because I’m eager to explore as much as I can in the Bay Area, I couldn’t miss the chance to fulfill my longtime dream of visiting the Google headquarters.

Living in Berkeley can also surprise you: you meet people whose names you’ve only seen in scientific articles before. I live on the same street as the famous linguist Leonard Talmy, whose works I read as a student, and I never even thought of meeting him in person. But here I am, having coffee with him at the local cafe and talking about the weirdness of languages.

See you later, California!

Thanks to this internship, I gained a new set of skills, enriched my knowledge of the world’s languages and writing systems, expanded my network, and got an insight into the tech world of Silicon Valley. So, there are a couple of things I’d like to say.

While completing her Muskie internship at PanLex, Alina got to achieve her dream of visiting Google’s headquarters.

Be open to all opportunities, because you never know what awaits you around the corner. Be brave. Face your fears and try to overcome them. That’s how you grow.

And stay curious. Curiosity is the driving force of personal development.

Alina Korshunova

Alina's passion is linguistics. She received her bachelor's degree in Theoretical and Applied Linguistics at Pyatigorsk State University (Russia). As a Fulbright Scholar, she's getting her master's in English Linguistics at Eastern Michigan University (USA). She's currently interning in Berkeley, California.
