Optimizing Human Language Learning

This is the companion website for Masato Hagiwara's keynote talk "Optimizing Human Language Learning"

At Optimizing Human Learning Workshop colocated with the ITS2018 conference

June 12, 2018, Montréal, Canada

(This companion website was inspired by a blog post by Prof. Charles Suttton)

TL;CE (too long; checked email)

  • Language learning and humankind
    • I tried to learn Esperanto from textbooks 15 years ago
    • The Esperanto community is now growing and thriving thanks to online technologies
    • We as humankind are getting better at learning languages, mainly thanks to technologies
    • There are 1.2 billion people worldwide who are learning a second language. Optimizing human language learning has a truly huge impact on those people's lives
    • The most popular language learned on Duolingo in Sweden is Swedish. Who are the learners? Refugees. They are learning Swedish from Arabic
  • Optimizing human
    • Your total learning = time x learning efficacy. This "time" factor is often overlooked
    • We wanted to find out what separates those who stay on our app and those who quit
    • There are a few statistically significant factors for predicting user retention. Frequency, number, and consistency of lessons stood out
    • We ran user clustering to find out different user behavioral patterns based on time of day, and found clear behavioral clusters (e.g., 9-to-5, prime time, weekend, etc.)
    • The "before bed" cluster was the best in terms of learning performance. This result is consistent with many research studies (e.g., Mezza et al. 2016) on the effect of sleep
  • Optimizing language
    • There are very few educational datasets that are 1) about language learning, 2) multilingual, and 3) longitudinal
    • We released the SLAM (second language acquisition model) dataset, which provides Duolingo users' long-term language learning traces and token-level error attribution
    • We ran the 2018 Duolingo SLAM shared task. 15 teams participated
    • For this particular dataset, the choice of algorithms (e.g., tree ensembles, RNNs, multitask approach) appears to be more important than feature engineering
  • Optimizing learning
    • The two sigma problem - tutored students perform two standard deviations (sigmas) better than the ones in a traditional classroom setting
    • "Progress bar" keeps track of your mastery of the concepts taught in a lesson
    • The easiest way is to approximate this is to use the population learning curve
    • The population curve is misleading - Streeter (2015) proposed a framework for modeling a mixture of arbitrary learning curves
    • "Strength bar" keeps track of how much you still remember the concept taught in a lesson
    • The older version of Duolingo was using the Leitner system, which is a simple system to estimate the halflife based on the numbers of corrects and incorrects
    • Halflife regression (Settles 2015) estimates the halflife based on an arbitrary set of features, leading to a 1.7% increase in next-day retention


Slides as PDF

Other Resources


And we're hiring!

Duolingo is hiring research scientist, language assessment expert, and machine learning engineering positions, among others.

Visit our jobs page if you are interested!