I am a Senior Machine Learning Engineer / Researcher currently working at Duolingo. I love languages, machine learning, and everything in between.
I speak Chinese, Japanese, and English fluently, and am learning Korean and Lojban. I helped launch the Japanese, Korean, and Chinese courses on Duolingo. My research projects recently appeared on TechCrunch and Quartz.
You can find my resume here.
- Feb. 2015 - Present: Senior Machine Learning Engineer / Researcher - Duolingo, Inc. (Pittsburgh, PA)
- Built automatic grading technologies for Duolingo English Test writing and speaking questions using neural networks
- Led data creation and analysis for various research projects, including user behavior analysis and the Second Language Acquisition Modeling (SLAM) shared task
- Led content creation for the Chinese, Japanese, and Korean courses for English speakers
- Oct. 2010 - Feb. 2015: Lead Scientist - Rakuten Institute of Technology (New York, NY)
- Developed machine transliteration (NLP2011 paper award) and machine translation algorithms for the largest Japanese e-commerce website (Rakuten)
- Built a Chinese/Japanese word segmenter and morphological analyzer (RakutenMA)
- Developed a writing support system for English as a Second Language (ESL) learners
- Apr. 2009 - Sep. 2010: Research and Development Engineer - Baidu Japan, Inc. (Shanghai / Beijing / Tokyo)
- Apr. 2008 - Jul. 2008: Research Intern - Microsoft Research (Redmond, WA; Mentor: Hisami Suzuki)
- Built a state-of-the-art method for Japanese query alteration for spelling correction and spelling/transliteration normalization
- Implemented the system in Visual C#, SQL Server, and Ruby over tens of gigabytes of query logs; the system was integrated into Microsoft Live Search
- Published a research paper on the query alteration algorithm at NAACL 2009 and at the 3rd NLP Symposium for Young Researchers (Outstanding Presentation Award)
- Aug. 2005 - Sep. 2005: Intern (Software Engineer), Google Inc. (Mountain View, CA; Mentors: Dekang Lin and Jun Wu)
- Improved Japanese query suggestion, which is currently used as the basis for the query suggestions shown at the top and bottom of Google search results
- Ran knowledge extraction algorithms on the distributed computation infrastructure (MapReduce and Google's large network clusters)
- Apr. 2006 - Mar. 2009: Ph.D., Information Engineering, Graduate School of Information Science, Nagoya University, Japan
- Doctoral Thesis: "Modeling and Selection of Context for Better Synonym Acquisition"
- Apr. 2004 - Mar. 2006: Master's Degree, Information Engineering, Graduate School of Information Science, Nagoya University, Japan
- Skipped a year of undergraduate study due to excellent academic performance. Overall GPA: 3.8
- Master's Thesis: "Utilization of Probabilistic Latent Semantics for Automatic Thesaurus Construction"
- Apr. 2001 - Mar. 2004: Information Engineering Course, School of Engineering, Nagoya University, Japan. Computer Science GPA: 3.9
Awards & Professional Activities
- Co-organizer of the Workshop for Natural Language Processing Open Source Software (NLP-OSS), co-located with ACL 2018.
- Invited keynote at the Optimizing Human Learning workshop co-located with ITS 2018 (Montréal, Canada, June 2018).
- Invited talk at CUNY NLP Seminar (hosted by Prof. Heng Ji), April 2013. Title: “Word Segmentation and Transliteration in Chinese and Japanese”. [slides]
- 2011 Field Innovation Award from the Japanese Society for Artificial Intelligence: ANPI_NLP: Safety Information Confirmation Support using Natural Language Processing for The 2011 Tohoku Earthquake.
- Paper Award at NLP2011 (the largest Japanese NLP academic conference): “Latent Class Transliteration based on Source Language Origins”
- Best Paper Award at NLP2009 (the largest Japanese NLP academic conference; chosen among 235 papers): “Semantic Category Extraction from Unsegmented Text using Graph Kernels”
- Paper Award at the 3rd NLP Symposium for Young Researchers. Presentation: “A Unified Approach to Japanese Query Alteration based on Semantic Similarity”
- Masato Hagiwara. Real-World Natural Language Processing. To be published by Manning Publications, 2019.
- Yoh Okuno, Graham Neubig, Masato Hagiwara, Mamoru Komachi. Natural Language Processing: Basics and Technology (Shoeisha) (in Japanese). Shoeisha, 2016.
- Drew Conway, John Myles White. 萩原正人 (Masato Hagiwara), 奥野 陽 (Yoh Okuno), 水野 貴明 (Takaaki Mizuno), 木下 哲也 (Tetsuya Kinoshita) (translation). 入門 機械学習 (Machine Learning for Hackers). O'Reilly Japan, 2012.
- Steven Bird, Ewan Klein, Edward Loper. 萩原正人 (Masato Hagiwara), 中山敬広 (Takahiro Nakayama), 水野貴明 (Takaaki Mizuno) (translation). 入門 自然言語処理 (Natural Language Processing with Python). O'Reilly Japan, 2010.
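- Masato Hagiwara, Koji Murakami, Graham Neubig, Yuichiroh Matsubayashi. Robust NLP for Real-world Data: 7. ANPI_NLP - Mining Safety Information after Disasters Using Natural Language Processing. Information Processing Society of Japan Magazine, Vol. 53, No. 3, pp. 241-248, 2012.
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Incremental Lexical Knowledge Acquisition from Unsegmented Text Using Graph Kernels (グラフカーネルを用いた非分かち書き文からの漸次的語彙知識獲得) (in Japanese). Journal of the Japanese Society for Artificial Intelligence (人工知能学会誌), Vol. 26, No. 3, pp. 440-450, 2011.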
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Supervised Synonym Acquisition Using Distributional Features and Syntactic Patterns. Journal of Natural Language Processing, Vol. 16, Num. 2, pp. 59-83, 2009.
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. A Comparative Study on Effective Context Selection for Distributional Similarity. Journal of Natural Language Processing, Vol. 15, Num. 5, pp. 119-150, 2008.
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Effective Use of Indirect Dependency for Distributional Similarity. Journal of Natural Language Processing, Vol. 15, Num. 4, pp. 19-42, 2008.
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Bootstrapping-based Extraction of Dictionary Terms from Unsegmented Legal Text. New Frontiers in Artificial Intelligence: JSAI 2008 Conference and Workshops, Revised Selected papers, Lecture Notes in Computer Science, Vol. 5447, pp. 213-227, 2009.
Conference Papers (Selected)
- Burr Settles, Chris Brust, Erin Gustafson, Masato Hagiwara, Nitin Madnani. Second Language Acquisition Modeling. BEA 2018, 2018. [paper]
- Ayah Zirikly, Masato Hagiwara. Cross-lingual Transfer of Named Entity Recognizers without Parallel Corpora. ACL 2015, pp. 390-396, 2015. [paper]
- Masato Hagiwara, Satoshi Sekine. Lightweight Client-Side Chinese/Japanese Morphological Analyzer Based on Online Learning. COLING 2014 system demonstration, pp. 39-43, 2014. [paper]
- Haibo Li, Masato Hagiwara, Qi Li, Heng Ji. Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese. LREC 2014, pp. 2532-2536, 2014. [paper]
- Masato Hagiwara, Satoshi Sekine. Accurate Word Segmentation using Transliteration and Language Model Projection. ACL 2013, pp. 183-189, 2013. [paper]
- Masato Hagiwara, Soh Masuko. KooSHO: Japanese Text Input Environment based on Aerial Hand Writing. The 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL/HLT 2013), demo session, pp. 24-27. 2013. [paper]
- Yuta Hayashibe, Masato Hagiwara, Satoshi Sekine. phloat: Integrated Writing Environment for ESL learners. Second Workshop on Advances in Text Input Methods (WTIM 2012), pp. 57-72, 2012. [paper] [slides]
- Masato Hagiwara, Satoshi Sekine. Latent Semantic Transliteration using Dirichlet Mixture. NEWS 2012 (the 4th Named Entities Workshop), pp. 30-37, 2012. [paper]
- Graham Neubig, Yuichiroh Matsubayashi, Masato Hagiwara, Koji Murakami. Safety Information Mining — What can NLP do in a disaster —, Proc. of IJCNLP 2011. [paper]
- Masato Hagiwara and Satoshi Sekine. Latent Class Transliteration based on Source Language Origins. Proc. of ACL-HLT 2011, pp. 53-57, 2011. [paper]
- Masato Hagiwara and Hisami Suzuki. Japanese Query Alteration Based on Lexical Semantic Similarity. Proc. of NAACL HLT 2009, pp. 191-199, 2009. [paper]
- Nobuyuki Shimizu, Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama and Hiroshi Nakagawa. Metric learning for synonym acquisition. Proc. of COLING 2008, pp. 793-800, 2008. [paper]
- Masato Hagiwara. A Supervised Learning Approach to Automatic Synonym Identification based on Distributional Features. Proc. of ACL 2008 Student Research Workshop, pp. 1-6, 2008. [paper] [link]
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Context Feature Selection for Distributional Similarity. Proc. of IJCNLP 2008, pp. 553-560, 2008. [paper] [link]
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Effective Proximity Distance for Word-Based Context. Proc. of SNLP 2007, pp. 105-110, 2007. [paper] [link]
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Effectiveness of Indirect Dependency for Automatic Synonym Acquisition. Proc. of CoSMo 2007, pp. 1-8, 2007. [paper] [ppt]
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Selection of Effective Contextual Information for Automatic Synonym Acquisition. Proc. of COLING/ACL 2006, pp. 353-360, 2006. [paper] [link]
- Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. PLSI Utilization for Automatic Thesaurus Construction. Proc. of IJCNLP 2005, pp. 334-345, 2005. [paper]
- In English
- In Japanese
- How I work - Masato Hagiwara at Duolingo (Jan. 2016, Lifehacker.jp)
- Why you shouldn't study at weekends - Data reveal three common traits of successful language learners (Dec. 2016, TechCrunch Japan)
- Difference between successful and unsuccessful language learners, according to a researcher at Duolingo (Dec. 2016, Lifehacker.jp)
- Aptitude doesn't matter for language learning - Interview with Masato Hagiwara, a Japanese software engineer at Duolingo (Aug. 2015, Lifehacker.jp)
- Humans still learning languages in 30 years? (Aug. 2015, Lifehacker.jp)
- Free language learning app Duolingo raises $45 million from Google Capital (June 2015, Nikkei Computer)
- Process Emojis as 'words' - Emojis not used as defined (July 2010, INTERNET Watch)
- Process Emojis as 'words' - algorithm to distinguish 'beers' from 'parties' (July 2010, INTERNET Watch)
- Character encoding experts turn Baidu Emoji search episodes into an academic paper (Mar. 2010, INTERNET Watch)