Masato Hagiwara

I am the owner of Octanove Labs, my one-person consulting/freelancing business, where I work as an independent NLP/ML engineer. Formerly, I was a Machine Learning Engineer / Researcher at Duolingo. I love language and machine learning, and I help people connect the two. I am the author of the Manning book Real-World Natural Language Processing.

I speak Chinese, Japanese, and English fluently, and am learning Korean and Lojban. I helped launch the Japanese, Korean, and Chinese courses on Duolingo. My research projects recently appeared on TechCrunch and Quartz.

You can find my resume here.

Experience

  • Feb. 2019 - Present: Owner & Independent NLP/ML Engineer and Consultant - Octanove Labs LLC (Seattle, WA)
    • Work as a consultant for early-to-mid stage startups in the US/Japan on their ML strategies
    • Work on research projects, e.g., educational NLP applications with RIKEN AIP and chatbot/NER with Studio Ousia
    • Proposed a protocol and framework for integrated writing environments (TEASPN)

  • Feb. 2015 - Feb. 2019: Senior Machine Learning Engineer / Researcher - Duolingo, Inc. (Pittsburgh, PA)
    • Built automatic grading technologies for Duolingo English Test writing and speaking questions using neural networks
    • Led data creation and analysis for various research projects, including user behavior analysis and the Second Language Acquisition Modeling (SLAM) shared task
    • Led content creation for the Chinese, Japanese, and Korean (from English) courses

  • Oct. 2010 - Feb. 2015: Lead Scientist - Rakuten Institute of Technology (New York, NY)
    • Developed machine transliteration (NLP2011 paper award) and machine translation algorithms for the largest Japanese e-commerce website (Rakuten)
    • Built a Chinese/Japanese word segmentation / morphological analyzer (RakutenMA)
    • Developed a writing support system for English as a Second Language (ESL) learners

  • Apr. 2008 - Jul. 2008: Research Intern - Microsoft Research (Redmond, WA; Mentor: Hisami Suzuki)
    • Built a state-of-the-art method for Japanese query alteration for spelling correction and spelling/transliteration normalization
    • Implemented the system using Visual C#, SQL Server, and Ruby on tens of gigabytes of query logs; the system was integrated into Microsoft Live Search
    • Published a research paper on the query alteration algorithm at NAACL 2009 and at the 3rd NLP Symposium for Young Researchers (Outstanding Presentation Award)

  • Aug. 2005 - Sep. 2005: Intern (Software Engineer), Google Inc. (Mountain View, CA; Mentors: Dekang Lin and Jun Wu)
    • Improved Japanese query suggestion, which is currently used as the basis for the query suggestions shown at the top and bottom of the Google search results page
    • Ran knowledge extraction algorithms on the distributed computation infrastructure (MapReduce and Google's large network clusters)

Education

  • Apr. 2006 - Mar. 2009: Ph.D., Information Engineering,
    • Graduate School of Information Science, Nagoya University, Japan.
    • Doctoral Thesis: "Modeling and Selection of Context for Better Synonym Acquisition"

  • Apr. 2004 - Mar. 2006: Master's Degree, Information Engineering,
    • Graduate School of Information Science, Nagoya University, Japan
    • Skipped a year of undergraduate study due to excellent academic performance. Overall GPA: 3.8
    • Master's Thesis: "Utilization of Probabilistic Latent Semantics for Automatic Thesaurus Construction"

  • Apr. 2001 - Mar. 2004: Information Engineering Course, School of Engineering,
    • Nagoya University, Japan. Computer Science GPA: 3.9

Awards & Professional Activities

  • Invited talk on “Education and AllenNLP” at AllenNLP Summit, 2019.
  • Co-organizer of the Workshop for Natural Language Processing Open Source Software (NLP-OSS), co-located with ACL 2018.
  • Invited keynote at the Optimizing Human Learning workshop co-located with ITS 2018 (Montréal, Canada, June 2018).
  • Invited talk at CUNY NLP Seminar (hosted by Prof. Heng Ji): “Word Segmentation and Transliteration in Chinese and Japanese”, April 2013. [slides]
  • 2011 Field Innovation Award from the Japanese Society for Artificial Intelligence: ANPI_NLP: Safety Information Confirmation Support using Natural Language Processing for The 2011 Tohoku Earthquake.
  • Paper Award at NLP2011 (the largest Japanese NLP academic conference) for “Latent Class Transliteration based on Source Language Origins”
  • Best Paper Award at NLP2009 (the largest Japanese NLP academic conference; chosen among 235 papers) for “Semantic Category Extraction from Unsegmented Text using Graph Kernels”
  • Paper Award at the 3rd NLP Symposium for Young Researchers. Presentation: “A Unified Approach to Japanese Query Alteration based on Semantic Similarity”



Journal Papers

  • Masato Hagiwara, Koji Murakami, Graham Neubig, Yuichiroh Matsubayashi: Robust NLP for Real-world Data : 7. ANPI_NLP - Mining Safety Information after Disasters Using Natural Language Processing-. Information Processing Society of Japan Magazine. Vol. 53, No. 3, pp. 241-248, 2012.
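  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Incremental Lexical Knowledge Acquisition from Unsegmented Text Using Graph Kernels (in Japanese). Journal of the Japanese Society for Artificial Intelligence, Vol. 26, No. 3, pp. 440-450, 2011.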
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Supervised Synonym Acquisition Using Distributional Features and Syntactic Patterns. Journal of Natural Language Processing, Vol. 16, Num. 2, pp. 59-83, 2009.
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. A Comparative Study on Effective Context Selection for Distributional Similarity. Journal of Natural Language Processing, Vol. 15, Num. 5, pp. 119-150, 2008.
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Effective Use of Indirect Dependency for Distributional Similarity. Journal of Natural Language Processing, Vol. 15, Num. 4, pp. 19-42, 2008.
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Bootstrapping-based Extraction of Dictionary Terms from Unsegmented Legal Text. New Frontiers in Artificial Intelligence: JSAI 2008 Conference and Workshops, Revised Selected papers, Lecture Notes in Computer Science, Vol. 5447, pp. 213-227, 2009.

Conference Papers (Selected)

  • Takumi Ito, Tatsuki Kuribayashi, Hayato Kobayashi, Ana Brassard, Masato Hagiwara, Jun Suzuki and Kentaro Inui. Diamonds in the Rough: Generating Fluent Sentences from Early-stage Drafts. INLG 2019 (to appear).
  • Masato Hagiwara, Takumi Ito, Tatsuki Kuribayashi, Jun Suzuki and Kentaro Inui. TEASPN: Framework and Protocol for Integrated Writing Assistance Environments. EMNLP (system demonstrations), 2019. [paper]
  • Burr Settles, Chris Brust, Erin Gustafson, Masato Hagiwara, Nitin Madnani. Second Language Acquisition Modeling. BEA 2018, 2018. [paper]
  • Ayah Zirikly, Masato Hagiwara. Cross-lingual Transfer of Named Entity Recognizers without Parallel Corpora. ACL 2015, pp. 390-396, 2015. [paper]
  • Masato Hagiwara, Satoshi Sekine. Lightweight Client-Side Chinese/Japanese Morphological Analyzer Based on Online Learning. COLING 2014 system demonstration, pp. 39-43, 2014. [paper]
  • Haibo Li, Masato Hagiwara, Qi Li, Heng Ji. Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese. LREC 2014, pp. 2532-2536, 2014. [paper]
  • Masato Hagiwara, Satoshi Sekine. Accurate Word Segmentation using Transliteration and Language Model Projection. ACL 2013, pp. 183-189, 2013. [paper]
  • Masato Hagiwara, Soh Masuko. KooSHO: Japanese Text Input Environment based on Aerial Hand Writing. The 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL/HLT 2013), demo session, pp. 24-27. 2013. [paper]
  • Yuta Hayashibe, Masato Hagiwara, Satoshi Sekine. phloat : Integrated Writing Environment for ESL learners. Second Workshop on Advances in Text Input Methods (WTIM 2012), pp. 57-72, 2012. [paper] [slides]
  • Masato Hagiwara, Satoshi Sekine. Latent Semantic Transliteration using Dirichlet Mixture. NEWS 2012 (the 4th Named Entities Workshop), pp. 30-37, 2012. [paper]
  • Graham Neubig, Yuichiroh Matsubayashi, Masato Hagiwara, Koji Murakami. Safety Information Mining — What can NLP do in a disaster —, Proc. of IJCNLP 2011. [paper]
  • Masato Hagiwara and Satoshi Sekine. Latent Class Transliteration based on Source Language Origins. Proc. of ACL-HLT 2011, pp. 53-57, 2011. [paper]
  • Masato Hagiwara and Hisami Suzuki. Japanese Query Alteration Based on Lexical Semantic Similarity. Proc. of NAACL HLT 2009, pp. 191-199, 2009. [paper]
  • Nobuyuki Shimizu, Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama and Hiroshi Nakagawa. Metric learning for synonym acquisition. Proc. of COLING 2008, pp. 793-800, 2008. [paper]
  • Masato Hagiwara. A Supervised Learning Approach to Automatic Synonym Identification based on Distributional Features. Proc. of ACL 2008 Student Research Workshop, pp. 1-6, 2008. [paper] [link]
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Context Feature Selection for Distributional Similarity. Proc. of IJCNLP 2008, pp. 553-560, 2008. [paper] [link]
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Effective Proximity Distance for Word-Based Context. Proc. of SNLP 2007, pp. 105-110, 2007. [paper] [link]
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Effectiveness of Indirect Dependency for Automatic Synonym Acquisition. Proc. of CoSMo 2007, pp. 1-8, 2007. [paper] [ppt]
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. Selection of Effective Contextual Information for Automatic Synonym Acquisition. Proc. of COLING/ACL 2006, pp. 353-360, 2006. [paper] [link]
  • Masato Hagiwara, Yasuhiro Ogawa, Katsuhiko Toyama. PLSI Utilization for Automatic Thesaurus Construction. Proc. of IJCNLP 2005, pp. 334-345, 2005. [paper]
