Khanh X. Nguyen

I am a Postdoctoral Research Fellow of the Center for Human-Compatible Artificial Intelligence (CHAI) at the University of California–Berkeley, where I am fortunate to be mentored by Prof. Stuart Russell. Previously, I was a postdoc at the Princeton NLP group working with Prof. Karthik Narasimhan. I obtained my PhD at the University of Maryland–College Park, advised by Prof. Hal Daumé III.

I am on the job market, looking for faculty or research scientist positions. My research statement summarizes my accomplishments and vision. Drop me an email if you are interested in my profile!

I create artificial agents that have the communication skills and incentives to assist humans. Specifically, I explore the following questions:

How to enable AI agents to learn from natural human feedback (listening skill): My [EMNLP’17] paper demonstrated for the first time the feasibility of using only noisy, complete-output ratings to improve the performance of a neural text generator. This work was followed by studies that used real human ratings at eBay and OpenAI, ultimately leading to the development of InstructGPT, which popularized RLHF.
More recently, I have been developing frameworks for learning from language feedback with theoretical guarantees [ICML’21, ACL’24WS].
How to identify and share with humans what AI agents know and do not know (speaking skill): I was an early explorer of calibration analysis for NLP models [EMNLP’15] and pioneered the development of robots that ask for help [CVPR’19, EMNLP’19, ICML’22]. Lately, I have been developing models that guide human navigation with language, improving their pragmatic reasoning capability [ACL’23] and making them useful even when they generate inaccurate instructions [EMNLP’24].
How to drive AI agents toward efficient and beneficial communicative behavior (incentive): I create agents that learn with progressive efficiency [NeurIPS’23WS], i.e., the more you talk to them, the less effort it takes to teach them. In ongoing work, I characterize the limitations of the popular RLHF approach and propose a new alignment framework that emphasizes alignment not only with the human principal but also with reality.

Some personal facts:

My real name is Nguyễn Xuân Khánh. My first name (Khánh) means “joy” or “happiness”. Please do not confuse it with Khan or Kahn :(
I was born in Việt Nam, a peaceful country (click here for inspiration to visit us).
I am also proud to be a PTNK (Phổ Thông Năng Khiếu) alumnus.

selected publications

Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections

Lingjun Zhao, Khanh X. Nguyen, and Hal Daumé III

EMNLP, 2024

TL;DR: A system that can successfully guide humans in simulated residential environments despite generating potentially inaccurate instructions
@inproceedings{zhao2024successfully,
  title = {Successfully Guiding Humans with Imperfect Instructions by Highlighting Potential Errors and Suggesting Corrections},
  author = {Zhao, Lingjun and Nguyen, Khanh X. and Daum{\'e} III, Hal},
  booktitle = {EMNLP},
  year = {2024},
}

A framework for learning to request rich and contextually useful information from humans

Khanh Nguyen, Yonatan Bisk, and Hal Daumé III

ICML, Jul 2022


@inproceedings{nguyen2022hari,
  author = {Nguyen, Khanh and Bisk, Yonatan and Daum{\'e} III, Hal},
  title = {A framework for learning to request rich and contextually useful information from humans},
  booktitle = {ICML},
  month = jul,
  year = {2022},
}

Interactive learning from activity description

Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudík, and Patrick Shafto

ICML, Jul 2021


@inproceedings{nguyen2021iliad,
  title = {Interactive learning from activity description},
  author = {Nguyen, Khanh and Misra, Dipendra and Schapire, Robert and Dud{\'\i}k, Miro and Shafto, Patrick},
  booktitle = {ICML},
  year = {2021},
}

Help, Anna! Visual navigation with natural multimodal assistance via retrospective curiosity-encouraging imitation learning

Khanh Nguyen and Hal Daumé III

EMNLP, 2019


@inproceedings{nguyen2019hanna,
  author = {Nguyen, Khanh and Daum{\'e} III, Hal},
  title = {Help, Anna! Visual navigation with natural multimodal assistance via retrospective curiosity-encouraging imitation learning},
  booktitle = {EMNLP},
  year = {2019},
}

Reinforcement learning for bandit neural machine translation with simulated human feedback

Khanh Nguyen, Hal Daumé III, and Jordan Boyd-Graber

EMNLP, Sep 2017

TL;DR: Improve machine translation with reinforcement learning from noisy ratings

Abstract: Machine translation is a natural candidate problem for reinforcement learning from human feedback: users provide quick, dirty ratings on candidate translations to guide a system to improve. Yet, current neural machine translation training focuses on expensive human-generated reference translations. We describe a reinforcement learning algorithm that improves neural machine translation systems from simulated human feedback. Our algorithm combines the advantage actor-critic algorithm (Mnih et al., 2016) with the attention-based neural encoder-decoder architecture (Luong et al., 2015). This algorithm (a) is well-designed for problems with a large action space and delayed rewards, (b) effectively optimizes traditional corpus-level machine translation metrics, and (c) is robust to skewed, high-variance, granular feedback modeled after actual human behaviors.
@inproceedings{nguyen2017banditnmt,
  title = {Reinforcement learning for bandit neural machine translation with simulated human feedback},
  author = {Nguyen, Khanh and Daum{\'e} III, Hal and Boyd-Graber, Jordan},
  booktitle = {EMNLP},
  month = sep,
  year = {2017},
  address = {Copenhagen, Denmark},
  publisher = {Association for Computational Linguistics},
  url = {https://www.aclweb.org/anthology/D17-1153},
  doi = {10.18653/v1/D17-1153},
  pages = {1464--1474},
}

Posterior calibration and exploratory analysis for natural language processing models

Khanh Nguyen and Brendan O’Connor

EMNLP, Sep 2015


@inproceedings{nguyen15calibration,
  title = {Posterior calibration and exploratory analysis for natural language processing models},
  author = {Nguyen, Khanh and O{'}Connor, Brendan},
  booktitle = {EMNLP},
  year = {2015},
  address = {Lisbon, Portugal},
  publisher = {Association for Computational Linguistics},
  url = {https://www.aclweb.org/anthology/D15-1182},
  doi = {10.18653/v1/D15-1182},
  pages = {1587--1598},
}