I am a postdoctoral researcher in the NLP group at Princeton University. I am grateful to work with Prof. Karthik Narasimhan and his awesome students. My research focuses on various topics in Human-AI Communication. I deeply care about making AI agents beneficial while retaining human control over them. The premises of my research are:
Effective communication with humans magnifies AI capability while reducing its risk;
Communication is not merely about imitating human language. It is about expressing thoughts, comprehending intentions, and accomplishing goals through (any) language.
My work aims to push forward three directions:
Learning from natural human feedback: Traditional learning frameworks employ primitive, idealized learning signals as communication media, thus limiting the ability of AI agents to learn directly from humans. How can we enable AI agents to improve themselves given diverse types of learning signals that humans can provide? Towards this goal, I have built agents that learn from noisy ratings [EMNLP’17] and language descriptions [ICML’21].
Learning to express uncertainties: It is a mistake to think that only humans should ask AI for help and never the reverse. By asking a question, an agent can: (i) express its uncertainties (plural, not just a single uncertainty estimate), (ii) obtain information to expand its task-solving capabilities. So more safety and more utility! But how do we teach agents when and what to ask? I have authored a series of papers that develop methods for this problem and highlight its challenges [EMNLP’15, CVPR’19, EMNLP’19, ICML’22]; a minimal sketch of the “when to ask” decision follows this list.
Probing and modeling human cognitive capabilities: Goal-regulated models like ChatGPT are better at comprehending humans than behavior-cloned models like GPT. One possible hypothesis is that the former behaves more like a human: by optimizing for an internal reward function, it thinks about “what to achieve in the long term?” rather than “what to do in the next moment?” Are there more aspects of the human cognitive system that can inspire useful principles for developing AI? Which human cognitive capabilities do AI agents lack? Which would empower AI and which would be redundant? Which can be learned and which need to be built in? I investigate some of these questions in the context of instruction generation [PIGen’23].
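As promised above, here is a minimal sketch of the simplest “when to ask” rule: act when the policy is confident, request help when its predictive entropy exceeds a threshold. This is my own toy illustration; the function name act_or_ask and the threshold value are made up for exposition, and the papers cited above learn this decision rather than hard-coding it.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a discrete distribution (in nats)."""
    p = np.asarray(p, dtype=float)
    return float(-(p * np.log(p + 1e-12)).sum())

def act_or_ask(action_probs, threshold=1.0):
    """Hypothetical decision rule: return ('ask', None) when the agent is
    too uncertain about its next action, else ('act', best_action)."""
    if entropy(action_probs) > threshold:
        return "ask", None                      # too uncertain: query the human
    return "act", int(np.argmax(action_probs))  # confident: act autonomously

print(act_or_ask([0.90, 0.05, 0.05]))  # ('act', 0): low entropy, act
print(act_or_ask([0.40, 0.35, 0.25]))  # ('ask', None): high entropy, ask
```

A fixed entropy threshold is the weakest baseline: it says nothing about what to ask, and a miscalibrated model makes the entropy misleading, which is exactly why learning the asking policy is the interesting problem.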
Facts about me:
I obtained my PhD at the University of Maryland–College Park, advised by the awesome Hal Daumé III.
My real name is Nguyễn Xuân Khánh. My first name is usually confused with Khan or Kahn :(
I was born in Việt Nam, a peaceful country (click here for inspiration to visit us).
I am also proud to be a PTNK (Phổ Thông Năng Khiếu) alumnus.
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen and Hal Daumé III

@inproceedings{nguyen2019hanna,
  author    = {Nguyen, Khanh and Daum{\'e} III, Hal},
  title     = {Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning},
  booktitle = {EMNLP},
  year      = {2019},
}
Interactive Learning from Activity Description
Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudík, and Patrick Shafto
@inproceedings{nguyen2021iliad,
  title     = {Interactive Learning from Activity Description},
  author    = {Nguyen, Khanh and Misra, Dipendra and Schapire, Robert and Dud{\'\i}k, Miro and Shafto, Patrick},
  booktitle = {ICML},
  year      = {2021},
}
Posterior calibration and exploratory analysis for natural language processing models
Khanh Nguyen and Brendan O'Connor

@inproceedings{nguyen15calibration,
  title     = {Posterior calibration and exploratory analysis for natural language processing models},
  author    = {Nguyen, Khanh and O{'}Connor, Brendan},
  booktitle = {EMNLP},
  month     = sep,
  year      = {2015},
  address   = {Lisbon, Portugal},
  publisher = {Association for Computational Linguistics},
  url       = {https://www.aclweb.org/anthology/D15-1182},
  doi       = {10.18653/v1/D15-1182},
  pages     = {1587--1598},
}
Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback
Khanh Nguyen, Hal Daumé III, and Jordan Boyd-Graber
Machine translation is a natural candidate problem for reinforcement learning from human feedback: users provide quick, dirty ratings on candidate translations to guide a system to improve. Yet, current neural machine translation training focuses on expensive human-generated reference translations. We describe a reinforcement learning algorithm that improves neural machine translation systems from simulated human feedback. Our algorithm combines the advantage actor-critic algorithm (Mnih et al., 2016) with the attention-based neural encoder-decoder architecture (Luong et al., 2015). This algorithm (a) is well-designed for problems with a large action space and delayed rewards, (b) effectively optimizes traditional corpus-level machine translation metrics, and (c) is robust to skewed, high-variance, granular feedback modeled after actual human behaviors.
@inproceedings{nguyen2017banditnmt,
  title     = {Reinforcement Learning for Bandit Neural Machine Translation with Simulated Human Feedback},
  author    = {Nguyen, Khanh and Daum{\'e} III, Hal and Boyd-Graber, Jordan},
  booktitle = {EMNLP},
  month     = sep,
  year      = {2017},
  address   = {Copenhagen, Denmark},
  publisher = {Association for Computational Linguistics},
  url       = {https://www.aclweb.org/anthology/D17-1153},
  doi       = {10.18653/v1/D17-1153},
  pages     = {1464--1474},
}
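To make the recipe in the abstract above concrete, here is a minimal toy sketch of learning from noisy scalar ratings with an advantage actor-critic update. It is not the paper's code: the paper trains a neural encoder-decoder over a huge action space, whereas this illustration uses a tabular softmax policy over three hypothetical candidate translations and a running-average critic; the candidate strings, quality scores, and learning rates are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)

candidates = ["guten tag", "hallo welt", "gute nacht"]  # hypothetical action space
true_quality = np.array([0.9, 0.6, 0.2])                # hidden "human" preferences

theta = np.zeros(len(candidates))  # actor parameters (logits)
value = 0.0                        # critic: running estimate of expected rating
actor_lr, critic_lr = 0.5, 0.1

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for step in range(2000):
    probs = softmax(theta)
    a = rng.choice(len(candidates), p=probs)
    # Simulated noisy, granular rating: quality + noise, snapped to a 5-point scale
    rating = float(np.clip(round((true_quality[a] + rng.normal(0, 0.2)) * 4) / 4, 0.0, 1.0))
    advantage = rating - value            # critic as a variance-reducing baseline
    grad = -probs
    grad[a] += 1.0                        # gradient of log pi(a | theta)
    theta += actor_lr * advantage * grad  # actor (policy gradient) update
    value += critic_lr * (rating - value) # critic update

print(candidates[int(np.argmax(theta))])  # should typically print "guten tag"
```

The ingredient carried over from the paper is the advantage: subtracting the critic's estimate from the noisy rating keeps the policy-gradient update stable even when feedback is skewed, high-variance, and granular.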
A Framework for Learning to Request Rich and Contextually Useful Information from Humans
Khanh Nguyen, Yonatan Bisk, and Hal Daumé III

@inproceedings{nguyen2022hari,
  author    = {Nguyen, Khanh and Bisk, Yonatan and Daum{\'e} III, Hal},
  title     = {A Framework for Learning to Request Rich and Contextually Useful Information from Humans},
  booktitle = {ICML},
  month     = jul,
  year      = {2022},
}