Prior to joining Microsoft, I was a Postdoctoral Research Fellow at the Center for Human-Compatible Artificial Intelligence (CHAI) at the University of California, Berkeley, where I was fortunate to be mentored by Prof. Stuart Russell, who co-wrote the best-selling introductory textbook on AI and invented the mathematical foundations for human-AI alignment. Before that, I was a postdoc at the Princeton NLP group working with Prof. Karthik Narasimhan, a pioneer in AI agents. I obtained my PhD at the University of Maryland, College Park, advised by the great Hal Daumé III.

My research statement summarizes my research accomplishments and vision. At a high level, I create artificial agents that have the communication skills and incentives to assist humans. Specifically, I explore the following questions:

  • How to enable AI agents to learn from natural human feedback (listening skill): My [EMNLP’17] paper demonstrated for the first time the feasibility of using only noisy, complete-output ratings to improve the performance of a neural text generator. This work was followed by studies that used real human ratings at eBay and OpenAI, ultimately leading to the development of InstructGPT, which popularized RLHF.
    More recently, I have been developing frameworks for learning from language feedback with theoretical guarantees [ICML’21, ACL’24WS].
  • How to identify and share with humans what AI agents know and do not know (speaking skill): I was an early explorer of calibration analysis for NLP models [EMNLP’15] and pioneered the development of robots that ask for help [CVPR’19, EMNLP’19, ICML’22]. Lately, I have been developing models that guide human navigation with language, improving their pragmatic reasoning capability [ACL’23] and making them useful even when they generate inaccurate instructions [EMNLP’24].
  • How to drive AI agents toward efficient and beneficial communicative behavior (incentive): I create agents that learn with progressive efficiency [NeurIPS’23WS], i.e., the more you talk to them, the less effort it takes to teach them. In ongoing work, I characterize the limitations of the popular RLHF approach and propose a new alignment framework that emphasizes alignment not only with the human principal but also with reality.

Some personal facts:

  • My real name is Nguyễn Xuân Khánh 📢. My first name (Khánh) means “joy” or “happiness”. Please do not confuse it with Khan or Kahn :(
  • I was born in Việt Nam 🇻🇳, a peaceful country (click here for inspiration to visit us).
  • I am also proud to be a PTNK (Phổ Thông Năng Khiếu) alumnus.