Prologue

Since the inception of Artificial Intelligence (AI), researchers have imagined machines that can communicate like humans. Herbert Televox, the first humanoid robot, built in 1927 by Roy Wensley, could already speak two simple sentences and take actions in response to the pitch of human-produced sounds. Alan Turing, the father of modern computer science, described his famous test of intelligence as an evaluation of an agent’s ability to hold natural conversations with humans (Turing, 1950). Later benchmarks and definitions of AI (Winograd, 1971; Chollet, 2019; Levesque et al., 2012; Johnson et al., 2017; Sakaguchi et al., 2020) have consistently placed strong emphasis on matching the ways humans generate and process information.

More than a mere aspiration to mirror human behavior, equipping AI agents with effective, human-compatible communication capabilities serves a practically important goal: enabling these agents to effectively serve and aid humans (Marge et al., 2022). While current AI agents show tremendous potential to benefit society in various ways, for these agents to be adopted by humans, researchers need to endow them with features that make them helpful and safe. Among the many features in demand, the ability to connect and establish mutual understanding with any human is one of the most important. Specifically, to be helpful to human users, an agent must understand what they want, inferring their intentions and extracting knowledge from their utterances. Conversely, to serve users safely, the agent should help them predict and regulate its behavior by conveying its (un)certainties in a human-intelligible way and proactively consulting them when facing difficult choices.

Many failures of AI in research and real-world settings suggest that agents with insufficient capabilities for communicating with humans are dangerously brittle (Angwin et al., 2016; Technica, 2016; Buolamwini & Gebru, 2018; Feng et al., 2018; Wallace et al., 2019; Hernandez, 2021; Shridhar et al., 2020). In these accounts, the power to train and modify the agent was placed in the hands of a small group of people. The regular users, who were the main beneficiaries and risk-takers, had very limited mechanisms to harness the agent, broaden its knowledge, and neutralize the risks it presented. The lack of communication between the agent and the human users led to both sides failing to establish mutual understanding: the agent misinterpreted what the users needed and took inappropriate actions; the users could not explain or anticipate the agent’s behavior and therefore distrusted it. To make AI agents more helpful and safe for regular users, the current technology needs to change, toward empowering human users with more control over AI agents (Amershi et al., 2014). Developing frameworks that support natural, rich, and faithful human-AI communication can accelerate progress toward this goal. Specifically, equipping AI agents with the ability to learn from natural communication with humans would allow them to be programmed by non-expert users, thereby making them more useful for those users. Meanwhile, granting agents the ability to articulate specific uncertainties and to follow human advice would significantly reduce the risk of them committing costly mistakes.

Enhancing human-AI communication would also help address the redistribution of social labor caused by replacing humans with AI agents in current workflows. Substituting fully autonomous agents for humans at a large scale would push a substantial fraction of the workforce into temporary unemployment, which could eventually lead to social turbulence if those workers were not quickly transitioned to new occupations. Many leading scientists and policy makers agree that a sustainable path for integrating AI into our society is to set human-centered goals, focusing on creating technologies that amplify human efficiency and impact (Riedl, 2019; Xu, 2019; Shneiderman, 2020; Pretz, 2021; Shang & Du, 2021; UNESCO, 2021). The success of this strategy is predicated on whether researchers can design AI agents that coordinate easily and successfully with regular human workers.

The Current State of Human-AI Communication

Despite quickly closing performance gaps with humans in various domains, current AI agents possess limited capabilities for communicating with humans. A common paradigm for developing these agents involves a dataset- or simulator-based training phase followed by a full-autonomy evaluation phase. During the training phase, agents are trained on a large-scale dataset or in a simulator that can generate an unlimited number of data points. During the evaluation phase, the agents’ model parameters are frozen and the agents are tested on previously unseen tasks. They execute the tasks on their own, until they choose to terminate or a time or computing budget is exceeded. Finally, the agent that achieves the highest success rate is selected for deployment. In this paradigm, an agent almost never interacts directly with a human, except in problems where conversing with humans is an intrinsic requirement of the task (e.g., a chatbot). Because the training and evaluation objectives in this paradigm can be effectively optimized without communicating with humans, the selected agent usually lacks effective, human-compatible communication capabilities, including the ability to learn directly from humans through natural communication, to convey uncertainties in a human-intelligible way, and to request and interpret human advice.
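This paradigm can be summarized in a few lines of Python. The sketch below is only illustrative; the agent, data, and task interfaces (update, freeze, run_autonomously) are hypothetical names, not an existing API. The point is that no step involves interaction with a human.

```python
# A minimal sketch, with hypothetical names, of the train-then-evaluate paradigm.
def develop_agent(agent, training_data, eval_tasks, budget):
    # Training phase: fit the agent to a fixed dataset or to samples drawn
    # from a simulator that can generate unlimited data points.
    for batch in training_data:
        agent.update(batch)                              # hypothetical learning step

    # Evaluation phase: parameters are frozen; the agent acts fully autonomously
    # on unseen tasks until it terminates or the time/computing budget runs out.
    agent.freeze()
    successes = 0
    for task in eval_tasks:
        outcome = task.run_autonomously(agent, budget)   # hypothetical API
        successes += int(outcome.success)

    # The candidate with the highest success rate would be selected for deployment.
    return successes / len(eval_tasks)
```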

Limitations of Dataset/Simulator-Based Training Frameworks

While a plethora of algorithms and model architectures has been proposed, the foundational learning frameworks in AI have remained largely unchanged. Since the early days, two learning frameworks, bandit/reinforcement learning (Li et al., 2010; Sutton & Barto, 2018) and supervised/imitation learning (Pomerleau, 1988; Ross, 2013), have primarily carried the progress of AI. While these frameworks have powered remarkable achievements, their designs are incompatible with the ways humans teach. Specifically, they allow communication only through highly primitive media: numerical rewards in reinforcement learning and low-level action labels in imitation learning. Learning signals expressed in these media can be easily harvested or artificially simulated, allowing the agent to be exposed to an enormously diverse set of data points. While this approach has been effective at injecting foundational knowledge, agents eventually need to learn their personal needs and preferences directly from human users. In that setting, communicating through rewards or low-level action labels restricts the users’ ability to convey complex, abstract concepts that they can normally articulate effectively in natural language. Learning through such narrow communication channels can be inefficient and ineffective, as the information received by the learner may contain incomplete, ambiguous signals about the teacher’s intentions.
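To make the contrast in expressiveness concrete, the sketch below juxtaposes what a teacher can convey per interaction in each medium. The type names and the example utterance are hypothetical, chosen only for illustration.

```python
# A minimal sketch (hypothetical types) of the per-step feedback available under
# the two traditional frameworks, contrasted with natural-language teaching.
from dataclasses import dataclass

@dataclass
class RLFeedback:
    reward: float          # reinforcement learning: a single scalar per step

@dataclass
class ImitationFeedback:
    expert_action: int     # imitation learning: the "correct" low-level action label

@dataclass
class LanguageFeedback:
    utterance: str         # natural language: abstract, compositional advice,
                           # e.g., "Never vacuum while the baby is sleeping."
```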

Limitations of Full-Autonomy Evaluation Frameworks

Most current agents not only have a restricted capability of learning directly from humans, but also are not built with the incentives and skills to generate and convey information to humans. One essential skill that agents should acquire is the ability to ask questions that extract relevant knowledge from humans. Because the world is constantly changing, any agent will inevitably face problems that are beyond its autonomous problem-solving capabilities. An agent that relies solely on its built-in knowledge would not be maximally helpful, and could even be unsafe, for serving humans. Facing a problem that it is not prepared for, the agent may commit harmful decisions without warning. Lacking communication skills, it neglects opportunities to make safer and more effective decisions by simply asking for and following instructions from human bystanders and supervisors. Unfortunately, the full-autonomy evaluation framework, which praises an agent based solely on its ability to perform tasks on its own, actively promotes this kind of high-risk behavior. While improving an agent’s autonomous performance is important for enhancing human productivity, focusing solely on this metric may misguide the development of AI that can actually benefit humans.
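The kind of help-seeking behavior that full-autonomy evaluation gives an agent no incentive to learn can be sketched as a simple decision rule. The interfaces below (propose, answer, revise_with_advice) are assumptions made for illustration, not a prescribed design.

```python
# A minimal sketch, under assumed interfaces, of deferring to a human
# when the agent's own uncertainty about an action is high.
def act(agent, observation, human, uncertainty_threshold=0.5):
    action, uncertainty = agent.propose(observation)            # hypothetical API
    if uncertainty > uncertainty_threshold:
        # Consult a human bystander or supervisor instead of committing
        # to a potentially harmful decision.
        advice = human.answer(f"I am unsure about doing '{action}'. What should I do?")
        action = agent.revise_with_advice(observation, advice)  # hypothetical API
    return action
```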

Efforts to Enhance Human-AI Communication

New subfields of AI have been founded with the goal of democratizing AI and enhancing its utility for regular human users. Explainable AI (Emmert-Streib et al., 2020; Samek et al., 2019) extracts human-intelligible signals about an agent’s decision making to make it more transparent and trustworthy to humans. AI security (Hendrycks et al., 2021; Amodei et al., 2016) studies techniques for strengthening this technology against attacks by malicious humans. Fair and just AI (Barocas et al., 2019; Doshi-Velez & Kim, 2017; Mehrabi et al., 2021) focuses on detecting and preventing the negative social impacts of AI on humans. Human-in-the-loop AI (Fails & Olsen Jr, 2003; Laird et al., 2017; Wang et al., 2021; Zanzotto, 2019) finds ways to incorporate humans into the development and operation of AI systems. Overall, these efforts have made AI agents significantly safer and more approachable for non-expert users. Despite this progress, however, the communication bottlenecks in the traditional learning and evaluation frameworks remain largely unaddressed. In fact, the currently dominant trend in the field is to continue exploiting the traditional frameworks to their limits, proposing benchmarks that challenge agents’ autonomous decision-making capabilities and training agents with cheap, simple forms of learning signals to boost performance on those benchmarks (OpenAI, 2023; Touvron et al., 2023). While this trend has produced agents that are extremely successful in their trained domains, those agents will eventually need to connect with humans in order to benefit them. Nevertheless, the effective, human-compatible communication capabilities required to accomplish that goal will not emerge unless the learning and evaluation criteria intentionally select for them.

Challenges in Advancing Human-AI Communication Research

Incorporating interactions with humans into the traditional learning and evaluation frameworks faces two great challenges: one theoretical and one practical.

The theoretical challenge is to design an end-to-end optimization framework that can accommodate the vast diversity and complexity of human communicative activities. For primitive communication media like rewards or low-level action labels, the learning objective is trivial to determine: maximize the reward or minimize the imitation gap. But for human utterances, it is unclear what single objective an agent should optimize and how diverse types of language feedback should be integrated into the agent’s model.
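For concreteness, each of the two traditional objectives can be stated in a single line, using standard notation assumed here rather than taken from the text: \(\pi\) is the agent’s policy, \(r_t\) the reward at step \(t\), \(\pi^{*}\) the expert policy, and \(\ell\) an action-matching loss. No comparably compact objective is known for learning from free-form human utterances.

```latex
% Reinforcement learning: maximize expected cumulative reward.
% Imitation learning: minimize a loss against the expert policy \pi^*.
\[
  \max_{\pi} \; \mathbb{E}_{\pi}\Big[\textstyle\sum_{t} r_t\Big]
  \qquad\text{versus}\qquad
  \min_{\pi} \; \mathbb{E}_{s}\Big[\ell\big(\pi(s),\, \pi^{*}(s)\big)\Big]
\]
```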

The practical challenge is to reduce the cost and risk of conducting experiments with real humans. Currently, scaling up such experiments to a large human population and hundreds of thousands of episodes requires an immense budget that most academic research groups cannot afford. It is also extremely risky to allow many humans to interact directly with agents that are still under development. Hence, careful design and review processes are needed, further increasing the time and cost of performing these experiments.


References

  1. Turing, A. (1950). Computing machinery and intelligence. Mind, 59(236), 433–460.
  2. Winograd, T. (1971). Procedures as a representation for data in a computer program for understanding natural language. Massachusetts Institute of Technology, Project MAC.
  3. Chollet, F. (2019). On the measure of intelligence. ArXiv Preprint ArXiv:1911.01547.
  4. Levesque, H., Davis, E., & Morgenstern, L. (2012). The Winograd schema challenge. Thirteenth International Conference on the Principles of Knowledge Representation and Reasoning.
  5. Johnson, J., Hariharan, B., van der Maaten, L., Fei-Fei, L., Lawrence Zitnick, C., & Girshick, R. (2017). CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2901–2910.
  6. Sakaguchi, K., Le Bras, R., Bhagavatula, C., & Choi, Y. (2020). WinoGrande: An adversarial Winograd schema challenge at scale. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 8732–8740.
  7. Marge, M., Espy-Wilson, C., Ward, N. G., Alwan, A., Artzi, Y., Bansal, M., Blankenship, G., Chai, J., Daumé III, H., Dey, D., & others. (2022). Spoken language interaction with robots: Recommendations for future research. Computer Speech & Language, 71, 101255.
  8. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias. In Ethics of Data and Analytics (pp. 254–264). Auerbach Publications.
  9. Technica, A. (2016). Tay, the neo-Nazi millennial chatbot, gets autopsied. https://arstechnica.com/information-technology/2016/03/tay-the-neo-nazi-millennial-chatbot-gets-autopsied/
  10. Buolamwini, J., & Gebru, T. (2018). Gender shades: Intersectional accuracy disparities in commercial gender classification. Conference on Fairness, Accountability and Transparency, 77–91.
  11. Feng, S., Wallace, E., Grissom II, A., Iyyer, M., Rodriguez, P., & Boyd-Graber, J. (2018). Pathologies of neural models make interpretations difficult. ArXiv Preprint ArXiv:1804.07781.
  12. Wallace, E., Feng, S., Kandpal, N., Gardner, M., & Singh, S. (2019). Universal Adversarial Triggers for Attacking and Analyzing NLP. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2153–2162. https://doi.org/10.18653/v1/D19-1221
  13. Hernandez, D. (2021). IBM’s Retreat From Watson Highlights Broader AI Struggles in Health. The Wall Street Journal. https://www.wsj.com/articles/ibms-retreat-from-watson-highlights-broader-ai-struggles-in-health-11613839579
  14. Shridhar, M., Thomason, J., Gordon, D., Bisk, Y., Han, W., Mottaghi, R., Zettlemoyer, L., & Fox, D. (2020). Alfred: A benchmark for interpreting grounded instructions for everyday tasks. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 10740–10749.
  15. Amershi, S., Cakmak, M., Knox, W. B., & Kulesza, T. (2014). Power to the people: The role of humans in interactive machine learning. AI Magazine, 35(4), 105–120.
  16. Riedl, M. O. (2019). Human-centered artificial intelligence and machine learning. Human Behavior and Emerging Technologies, 1(1), 33–36.
  17. Xu, W. (2019). Toward human-centered AI: a perspective from human-computer interaction. Interactions, 26(4), 42–46.
  18. Shneiderman, B. (2020). Human-centered artificial intelligence: Reliable, safe & trustworthy. International Journal of Human–Computer Interaction, 36(6), 495–504.
  19. Pretz, K. (2021). Michael I. Jordan explains why today’s artificial-intelligence systems aren’t actually intelligent. IEEE Spectrum. https://spectrum.ieee.org/stop-calling-everything-ai-machinelearning-pioneer-says
  20. Shang, K. K., & Du, R. R. (2021). Disciplining Artificial Intelligence Policies: World Trade Organization Law as a Sword and a Shield (pp. 274–292). https://doi.org/10.1017/9781108954006.015
  21. UNESCO. (2021). Draft text of the Recommendation on the Ethics of Artificial Intelligence. Intergovernmental Meeting of Experts (Category II) Related to a Draft Recommendation on the Ethics of Artificial Intelligence.
  22. Li, L., Chu, W., Langford, J., & Schapire, R. E. (2010). A contextual-bandit approach to personalized news article recommendation. Proceedings of the 19th International Conference on World Wide Web, 661–670.
  23. Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.
  24. Pomerleau, D. A. (1988). Alvinn: An autonomous land vehicle in a neural network. Advances in Neural Information Processing Systems, 1.
  25. Ross, S. (2013). Interactive Learning for Sequential Decisions and Predictions [PhD thesis, Carnegie Mellon University]. https://doi.org/10.1184/R1/6720269.v1
  26. Emmert-Streib, F., Yli-Harja, O., & Dehmer, M. (2020). Explainable artificial intelligence and machine learning: A reality rooted perspective. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 10(6), e1368.
  27. Samek, W., Montavon, G., Vedaldi, A., Hansen, L. K., & Müller, K.-R. (2019). Explainable AI: interpreting, explaining and visualizing deep learning (Vol. 11700). Springer Nature.
  28. Hendrycks, D., Carlini, N., Schulman, J., & Steinhardt, J. (2021). Unsolved problems in ML safety. ArXiv Preprint ArXiv:2109.13916.
  29. Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. ArXiv Preprint ArXiv:1606.06565.
  30. Barocas, S., Hardt, M., & Narayanan, A. (2019). Fairness and Machine Learning. fairmlbook.org.
  31. Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. ArXiv Preprint ArXiv:1702.08608.
  32. Mehrabi, N., Morstatter, F., Saxena, N., Lerman, K., & Galstyan, A. (2021). A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6), 1–35.
  33. Fails, J. A., & Olsen Jr, D. R. (2003). Interactive machine learning. Proceedings of the 8th International Conference on Intelligent User Interfaces, 39–45.
  34. Laird, J. E., Gluck, K., Anderson, J., Forbus, K. D., Jenkins, O. C., Lebiere, C., Salvucci, D., Scheutz, M., Thomaz, A., Trafton, G., & others. (2017). Interactive task learning. IEEE Intelligent Systems, 32(4), 6–21.
  35. Wang, Z. J., Choi, D., Xu, S., & Yang, D. (2021). Putting humans in the natural language processing loop: A survey. ArXiv Preprint ArXiv:2103.04044.
  36. Zanzotto, F. M. (2019). Human-in-the-loop artificial intelligence. Journal of Artificial Intelligence Research, 64, 243–252.
  37. OpenAI. (2023). GPT-4 Technical Report. ArXiv Preprint ArXiv:2303.08774.
  38. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., & others. (2023). Llama 2: Open foundation and fine-tuned chat models. ArXiv Preprint ArXiv:2307.09288.