AI-Human Interaction Standard
Non-normative
Work in progress
V0.1.47
W.R.
The AI-Human Interaction Standard aims to establish principles and guidelines for effective collaboration between humans and Large Language Models (LLMs) for cooperative tasks in software environments.
This is a living document and will be updated over time. This information is intended to serve as a starting point for discussion between developers, researchers, and other stakeholders.
Scope
The AI-Human Interaction Standard considers factors that may impact the safety, security, and reliability of LLMs as they are integrated into society.
The scope of this standard is limited to the following areas:
1 | Accuracy | Language models are prone to a variety of failure modes that make them untrustworthy and potentially dangerous in mission-critical scenarios where human lives are at stake. |
2 | Coordination | By default, human-to-machine interactions with language models inherit communication failure modes that are present in human-to-human interactions, including information omission, misunderstanding, and misinterpretation. |
3 | Compatibility | Software applications designed to be used by humans will likely require modification to support LLM agents, but these requirements are not formally documented and may be overlooked by developers. |
4 | Bad Actors | Malicious users may leverage LLMs to exploit vulnerabilities for nefarious purposes including phishing, fraud, and identity theft. LLMs have capabilities that can be used to jeopardize the security of individuals and organizations in ways that may be difficult to detect. |
5 | Privacy | LLM agents are able to access and process large amounts of data, which will likely introduce privacy concerns beyond the scope of modern regulatory standards. |
Goals
1 | Operational Guidelines | Provide a set of generalized best practices to reduce catastrophic failure due to miscommunication and assumptions between language models and humans. |
2 | Technical Standards | Establish protocol specifications to enable intelligent agents to reliably and safely interact with web applications by providing agent-specific localized context. Initial specs: Action Descriptor Tag; Agent Actions and Permissions Protocol. |
Why is this important?
1 | Humans will soon work alongside language models in real-world scenarios | As language models are deployed in increasingly complex production environments, humans will be responsible for monitoring and managing language models to ensure that they do not cause harm. |
2 | Humans and language models interpret information in fundamentally different ways | This can lead to a variety of communication failure modes involving misunderstandings by both parties. |
3 | Novel challenges arise when humans and language models work together | Each party operates with a different set of explicit and implicit information. Effective collaboration requires both parties to stay in sync and to be aware if the other party is making incorrect assumptions. |
4 | Existing standards/protocols were not designed for language models | Conventional software for humans was not designed with language models in mind. Language models require additional context to perform complex tasks, and existing protocols do not provide a reliable way to inform language models of this context. |
Why now?
By establishing standard protocols and guidelines before the widespread adoption of LLMs, we can help prevent a fragmented ecosystem of incompatible solutions to the same problems. Well-defined technical specifications provide foundational guidance for the next generation of software services and applications, promoting interoperability and reducing the risk of catastrophic failure.
The protocol decisions made today will set the precedent for how we interact with intelligent agents in the future, and will likely have a significant impact on the future of work, education, and society as a whole.
Background
Software is eating the world and AI is eating software.
More specifically, Large Language Models (LLMs) are eating software because they are now capable of reasoning at a level roughly equivalent to that of a human expert in many domains of knowledge.
See GPT-3 and T5 for examples of LLMs that can perform tasks that were previously only possible with professional human expertise.
1 | Intelligent decision-making is now available via API call | In the same way that serverless computing made it possible to build applications without managing infrastructure, LLMs now make it possible to make intelligent decisions without managing a team of human experts. |
2 | LLMs can reduce labor costs and increase operational efficiency | Many tasks that were previously performed by humans can now be performed by LLMs, which can increase ROI and reduce costs. |
3 | LLMs can provide a competitive advantage for first movers | Organizations that do not adopt LLMs will be at a disadvantage relative to those that do, and will eventually be forced to adopt LLMs to remain competitive. |
Competitive pressures may lead to an LLM arms race, where organizations push the boundaries of what is possible with LLMs by using them in increasingly complex and challenging environments to gain an edge in the market. The incentives for organizations to adopt LLMs will likely continue to increase as the technology improves and the cost of LLMs continues to decrease.
A brief history of language models
The past decade has seen a rapid increase in the size and capabilities of language models, and the pace of progress continues to accelerate.
Timeline | |
---|---|
2017 | Attention Is All You Need validates the idea that language models can rely solely on attention mechanisms without the need for recurrent or convolutional layers for machine translation and other natural language processing tasks. This paper introduces the transformer architecture, which revolutionizes the field of NLP and lays the foundation for powerful models like BERT, GPT, and T5. |
2018 | BERT achieves state-of-the-art performance on 11 NLP tasks, including question answering and text classification. BERT (Base) has 110 million parameters, and BERT (Large) has 340 million parameters. |
2020 | GPT-3 demonstrates that transformer performance scales with more layers and more parameters. The largest GPT-3 model has 175 billion parameters and performs tasks that were previously considered to be the domain of humans, such as reading comprehension, question answering, and text generation. |
2022 | ChatGPT takes the world by storm, making a splash in the media and becoming a viral sensation among the tech community and the general public. |
2023 | Bing Chat provides an LLM-enhanced internet search experience that is capable of answering questions using natural language and providing relevant information from the web. |
2023 | GPT-3.5 costs one-tenth the price of GPT-3, supporting a wider range of applications and making the technology more accessible to the general public. |
2023 | GPT-4 achieves benchmark performance at a level comparable to a well-educated human across a wide variety of disciplines, including math, physics, chemistry, biology, and computer science. |
The trajectory of recent progress indicates that LLMs will likely continue to improve in the coming years, and will eventually reach and exceed human-level understanding in many domains of knowledge.
What happens next?
Connecting LLMs directly to applications is the next step in the evolution of software. This will allow humans to interface with tools and services in a way that is more efficient than manually typing and clicking through a graphical user interface (GUI).
Adept is building a transformer model that can use a web browser to interact with web pages and perform tasks just like a human.
"ACT-1 is a large-scale Transformer trained to use digital tools — among other things, we recently taught it how to use a web browser. Right now, it’s hooked up to a Chrome extension which allows ACT-1 to observe what’s happening in the browser and take certain actions, like clicking, typing, and scrolling, etc. The observation is a custom “rendering” of the browser viewport that’s meant to generalize across websites, and the action space is the UI elements available on the page."
OpenAI released ChatGPT plugins, which allow developers to build LLM-based tools that interface with web applications, third-party APIs, and a variety of other services.
"Plugins are tools designed specifically for language models with safety as a core principle, and help ChatGPT access up-to-date information, run computations, or use third-party services."
LLMs will soon be interacting with software in a way that augments and extends the capabilities of human users. A considerable portion of global GDP involves humans interacting with information systems, and LLMs will soon be able to perform many of these tasks with greater speed, accuracy, and efficiency than humans. It is likely that this technology will have a major impact on the way we work, learn, play, and live.
Status quo disruption
Current web infrastructure is not ready for LLM agents. These agents represent a new type of user that doesn't neatly fit into existing categories of humans or traditional bots. As a result, existing web standards do not specifically address their characteristics, and many of the protocols and practices that are currently in use are not designed to support them.
LLM agents can reason like a human while operating at the speed and efficiency of a bot. This combination makes them particularly well suited for interacting with information systems, but it also poses a number of unique challenges that must be addressed before they can be safely deployed at scale.
OpenAI ChatGPT plugins use a variety of techniques intended to mitigate potential safety risks that LLM agents pose to the web ecosystem. However, these techniques are limited by the web standards and protocols that are currently in use.
"To respect content creators and adhere to the web’s norms, our browser plugin’s user-agent token is ChatGPT-User and is configured to honor websites' robots.txt files."
User-agents | User-agents are strings of text sent by web browsers, crawlers, or other software clients to web servers, identifying the client's type, version, and sometimes operating system. This information helps servers to tailor content, appearance, or functionality to suit the specific client, improving the user experience or ensuring proper functionality. |
Robots.txt | Robots.txt is a text file used by website administrators to communicate with web crawlers and other automated agents, providing instructions on which parts of the website should or should not be accessed and indexed. It serves as a guideline for search engines and other bots, helping to prevent the crawling of sensitive or irrelevant content. |
"We have also published our IP egress ranges. Additionally, rate-limiting measures have been implemented to avoid sending excessive traffic to websites."
IP egress ranges | IP egress ranges refer to the set of IP addresses from which outbound network traffic originates. In the context of web services and applications, these ranges are often used to define a group of IP addresses that belong to a specific organization or service, enabling network administrators to apply access controls, monitor traffic, or enforce security policies. |
Rate-limiting | Rate-limiting is a technique used to control the frequency of requests made to a server or API, preventing excessive use or abuse of resources. By imposing limits on the number of requests per user, per IP address, or per client within a specified time window, rate-limiting helps maintain server stability, reduces the risk of denial-of-service (DoS) attacks, and ensures fair access to resources for all users. |
The code below shows an example of a robots.txt file that a website can use to restrict the ChatGPT plugin's access, disallowing the plugin everywhere except two permitted directories:

```
User-agent: ChatGPT-User
Disallow: /
Allow: /directory-1/
Allow: /directory-2/
```
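On the agent side, honoring these rules is voluntary. As a minimal sketch of what voluntary compliance could look like (the domain and paths are placeholders), Python's standard library can evaluate robots.txt rules for a given user-agent:

```python
# Minimal sketch: an agent voluntarily checking robots.txt before
# fetching a page. The domain and paths are placeholders.
from urllib import robotparser

AGENT_USER_AGENT = "ChatGPT-User"

parser = robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()  # fetch and parse the site's robots.txt

for path in ["/directory-1/page", "/private/data"]:
    url = f"https://example.com{path}"
    if parser.can_fetch(AGENT_USER_AGENT, url):
        print(f"allowed: {url}")
    else:
        print(f"blocked by robots.txt: {url}")
```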
The methods described above are not a complete solution; they are a temporary workaround that OpenAI has chosen to implement in order to protect its reputation as a commercial entity and mitigate legal liability. OpenAI can moderate the behavior of its own LLM agents by restricting their access to specific websites. But these measures do not address the root cause of the problem, which is the lack of web standards that specifically address the safety and security risks posed by LLM agents.
These measures do not apply to any other LLM agents. If a malicious actor were to deploy their own LLM agent, they would have no incentive to implement these techniques. They could use a false user-agent string, avoid IP detection via proxy servers, and ignore the robots.txt file altogether. They would be free to perform nefarious activities on the web without any self-imposed restrictions, and they would be able to do so at a much faster rate than a human user.
To mitigate the immediate risks posed by LLM agents, web standards must be updated. This will require the collaboration of the web community, and it will take time to discuss and implement the necessary changes.
Failure modes
LLMs are not yet reliable. Current-generation LLMs have a tendency to confidently deliver coherent yet factually incorrect output. In some scenarios, the errors may be hard for human users to detect unless they are familiar with the subject matter and have time to carefully review the output. This is especially true for users who are inexperienced with language models and unfamiliar with the common failure modes to look out for.
OpenAI's work with InstructGPT is described in the paper Training language models to follow instructions with human feedback which contains examples of instruction-related failure modes.
"Making language models bigger does not inherently make them better at following a user's intent. For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user. In other words, these models are not aligned with their users. In this paper, we show an avenue for aligning language models with user intent on a wide range of tasks by fine-tuning with human feedback. Starting with a set of labeler-written prompts and prompts submitted through the OpenAI API, we collect a dataset of labeler demonstrations of the desired model behavior, which we use to fine-tune GPT-3 using supervised learning."
There are also a variety of novel failure modes that are specific to LLMs. These failure modes can lead to unexpected and undesirable behaviors that can also be used for malicious purposes by bad actors.
Title | Summary |
---|---|
Anomalous Tokens | Out-of-distribution tokens lead to non-deterministic output at 0 temperature |
Prompt Injection | "Sideloading" information into Bing Chat via web page text |
Mode Collapse | Strange behaviors and favorite random numbers |
New LLM architectures, training methods, and data sets may introduce new failure modes that are not present in the current generation of LLMs.
How do we handle failure modes?
Considerable effort has been made to understand and mitigate unwanted behavior through RLHF (Reinforcement Learning from Human Feedback) and related techniques. However, these techniques are not a silver bullet, and they do not guarantee that LLMs will never exhibit unwanted behavior.
There is no generalized methodology to prevent LLM failures in a production environment. This is currently a major obstacle that is slowing the adoption of LLMs in safety-critical domains such as healthcare, finance, and law, where the consequences of failure can be severe.
Implicit information
Language models are trained on massive amounts of data, and like humans, they are often able to derive the implied meaning of words and phrases by observing how they are used in context. But this is not always the case.
By using natural language to communicate with LLMs, we are bound by the limitations of language itself. English is full of lossy semantic compression. Over many years, humans have compressed context in a way that provides the optimal balance of specificity and efficiency for our interactions with each other. We offload the burden of ambiguity to the context of the conversation, and we rely on the shared knowledge of the participants to fill in the gaps. This is generally not a problem in conversations between humans, but it can lead to confusion when interacting with LLMs that are not aware of context or locally defined information.
As we use LLMs to perform tasks that are more complex and require more context, we must also be more explicit in our communication with them. We can't just tell LLMs what we want them to do, we have to tell them how to do it, or they will make assumptions about what we mean.
Here is an example of implicit information contained in a natural language command, and how it can lead to failure if the assumptions are incorrect.
Human command:
"Schedule a meeting with the marketing team next week."
Explicit | Implicit | Intent |
---|---|---|
Schedule a meeting | Schedule it now | Create an event in the calendar application |
With the marketing team | Marketing team members | Identify the members of the marketing team and include them in the meeting invite |
Next week | Meeting is next week | Use the current date and time to determine the correct week for the meeting |
Expected Behavior:
1 | Meeting should be during normal business hours. |
2 | Meeting should be on a weekday. |
3 | Default duration of the meeting (e.g., 1 hour) is acceptable. |
4 | Agent has the necessary information to identify the marketing team members. |
5 | Agent has access to the calendar application and can schedule meetings on behalf of the user. |
Potential Failure Modes:
1 | Agent schedules the meeting outside of normal business hours or on a weekend, causing inconvenience for the attendees. |
2 | Agent misunderstands the term "next week" and schedules the meeting in the current week or further in the future. |
3 | Agent incorrectly identifies the members of the marketing team, resulting in some being left out or non-members being added to the meeting. |
4 | Agent schedules the meeting for an inappropriate duration, either too short or too long, causing the meeting to be less productive. |
5 | Agent schedules the meeting in a time zone different from the majority of attendees, leading to confusion and missed attendance. |
6 | Agent fails to check the availability of the marketing team members before scheduling the meeting, resulting in conflicts and rescheduling. |
7 | Agent schedules the meeting without a clear agenda, causing confusion and lack of focus for the attendees. |
8 | Agent does not account for holidays or other company-wide events when scheduling the meeting, leading to potential scheduling conflicts. |
9 | Agent does not provide necessary details or attachments for the meeting, resulting in attendees being unprepared. |
10 | Agent experiences technical issues when accessing the calendar application, leading to a failure to schedule the meeting or multiple duplicate events being created. |
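One hedge against failure modes like these is to expand the implicit command into an explicit, machine-verifiable structure before any action is taken. The sketch below is hypothetical (the field names, defaults, and validation rules are illustrative) and shows how the assumptions in the tables above could be surfaced for human review:

```python
# Hypothetical sketch: an explicit representation of the implicit
# command, with assumed defaults flagged for human confirmation.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class MeetingRequest:
    title: str
    attendees: list[str]                      # resolved from "the marketing team"
    start: datetime                           # resolved from "next week"
    duration: timedelta = timedelta(hours=1)  # assumed default duration
    timezone: str = "UTC"                     # assumed; must be confirmed

    def validate(self) -> list[str]:
        """Return human-readable warnings for a reviewer to confirm."""
        warnings = []
        if self.start.weekday() >= 5:
            warnings.append("Meeting falls on a weekend.")
        if not 9 <= self.start.hour < 17:
            warnings.append("Meeting is outside normal business hours.")
        if not self.attendees:
            warnings.append("No attendees were resolved for 'marketing team'.")
        return warnings

request = MeetingRequest(
    title="Marketing sync",
    attendees=["alice@example.com", "bob@example.com"],
    start=datetime(2023, 5, 16, 10, 0),
)
for warning in request.validate():
    print(warning)
```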
Human in the loop
Because LLMs can fail in surprising ways, it is important to have a human-in-the-loop to verify the output of the LLM before the LLM is allowed to perform actions related to the task.
There are 3 main ways that humans interact with software today:
API | Machine readable interface for software to communicate with other software. |
CLI | Machine readable interface for humans to communicate with software. |
GUI | Human readable interface for humans to communicate with software. |
- LLMs can perform tasks via API/CLI, but they require human approval and verification before performing them.
- Technical users are comfortable working with API/CLI tools, but non-technical users tend to strongly prefer a GUI to interact with software.
- A significant portion of the global population is not comfortable working with API/CLI tools, so it is reasonable to assume that the GUI will likely be the primary interface for LLMs to interact with humans at scale in the future.
This leads to the following set of circumstances: non-technical users will manage intelligent agents performing actions on their behalf in third-party software applications via GUI, and these agents will also interact with humans via GUI.
This represents a new interaction paradigm. It is not yet clear how this interaction should be handled, so it is important to establish a standard to ensure that LLMs are used safely and effectively.
If a standard is not established, LLMs will likely be used in ways that are potentially harmful and almost certainly suboptimal.
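As a minimal illustration of a human-in-the-loop gate, the agent can be required to present each proposed action for explicit approval before executing it. The action format and prompt below are hypothetical:

```python
# Minimal sketch: a human must approve each proposed action before
# the agent executes it. The action and prompt format are hypothetical.
from typing import Callable

def execute_with_approval(description: str, action: Callable[[], None]) -> bool:
    """Show the proposed action to a human and run it only on approval."""
    print(f"Agent proposes: {description}")
    answer = input("Approve? [y/N] ").strip().lower()
    if answer == "y":
        action()
        return True
    print("Action rejected; nothing was executed.")
    return False

execute_with_approval(
    "Schedule 'Marketing sync' for Tuesday 10:00-11:00",
    lambda: print("(calendar API call would happen here)"),
)
```

In practice, the approval surface would more likely be a GUI than a terminal prompt, per the observations above.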
Context and feedback
Context | Humans must provide context and verify the output of the LLM before the LLM is allowed to perform actions related to the task. |
Feedback | Humans must provide feedback to the LLM to improve its performance over time. |
Humans and LLMs will be interacting with one another in a variety of ways, so it is important to define the scope of the interaction to ensure that the correct standards are applied.
Example:
INTERACTION | DEFINED BY |
---|---|
How humans interact with humans | Conversational etiquette, cultural norms, local dialect |
How humans interact with applications | Authentication, role permissions, terms of service |
How humans interact with LLMs | Conversational etiquette, cultural norms, local dialect |
The scope of the interaction can be governed by external factors, such as the application that the LLM is interacting with, or it can be governed by the LLM itself.
INTERACTION | DEFINED BY |
---|---|
How LLMs interact with humans | RLHF, moderation filters, API level limits |
How LLMs interact with applications | Authentication, role permissions, terms of service |
How LLMs interact with LLMs | RLHF, moderation filters, API level limits |
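As a rough sketch of how application-level controls like role permissions might scope what an agent is allowed to do (the scope names are hypothetical and not part of any existing standard):

```python
# Hypothetical sketch: deny-by-default role permissions for an agent.
AGENT_ROLE_PERMISSIONS = {
    "calendar.read": True,    # agent may read availability
    "calendar.write": True,   # agent may create events
    "contacts.read": True,    # agent may resolve team members
    "email.send": False,      # sending mail requires a human
}

def is_permitted(scope: str) -> bool:
    """Unknown scopes are treated as not granted."""
    return AGENT_ROLE_PERMISSIONS.get(scope, False)

assert is_permitted("calendar.write")
assert not is_permitted("payments.write")  # never granted
```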
Prompt engineering
Prompt engineering is a new field that involves designing natural language prompts that produce specific desired output from AI models. Natural language is full of ambiguity and implicit semantics, making prompt engineering a challenging task that is distinct from traditional programming.
To achieve clear communication with language models, we need to establish protocols capable of handling the complexity of human language in order to ensure that LLMs interpret prompts accurately.
Natural language is low specificity, code is high specificity. A hybrid of the two may produce the optimal solution for handling fuzzy logic and implicit semantics.
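As an illustration of such a hybrid, the sketch below pairs a natural language instruction with an explicit output schema; the schema and field names are illustrative, not a proposed standard:

```python
# Illustrative sketch: natural language states the intent, while an
# explicit JSON schema pins down the expected output format.
import json

OUTPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "date": {"type": "string", "description": "ISO 8601 date"},
        "start_time": {"type": "string", "description": "24-hour HH:MM"},
        "attendees": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["date", "start_time", "attendees"],
}

prompt = (
    "Schedule a meeting with the marketing team next week.\n"
    "Respond ONLY with JSON matching this schema:\n"
    + json.dumps(OUTPUT_SCHEMA, indent=2)
)
print(prompt)
```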
Verification and validation
Asking an LLM to explain how it came to a conclusion is a useful way to explore the model's reasoning, but this method does not necessarily validate the output.
Advanced techniques like Causal Tracing can be utilized to trace the flow of information through a model, as described in Locating and Editing Factual Associations in GPT.
"By isolating the causal effect of individual states within the network while processing a factual statement, we can trace the path followed by information through the network."
Concepts are represented as a set of tokens, and the causal flow of information is represented as a directed graph. This method allows us to visualize the flow of information through the network and identify the causal path that led to the output of the LLM.
This method is useful for debugging and tracking information that is contained in the model, but it is very involved and requires sufficient technical expertise to be useful. It is not practical for most users to perform this analysis on a regular basis to verify the validity of the output, much less to perform it on a large scale in a production environment.
Chained reasoning and action
More complex tasks often require reasoning and action to be performed over multiple steps in a specific order to achieve the desired outcome.
Large Language Models are Zero-Shot Reasoners proposes a zero-shot version of the chain-of-thought (CoT) technique, demonstrating that LLMs can work through multi-step problems when prompted to reason step by step. Prompts can be chained together in this way to allow LLMs to reason and act in a coordinated multi-step process.
Token windows represent the amount of context that an LLM has access to for a given task (e.g. 8K or 32K tokens for GPT-4). By chaining reasoning and action, we can increase the amount of context that the LLM has access to for a given task, and we can manage the memory that is within the LLM's token window to ensure that the LLM has access to the correct information at the correct time.
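A minimal sketch of this kind of token window management follows; the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and the budget is illustrative:

```python
# Minimal sketch: keep the most recent context that fits the budget.
# Real systems would use the model's actual tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # crude approximation

def fit_to_window(messages: list[str], budget: int = 8000) -> list[str]:
    """Drop the oldest messages until the remainder fits the budget."""
    kept: list[str] = []
    used = 0
    for message in reversed(messages):  # walk newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))  # restore chronological order
```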
Complex data transformation use cases often require chained reasoning and multiple action prompts. This allows the LLM to break down large tasks into a series of smaller steps that can be performed in a specific order.
For example, to schedule a meeting, the LLM must first determine the date and time of the meeting, then identify the attendees, and finally create the calendar event.
Applications like Google Calendar and Outlook provide APIs that can be used to schedule meetings. Calling these APIs requires the LLM to perform actions in a specific order: first request a token that can be used to authenticate the API call, then send a request to the API, and finally parse the response to determine what action to take next.
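The sketch below illustrates this ordering with a hypothetical calendar API (the base URL, endpoints, and payloads are placeholders; real calendar APIs differ in their details):

```python
# Hypothetical sketch: authenticate, resolve attendees, then create
# the event, parsing each response before taking the next step.
import requests

BASE_URL = "https://calendar.example.com/api"  # placeholder endpoint

def schedule_meeting(team: str, title: str) -> dict:
    # Step 1: obtain a token to authenticate subsequent calls.
    token = requests.post(f"{BASE_URL}/auth", json={"client": "agent"}).json()["token"]
    headers = {"Authorization": f"Bearer {token}"}

    # Step 2: resolve the team name into concrete attendees.
    members = requests.get(f"{BASE_URL}/teams/{team}/members", headers=headers).json()

    # Step 3: create the event; the response determines the next action
    # (e.g., surfacing conflicts to the human for approval).
    response = requests.post(
        f"{BASE_URL}/events",
        headers=headers,
        json={"title": title, "attendees": members},
    )
    return response.json()
```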
The following papers are also relevant to this topic:
Title | Summary |
---|---|
ReAct: Synergizing Reasoning and Acting in Language Models | LLMs can generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two. |
Language Models are Few-Shot Learners | Scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. |
Toolformer: Language Models Can Teach Themselves to Use Tools | Language models can teach themselves to use external tools via simple API calls, deciding which APIs to call, when to call them, and how to incorporate the results. |
Interaction types
The following interactions are fundamental to the human experience, and should be considered when designing the standard for AI-Human interaction.
Note: This section is a very early draft of interaction types and is subject to change as the standard is developed. This list is not intended to be a comprehensive list of all possible interactions, but rather a starting point for discussion. Interactions are not mutually exclusive, and they can be combined to create more complex interactions.
AGENT | |
---|---|
Acknowledge | Agent should be able to acknowledge the user's request and provide a confirmation that the request was received. |
Approve | Agent should know when to ask for approval from the user, and the user should be able to easily understand the information being presented to them. |
Reject | Agent should know when to reject a request from the user, and the user should be able to easily understand why the request was rejected. |
Accept | Agent should know when to accept a request from the user, and the user should be able to easily understand why the request was accepted. |
Confirm | Agent should be able to confirm the user's request and provide a confirmation that the request was completed. |
Request | Agent should be able to request additional information from the user when needed. |
Explain | Agent should be able to retroactively explain why it took the action it did. |
Clarify | Agent should be able to clarify ambiguous information when needed. |
Resolve | Agent should be able to resolve conflicts when needed. |
Reply | Agent should be able to reply to the user when needed. |
Notify | Agent should be able to notify the user when needed. |
Warn | Agent should be able to warn the user when needed. |
Alert | Agent should be able to alert the user when needed. |
HUMAN | |
---|---|
Override | Humans should be able to override the Agent's decision-making process as needed. |
Revert | Humans should be able to revert the Agent's actions as needed. |
Recover | Humans should be able to recover from errors as needed. |
Retry | Humans should be able to retry failed actions as needed. |
Cancel | Humans should be able to cancel pending actions as needed. |
Terminate | Humans should be able to terminate the Agent as needed. |
Restart | Humans should be able to restart the Agent as needed. |
Resume | Humans should be able to resume the Agent as needed. |
Suspend | Humans should be able to suspend the Agent as needed. |
Pause | Humans should be able to pause the Agent as needed. |
SYSTEM | |
---|---|
Audit | All actions performed by the Agent should be logged and auditable without exception. |
Debug | All actions performed by the Agent should be debuggable without exception. |
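As one possible encoding, the draft interaction types above could be represented as typed messages in an agent-human protocol. Everything below is an early, hypothetical sketch rather than a finalized wire format:

```python
# Hypothetical sketch: interaction types as typed, auditable messages.
from dataclasses import dataclass
from enum import Enum

class AgentInteraction(Enum):
    ACKNOWLEDGE = "acknowledge"
    CONFIRM = "confirm"
    REQUEST = "request"
    EXPLAIN = "explain"
    CLARIFY = "clarify"
    NOTIFY = "notify"
    WARN = "warn"
    ALERT = "alert"

@dataclass
class AgentMessage:
    interaction: AgentInteraction
    body: str
    audit_id: str  # every message is logged, per the Audit requirement

message = AgentMessage(AgentInteraction.REQUEST, "Which week did you mean by 'next week'?", "evt-001")
print(message)
```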
Adapting legacy protocols
To establish effective AI-Human interaction, it is crucial to adapt existing protocols to maximize adoption and minimize friction.
It is difficult to represent information in a way that is optimally human-readable and machine-parsable. Since the goal of AI-Human interaction is to allow humans and LLMs to work together to solve problems, the protocol should be designed to allow humans to interact with the LLM in a way that is as natural as possible.
Inspiration can be drawn from existing protocols that already have a high degree of adoption and are capable of being extended to support AI-Human interaction.
- Web-level protocols demonstrate robust behavior: HTTP is designed to be resilient to failure, and it is capable of recovering from errors and continuing to function.
- In the same way, protocols for LLMs and intelligent agents should be designed to be resilient to failure, and they should be capable of helping agents to recover from errors and continue to function.
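As a minimal sketch of this kind of resilience, an agent can retry transient failures with exponential backoff instead of abandoning the task (the retry counts and delays are illustrative):

```python
# Minimal sketch: retry a failing action with exponential backoff,
# surfacing the error to a human only after retries are exhausted.
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def with_retries(action: Callable[[], T], max_attempts: int = 3, base_delay: float = 1.0) -> T:
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception as error:
            if attempt == max_attempts:
                raise  # give up and surface the error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({error}); retrying in {delay}s")
            time.sleep(delay)
```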
Protocol | |
---|---|
HTML | The Hypertext Markup Language (HTML) is the standard markup language for creating web pages and web applications. |
HTTP | The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed, collaborative, hypermedia information systems. |
TCP | The Transmission Control Protocol (TCP) is one of the main protocols of the Internet protocol suite. It originated in the initial network implementation in which it complemented the Internet Protocol (IP). |
IP | The Internet Protocol (IP) is the principal communications protocol in the Internet protocol suite for relaying datagrams across network boundaries. Its routing function enables internetworking and essentially establishes the Internet. |
DNS | The Domain Name System (DNS) is a hierarchical and decentralized naming system for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participating entities. |
SMTP | The Simple Mail Transfer Protocol (SMTP) is an Internet standard for electronic mail (email) transmission. |
TLS | Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), are cryptographic protocols that provide communications security over a computer network. |
SSH | Secure Shell (SSH) is a cryptographic network protocol for operating network services securely over an unsecured network. |
FTP | The File Transfer Protocol (FTP) is a standard network protocol used for the transfer of computer files between a client and server on a computer network. |
Feedback
This page represents an informal and incomplete draft of the AI-Human Interaction Standard (AHIS).
- This project aims to develop a generalized protocol for task-oriented interactions between humans and language models at the application level, and to provide a set of guidelines for the design of human-machine interfaces that are optimized for collaboration between humans and LLMs.
- The AI-Human Interaction Standard (AHIS) proposed here is an initial draft and is open for feedback, suggestions, and collaborative refinement from the broader community. We acknowledge that the success and effectiveness of this protocol relies heavily on collective input and perspectives from experts, practitioners, and end-users.
- We encourage contributions to help shape and define a more robust and comprehensive standard, ultimately leading to an enhanced collaboration protocol for humans and AI systems. Please note that the current version of AHIS should be considered a foundation for further development and improvement, not a finalized standard.