The Emerging AI Agent Protocol Stack

From Chatbots to an Internet of Agents

Jun 21, 2026

As a follow-up to my earlier note (AI Software Engineering: current shifts and constraints), I wanted to map the agent-protocol ecosystem as it stands in June 2026. The public narrative often sounds settled: this vendor leads, that protocol is the standard, this product solves the whole problem. The reality is more fragmented and more interesting.

Context

The software industry is now working toward a world in which AI agents can discover capabilities, use tools, delegate tasks to other agents, transact with businesses, and carry context across systems. Proponents sometimes refer to this emerging environment as the Internet of Agents, the Agentic Web, or the Agent Economy.

The analogy to the Internet is appealing, but it needs qualification. We do not yet have one coherent Internet of agents. We have a rapidly growing collection of protocols, schemas, registries, security frameworks, browser APIs, and domain-specific experiments. Some are maturing quickly; others remain early specifications or newly formed community groups.

The best way to understand this landscape is not by memorizing project names. It is by examining the problems that must be solved before independently built agents can work together.

Internet of Agents, Agentic Web, Agent Economy conceptual illustration.

Current landscape

Because the ecosystem is crowded and the public conversation is noisy, it helps to organize today's initiatives around six questions that interoperable agent systems must answer:

Tools and capabilities access: How do agents access and use tools and data?
Agent collaboration: How do agents delegate and coordinate work?
Discovery and description: How does one agent find and understand another?
Trust and governance: Should an agent be allowed to act, and under whose authority?
Domain transactions: How do agents perform consequential actions and manage domain-specific operations?
Memory and state: What should an agent remember, who controls that memory, and how should memories be managed and shared?

These are not six fixed layers of a universally agreed stack. They are six problem domains that today's initiatives approach from different angles.

Emerging agent ecosystem: six domains interoperable agent systems must address.

Some initiatives fit mostly inside one domain. Others, such as AGNTCY, span several domains at once. That overlap is part of the point: the emerging agent ecosystem is not yet a clean stack, but a set of partially overlapping attempts to solve adjacent interoperability problems.

1. Tools and capabilities access

How do agents access and use tools and data?

An agent becomes practically useful when it can act beyond the model that powers it. It may need to search an internal knowledge base, query a database, inspect a file, update a calendar, invoke an enterprise API, or execute a development tool. Historically, each application needed custom code for every such connection.

1a. Model Context Protocol (MCP)

The most prominent attempt to standardize this connection layer is the Model Context Protocol, or MCP. MCP defines a common way for an AI application to connect to external systems. An MCP server can expose three principal kinds of capabilities:

Resources, such as documents, records, or application data;
Tools, which perform operations or invoke external systems;
Prompts, which provide reusable interaction templates.

MCP is frequently called the “USB-C of AI”. The analogy is useful: rather than building a separate connector for every pairing of AI application and service, developers implement one shared interface. However, the analogy is also incomplete. MCP is not just a plug shape, and it does not automatically make an integration safe.

Its architecture includes lifecycle management, capability negotiation, transports, authorization mechanisms, and defined client-server responsibilities. Implementers must still decide which tools an agent may access, which user or organization it represents, and whether a particular action should require approval.

A better mental model is: MCP standardizes how an AI application can inspect and invoke capabilities exposed by another system.

That system may be a local program, a remote service, a data source, or even something implemented internally by another agent. The distinction between “tool” and “agent” is sometimes architectural rather than absolute.
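To make the "inspect and invoke" model concrete, here is a minimal sketch in plain Python. It is not the official MCP SDK: the `tool` decorator, the `handle` dispatcher, and the `search_kb` tool are hypothetical names of mine, and the result shape is simplified. The only parts borrowed from the protocol are the JSON-RPC 2.0 framing and the `tools/list` and `tools/call` method names.

```python
import json
from typing import Any, Callable

# Hypothetical in-memory stand-in for an MCP-style server (not the official SDK).
TOOLS: dict[str, dict[str, Any]] = {}


def tool(name: str, description: str, input_schema: dict[str, Any]):
    """Register a function as a named tool with a description and a JSON Schema."""
    def register(fn: Callable[..., Any]) -> Callable[..., Any]:
        TOOLS[name] = {"description": description, "inputSchema": input_schema, "fn": fn}
        return fn
    return register


@tool(
    name="search_kb",
    description="Search an internal knowledge base (toy data).",
    input_schema={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
)
def search_kb(query: str) -> list[str]:
    docs = ["Expense policy 2026", "Travel policy", "Onboarding guide"]
    return [d for d in docs if query.lower() in d.lower()]


def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Dispatch a JSON-RPC 2.0 style request to the inspect or invoke path."""
    method, params = request["method"], request.get("params", {})
    if method == "tools/list":  # inspect: what can this system do?
        result: dict[str, Any] = {"tools": [
            {"name": n, "description": t["description"], "inputSchema": t["inputSchema"]}
            for n, t in TOOLS.items()
        ]}
    elif method == "tools/call":  # invoke: run one named capability
        entry = TOOLS.get(params.get("name"))
        if entry is None:
            return {"jsonrpc": "2.0", "id": request["id"],
                    "error": {"code": -32602, "message": "Unknown tool"}}
        # Simplified result shape; the real protocol wraps output in content blocks.
        result = {"content": entry["fn"](**params.get("arguments", {}))}
    else:
        return {"jsonrpc": "2.0", "id": request["id"],
                "error": {"code": -32601, "message": "Method not found"}}
    return {"jsonrpc": "2.0", "id": request["id"], "result": result}


listing = handle({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
call = handle({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "search_kb", "arguments": {"query": "policy"}},
})
print(json.dumps(call["result"]))  # {"content": ["Expense policy 2026", "Travel policy"]}
```

Note what the sketch leaves out: nothing here decides whether the caller should be allowed to run `search_kb`, or on whose behalf. Those decisions remain with the implementer.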

1b. WebMCP

The proposed draft WebMCP API (Google, Microsoft) brings a related idea into the browser. A web application can already perform useful operations through its JavaScript code: search a catalog, compose a message, modify a document, submit a form, or schedule an appointment. Yet a browser agent often has to infer how to accomplish those operations by interpreting and manipulating the visible user interface, as in Anthropic’s computer use tool or OpenAI’s Operator.

WebMCP proposes a JavaScript interface that allows a page to expose those operations as structured tools with names, descriptions, and schemas. Instead of guessing which button to click, an agent could receive machine-readable declarations of tools such as:

search_inventory
add_item_to_cart
compare_products
schedule_appointment
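To show what "names, descriptions, and schemas" might buy an agent, here is a small sketch. It is written in Python rather than browser JavaScript to keep one language across this note, and it does not reproduce the WebMCP draft: the `PAGE_TOOLS` structure, its field names, and `validate_call` are my own assumptions. The point is only that a declared schema lets a call be checked before anything touches the page.

```python
from typing import Any

# Hypothetical declarations a page might expose. Field names are illustrative
# assumptions, not taken from the WebMCP draft.
PAGE_TOOLS: dict[str, dict[str, Any]] = {
    "search_inventory": {
        "description": "Search the catalog by keyword.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}, "max_results": {"type": "integer"}},
            "required": ["query"],
        },
    },
    "add_item_to_cart": {
        "description": "Add a product to the shopping cart.",
        "inputSchema": {
            "type": "object",
            "properties": {"sku": {"type": "string"}, "quantity": {"type": "integer"}},
            "required": ["sku", "quantity"],
        },
    },
}


def _matches(value: Any, json_type: str) -> bool:
    """Check a value against the two JSON Schema types this sketch supports."""
    if json_type == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if json_type == "string":
        return isinstance(value, str)
    return False


def validate_call(tool_name: str, arguments: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the call fits the declaration."""
    tool = PAGE_TOOLS.get(tool_name)
    if tool is None:
        return [f"unknown tool: {tool_name}"]
    schema = tool["inputSchema"]
    problems = [
        f"missing required argument: {key}"
        for key in schema["required"] if key not in arguments
    ]
    for key, value in arguments.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected argument: {key}")
        elif not _matches(value, spec["type"]):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


print(validate_call("add_item_to_cart", {"sku": "A1", "quantity": "2"}))
# ['wrong type for quantity: expected integer']
```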

This approach could make browser automation more reliable and accessible than coordinate-based clicking or repeated interpretation of page layouts.

WebMCP should still be described carefully. It is an emerging web API proposal, not yet a universally implemented browser capability. Its importance lies in the architectural direction it represents: websites may eventually serve both human interfaces and explicit agent-facing interfaces.

2. Agent collaboration

How do agents delegate and coordinate work?

Giving an agent access to a tool is not always equivalent to enabling it to collaborate with another autonomous system. A tool generally exposes a bounded operation: retrieving a record, calculating a route, or submitting a form. An agent may instead accept a goal, ask questions, negotiate constraints, delegate subtasks, produce intermediate results, and continue working over an extended period.

Making this possible requires at least two related layers:

Collaboration semantics, defining tasks, messages, artifacts, and delegation;
Communication infrastructure, securely transporting those interactions across machines, networks, and organizational boundaries.

2a. Agent2Agent (A2A) Protocol

The Agent2Agent Protocol⁠ (A2A), originally developed by Google and later donated to the Linux Foundation, defines a standard interaction model for independently implemented agents. Its purpose is to let agents communicate across differences in vendor, programming language, deployment environment, and orchestration framework.

An A2A agent can publish an Agent Card describing information such as:

its identity and provider;
its service endpoint;
the skills it offers;
supported input and output modes;
authentication requirements;
capabilities such as streaming and notifications.

A client agent can then send messages, delegate tasks, receive artifacts, monitor progress, and interact with operations that may take longer than a conventional API request.
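As a rough illustration, an Agent Card can be pictured as a small structured document that a client reads before deciding whether to delegate. The field names below approximate the A2A Agent Card as I understand it and should be checked against the current specification; the agent, its endpoint, and the `pick_skill` helper are hypothetical.

```python
from typing import Any, Optional

# Illustrative card for a made-up hotel agent. Field names approximate the
# A2A Agent Card; verify them against the specification before relying on them.
AGENT_CARD: dict[str, Any] = {
    "name": "Hotel Agent",
    "description": "Evaluates accommodations against traveler preferences.",
    "url": "https://agents.example.com/hotel",  # placeholder endpoint
    "version": "1.0.0",
    "provider": {"organization": "Example Hotels Ltd."},
    "capabilities": {"streaming": True, "pushNotifications": False},
    "defaultInputModes": ["text/plain"],
    "defaultOutputModes": ["text/plain", "application/json"],
    "securitySchemes": {"bearer": {"type": "http", "scheme": "bearer"}},
    "skills": [
        {"id": "search-hotels", "name": "Search hotels", "tags": ["hotel", "search"]},
        {"id": "compare-rates", "name": "Compare rates", "tags": ["hotel", "pricing"]},
    ],
}


def pick_skill(card: dict[str, Any], tag: str, output_mode: str) -> Optional[str]:
    """Return the id of the first skill carrying `tag`, if the card supports `output_mode`."""
    if output_mode not in card.get("defaultOutputModes", []):
        return None
    for skill in card.get("skills", []):
        if tag in skill.get("tags", []):
            return skill["id"]
    return None


print(pick_skill(AGENT_CARD, "pricing", "application/json"))  # compare-rates
```

The client learns what the agent offers and how to reach it, but nothing about the prompts, models, or tools behind it.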

Consider a travel-planning assistant. It might use:

an airline agent to negotiate flight options;
a hotel agent to evaluate accommodations;
an enterprise travel-policy agent to check compliance;
a payment agent to complete an authorized transaction.

These participants may be operated by different organizations and may not reveal their internal prompts, models, tools, or proprietary workflows. A2A is designed to let them collaborate while remaining operationally opaque to one another.

A useful distinction is: MCP gives an agent access to capabilities. A2A lets independently operated agents collaborate as agents.

The two can be used together. An agent may use MCP internally to access its tools while exposing its broader services to other agents through A2A.⁠
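What separates delegating a task from calling a tool is largely the lifecycle: a task can pause for input, resume, and finish much later with an artifact. The sketch below models that lifecycle as a small state machine. The state names are loosely modeled on A2A's task states, but the transition table and the `Task` class are my own simplification, not the protocol's definition.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class TaskState(Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Transitions this sketch permits (an assumption); terminal states allow none.
ALLOWED: dict[TaskState, set[TaskState]] = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.FAILED, TaskState.CANCELED},
    TaskState.WORKING: {
        TaskState.INPUT_REQUIRED, TaskState.COMPLETED,
        TaskState.FAILED, TaskState.CANCELED,
    },
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.CANCELED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
    TaskState.CANCELED: set(),
}


@dataclass
class Task:
    task_id: str
    state: TaskState = TaskState.SUBMITTED
    history: list[str] = field(default_factory=lambda: ["submitted"])
    artifacts: list[dict[str, Any]] = field(default_factory=list)

    def transition(self, new_state: TaskState) -> None:
        """Move to `new_state`, refusing transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.value} -> {new_state.value} is not allowed")
        self.state = new_state
        self.history.append(new_state.value)

    def complete(self, artifact: dict[str, Any]) -> None:
        """Finish the task and attach its result."""
        self.transition(TaskState.COMPLETED)
        self.artifacts.append(artifact)


task = Task("trip-42")
task.transition(TaskState.WORKING)
task.transition(TaskState.INPUT_REQUIRED)  # the remote agent asks a clarifying question
task.transition(TaskState.WORKING)         # the client answers and work resumes
task.complete({"name": "itinerary", "text": "2 economy tickets to Paris"})
print(task.history)
# ['submitted', 'working', 'input-required', 'working', 'completed']
```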

2b. AGNTCY Collaboration and SLIM

Cisco originally developed the AGNTCY⁠ project and later contributed it to the Linux Foundation, addressing agent collaboration as part of a broader open-source infrastructure stack. A central component of this work is SLIM (Secure Low-Latency Interactive Messaging). Whereas A2A defines agent-level concepts such as messages, tasks, delegation, and coordination, SLIM focuses on the underlying communication substrate that securely supports those interactions.

SLIM provides capabilities such as:

secure message transport;
message routing;
communication across network boundaries;
group communication for multi-agent systems;
end-to-end encryption;
support for distributed agent deployments.

The relationship can be summarized as: A2A defines what agents say and how collaborative tasks behave; SLIM provides secure infrastructure for transporting those interactions.

SLIM is therefore not simply an alternative to A2A. It can operate beneath application-level protocols such as A2A, allowing them to focus on agent semantics rather than networking, routing, and transport security. This distinction resembles conventional Internet architecture. An application protocol defines the meaning and structure of an interaction, while lower-level infrastructure determines how information is delivered between participants.

AGNTCY extends beyond SLIM. Its broader ecosystem also includes components for agent discovery, identity, description, and observability. For that reason, AGNTCY spans several problem domains in the emerging agent stack rather than fitting neatly into a single category.⁠⁠⁠

3. Discovery and description

How does one agent find and understand another?

Communication protocols are useful only after participants know where to connect and what the other party can do. The conventional web relies on several distinct mechanisms: DNS resolves names, search engines index content, URLs identify resources, schemas describe data, and certificates help authenticate endpoints. Agent ecosystems will likely need similarly distinct mechanisms rather than one universal “agent directory”.

3a. AGNTCY

AGNTCY⁠ appears again here because its stack also covers discovery and description. Rather than being one protocol, AGNTCY is better understood as a stack of complementary components for agent interoperability. Its work encompasses areas such as:

agent description;
discovery and directories;
identity;
secure messaging;
observability.

One of its components is the Open Agentic Schema Framework, or OASF.

3b. Open Agentic Schema Framework (OASF)

OASF⁠ defines an extensible model for describing agents and their attributes. This matters because names and prose descriptions alone are insufficient for machine discovery. Another system may need to determine:

what skills an agent possesses;
what domains it supports;
which protocols it speaks;
what authentication it requires;
who operates it;
which version is deployed;
where it can be reached.

OASF aims to make those attributes machine-readable and extensible. It can describe various agent-facing entities, including A2A agents and MCP servers.
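A sketch may help show why machine-readable attributes matter for discovery. The record below is not the OASF schema: `AgentRecord`, its field names, and the sample directory are assumptions chosen to mirror the attribute list above. Once those attributes are structured, finding a suitable agent becomes a query rather than a reading exercise.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical directory entry; the fields mirror the list above, not OASF itself."""
    name: str
    version: str
    operator: str
    locator: str
    auth: str
    skills: frozenset[str]
    domains: frozenset[str]
    protocols: frozenset[str]


DIRECTORY = [
    AgentRecord("hotel-agent", "1.2.0", "Example Hotels", "https://agents.example.com/hotel",
                "oauth2", frozenset({"search-hotels"}), frozenset({"travel"}), frozenset({"a2a"})),
    AgentRecord("fx-rates", "0.9.1", "Example Bank", "https://tools.example.com/fx",
                "api-key", frozenset({"currency-conversion"}),
                frozenset({"finance", "travel"}), frozenset({"mcp"})),
    AgentRecord("policy-agent", "2.0.0", "Example Corp", "https://agents.example.com/policy",
                "oauth2", frozenset({"policy-check"}),
                frozenset({"travel", "procurement"}), frozenset({"a2a"})),
]


def find(directory: list[AgentRecord], *, protocol: str,
         domain: Optional[str] = None, skill: Optional[str] = None) -> list[str]:
    """Return names of records that speak `protocol` and match the optional filters."""
    return [
        r.name for r in directory
        if protocol in r.protocols
        and (domain is None or domain in r.domains)
        and (skill is None or skill in r.skills)
    ]


print(find(DIRECTORY, protocol="a2a", domain="travel"))  # ['hotel-agent', 'policy-agent']
```

Because one record type covers both an A2A agent and an MCP server, the same query works across ecosystems, which is roughly the aspiration behind a broader schema.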

There is some natural overlap here. A2A already defines Agent Cards, while MCP has developed registry mechanisms for MCP servers. OASF seeks to provide a broader schema that can represent multiple ecosystems.

That overlap is not necessarily a flaw. Internet infrastructure has always contained overlapping layers and representations. The unanswered question is which descriptions and registries will become widely adopted and how they will interoperate without producing yet another set of translation gateways.⁠

3c. W3C AI Agent Protocol Community Group

The W3C AI Agent Protocol Community Group⁠ is exploring foundations for an Agentic Web, including agent identification, discovery, and collaboration. The phrase “W3C group” can give an initiative a greater sense of maturity than it actually has. W3C Community Groups are environments for incubation and collaboration. Their outputs are not automatically official W3C Recommendations.

The group is still significant because agent interoperability increasingly touches the architecture of the open web. Decisions about identity, discovery, delegation, privacy, and machine-readable capabilities should not be made exclusively inside individual AI platforms.

4. Trust and governance

Should an agent be allowed to act, and under whose authority?

Connectivity is not trust. An agent may be technically capable of calling a tool or contacting another agent even if it is not authorized to perform a particular action. It may also be compromised, manipulated by untrusted content, granted excessive privileges, or confused about the user’s intent. This is where the hardest problems begin.

A production agent ecosystem must establish at least:

which agent is acting;
which organization operates it;
which human or service it represents;
what authority has been delegated;
how narrowly that authority is scoped;
whether the request has been altered;
what evidence is retained;
who is accountable when something goes wrong.
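Several items on this list can be pictured as fields on a delegation record plus a check that runs before every action. The sketch below is a toy under stated assumptions: `Delegation`, `authorize`, and the scope strings are hypothetical, and it omits cryptographic identity, revocation, and tamper evidence. It only shows how scope, expiry, and retained evidence fit together.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class Delegation:
    """Hypothetical record of who delegated what to which agent."""
    agent_id: str           # which agent is acting
    operator: str           # which organization operates it
    principal: str          # which human or service it represents
    scopes: frozenset[str]  # what authority has been delegated
    expires_at: datetime    # when that authority lapses


AUDIT_LOG: list[dict[str, Any]] = []  # evidence retained for later review


def authorize(delegation: Delegation, *, agent_id: str, action: str, now: datetime) -> bool:
    """Allow the action only if it is within scope and unexpired; log every decision."""
    if agent_id != delegation.agent_id:
        reason = "agent mismatch"
    elif now >= delegation.expires_at:
        reason = "delegation expired"
    elif action not in delegation.scopes:
        reason = "action outside delegated scope"
    else:
        reason = "ok"
    AUDIT_LOG.append({
        "agent": agent_id, "principal": delegation.principal,
        "action": action, "decision": reason, "at": now.isoformat(),
    })
    return reason == "ok"


grant = Delegation(
    agent_id="travel-agent-7", operator="Example Corp", principal="user-123",
    scopes=frozenset({"flights:search", "flights:hold"}),
    expires_at=datetime(2026, 7, 1, tzinfo=timezone.utc),
)
now = datetime(2026, 6, 21, tzinfo=timezone.utc)
print(authorize(grant, agent_id="travel-agent-7", action="flights:search", now=now))    # True
print(authorize(grant, agent_id="travel-agent-7", action="flights:purchase", now=now))  # False
```

The refusal is logged as carefully as the approval, because accountability depends on being able to reconstruct both.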

No single initiative currently solves this entire trust stack.

4a. OWASP Agentic Security Initiative

The OWASP GenAI Security Project⁠ approaches the problem from the perspective of threats, secure engineering, and defensive guidance. Its Agentic Security Initiative has developed material addressing risks that become especially important when systems can plan and act, including:

agent goal hijacking;
tool misuse;
identity and privilege abuse;
memory poisoning;
insecure inter-agent communication;
cascading failures;
exploitation of misplaced trust;
rogue or compromised agents.

The OWASP Top 10 for Agentic Applications 2026 provides a prioritized framework for understanding these risks. OWASP also publishes practical guidance for designing and deploying secure agentic applications. OWASP does not certify that an agent is safe. It helps builders, security teams, and organizations understand what can go wrong and what controls they should consider.⁠

4b. AIUC-1

AIUC-1⁠ approaches trust from an assurance and certification perspective. It defines requirements across six broad areas:

security;
safety;
reliability;
accountability;
data and privacy;
societal considerations.

The standard combines organizational controls with technical evaluation and is intended to help enterprises evaluate or certify agentic systems.

AIUC-1 is sometimes called “SOC 2 for AI agents”. That phrase communicates the basic aspiration: an auditable signal that an agent meets defined controls. It should not be interpreted literally. AIUC-1 is not SOC 2 (System and Organization Controls), and the governance, issuing process, market history, and underlying assurance models differ.

It is also important to understand the institutional model: official AIUC-1 certificates are issued by the Artificial Intelligence Underwriting Company, working with accredited auditors and supporting providers.

AIUC-1 and OWASP are therefore complementary rather than interchangeable:

OWASP helps organizations identify threats and design defenses.
AIUC-1 defines auditable requirements and a certification mechanism.
Broader frameworks such as ISO/IEC 42001 address organizational AI-management systems.

A mature enterprise will likely need elements from all three categories: threat modeling, operational controls, and independent assurance.⁠

5. Domain transactions

How do agents perform consequential actions and manage domain-specific operations?

General interoperability protocols cannot encode every industry’s legal, operational, and economic requirements. Healthcare, banking, logistics, procurement, travel, and insurance all have domain-specific concepts and trust relationships. As agents enter these environments, specialized protocols will be needed above the general communication layer.

5a. Agent Payments Protocol (AP2)

The Agent Payments Protocol⁠, or AP2, focuses on commerce conducted by agents. I am referring here to v0.2 of the draft. Traditional online payments assume that a human is present at a website or application and intentionally presses a button to authorize a purchase. Autonomous agents break that assumption.

Suppose someone instructs an agent: “buy two economy-class tickets to Paris, provided the total price is below $2,000, the trip is refundable, and there is no overnight connection”. The user may no longer be present when a qualifying offer appears. A merchant, payment processor, or card issuer needs evidence that:

the user authorized the agent;
the proposed purchase satisfies the stated constraints;
the authority has not expired or been altered;
the resulting transaction can be audited;
responsibility can be assigned if something goes wrong.

In the current draft, AP2 introduces cryptographically verifiable mandates representing user intent and delegated authority. Its flows distinguish cases in which the human is present from cases in which an agent acts autonomously after receiving prior authorization.
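The Paris example can be sketched as a signed mandate plus a check that a merchant or processor might run when an offer appears. This is not AP2's data model or its cryptography: the field names are hypothetical, and an HMAC with a shared demo key stands in for the verifiable signatures a real system would use. It only illustrates the idea that altered, expired, or unsatisfied mandates should be refused.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone
from typing import Any

# Demo-only shared key. Assumption: a real deployment would use asymmetric
# signatures so that verifiers never hold signing material.
DEMO_KEY = b"demo-only-shared-key"


def sign_mandate(mandate: dict[str, Any], key: bytes = DEMO_KEY) -> str:
    """Sign a canonical JSON encoding of the mandate."""
    payload = json.dumps(mandate, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def check_purchase(mandate: dict[str, Any], signature: str,
                   offer: dict[str, Any], now: datetime) -> list[str]:
    """Return reasons to refuse; an empty list means the offer fits the mandate."""
    if not hmac.compare_digest(sign_mandate(mandate), signature):
        return ["mandate was altered or not signed with the expected key"]
    problems: list[str] = []
    if now >= datetime.fromisoformat(mandate["expires_at"]):
        problems.append("mandate expired")
    limits = mandate["constraints"]
    if offer["destination"] != limits["destination"] or offer["cabin"] != limits["cabin"]:
        problems.append("wrong destination or cabin")
    if offer["tickets"] != limits["tickets"]:
        problems.append("wrong number of tickets")
    if offer["total_usd"] >= limits["max_total_usd"]:
        problems.append("total is not below the price limit")
    if limits["refundable"] and not offer["refundable"]:
        problems.append("fare is not refundable")
    if limits["no_overnight_connection"] and offer["overnight_connection"]:
        problems.append("itinerary has an overnight connection")
    return problems


MANDATE = {
    "user": "user-123",
    "agent": "travel-agent-7",
    "expires_at": "2026-07-01T00:00:00+00:00",
    "constraints": {
        "destination": "Paris", "cabin": "economy", "tickets": 2,
        "max_total_usd": 2000, "refundable": True, "no_overnight_connection": True,
    },
}
SIGNATURE = sign_mandate(MANDATE)

offer = {"destination": "Paris", "cabin": "economy", "tickets": 2,
         "total_usd": 1850, "refundable": True, "overnight_connection": False}
print(check_purchase(MANDATE, SIGNATURE, offer, datetime(2026, 6, 21, tzinfo=timezone.utc)))  # []
```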

AP2 is designed as a specialized transactional layer. It can work alongside general infrastructure such as A2A and MCP, but it solves a different problem: proving that an agent is authorized to perform a particular commercial action.

This pattern will likely repeat elsewhere. The agent ecosystem may eventually contain protocols for insurance claims, medical consent, enterprise procurement, supply-chain commitments, and regulated financial instructions.⁠

6. Memory and state

What should an agent remember, who controls that memory, and how should memories be managed and shared?

Memory is often discussed as though it were merely a database feature, a product setting, or even just a Markdown file. In an interoperable agent ecosystem, it becomes a much broader architectural and governance problem. An agent may retain:

user preferences;
previous conversations;
decisions and commitments;
task history;
learned procedures;
summaries of external information;
relationships among people and organizations;
records of delegated authority.

Moving that information between systems raises difficult questions:

Is the memory a verbatim record, a model-generated summary, or a learned representation?
Who owns it?
Which agent wrote it?
What evidence supports it?
Can the user inspect, correct, export, or delete it?
How are conflicting memories reconciled?
Which parts may be disclosed to another agent?
How do retention policies and privacy laws apply?
Can malicious content poison long-term memory?
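A few of these questions translate directly into fields and operations. The sketch below is not any vendor's memory API or a proposed standard: `MemoryRecord`, `MemoryStore`, and their method names are hypothetical. It records what kind of memory an entry is, who owns it, which agent wrote it, and what evidence supports it, and it gives the owner export, deletion, and control over disclosure.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryKind(Enum):
    VERBATIM = "verbatim"  # an exact record
    SUMMARY = "summary"    # a model-generated summary
    LEARNED = "learned"    # a learned representation or procedure


@dataclass
class MemoryRecord:
    record_id: str
    owner: str        # who owns it
    written_by: str   # which agent wrote it
    kind: MemoryKind
    content: str
    evidence: list[str] = field(default_factory=list)  # what supports it
    shareable: bool = False  # may it be disclosed to another agent?


class MemoryStore:
    """Hypothetical store that keeps provenance next to every memory."""

    def __init__(self) -> None:
        self._records: dict[str, MemoryRecord] = {}

    def write(self, record: MemoryRecord) -> None:
        self._records[record.record_id] = record

    def export(self, owner: str) -> list[MemoryRecord]:
        """Everything held about one owner, for inspection or portability."""
        return [r for r in self._records.values() if r.owner == owner]

    def disclose(self, owner: str) -> list[str]:
        """Content another agent may see: only records the owner marked shareable."""
        return [r.content for r in self.export(owner) if r.shareable]

    def delete(self, owner: str, record_id: str) -> bool:
        """Delete a record, but only at its owner's request."""
        record = self._records.get(record_id)
        if record is None or record.owner != owner:
            return False
        del self._records[record_id]
        return True


store = MemoryStore()
store.write(MemoryRecord("m1", "user-123", "travel-agent-7", MemoryKind.VERBATIM,
                         "Prefers aisle seats", evidence=["chat-2026-06-01"], shareable=True))
store.write(MemoryRecord("m2", "user-123", "travel-agent-7", MemoryKind.SUMMARY,
                         "Usually travels for work"))
print(store.disclose("user-123"))  # ['Prefers aisle seats']
```

Reconciling conflicting memories and defending against poisoning are deliberately out of scope here; both need more than a data structure.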

Labs such as Anthropic, OpenAI, and DeepMind, along with startups and established companies, are already taking different approaches to memory across APIs, products, agents, sessions, and end-user interactions. The definitions, standards, and practices are still early, as examples such as Anthropic’s agent memory and the OpenAI Agents SDK’s Agent Memory show.

6a. W3C AI Agent Memory Interoperability Community Group

The W3C AI Agent Memory Interoperability Community Group was proposed in May 2026 and was active by June 2026, with the aim of exploring an open protocol-level specification for portable agent memory across vendors, models, frameworks, and tool ecosystems. This initiative is extremely early. It is the beginning of a standardization discussion, not an existing portability standard that applications can already depend upon. Its formation is still revealing.

Agent memory is becoming too important to remain an opaque, vendor-specific implementation detail. If personal and organizational agents are expected to persist for years, memory portability may become as consequential as data portability is for today’s cloud services.⁠

Closing comments

These projects are not all the same kind of thing, which is one reason the landscape feels confusing. Some define wire protocols. Others define schemas, browser APIs, registries, security guidance, certification requirements, or domain-specific authorization models.

They also differ in maturity, and “open” has several meanings. A specification may be publicly readable, openly governed, open-source, royalty-free, community-developed, or controlled by one organization while accepting outside participation. Those properties should not be treated as equivalent.

The emerging stack

Viewed together, these initiatives suggest a possible architecture:

An agent uses MCP or native APIs to access tools, data, and application capabilities.
It exposes its collaborative capabilities through A2A or another agent-interaction protocol.
Schemas and directories describe what the agent can do and help other systems find it.
Identity, authorization, and credential mechanisms establish who the agent represents and what it may do.
Security frameworks help builders defend the system, while assurance programs evaluate whether controls are operating.
Domain protocols add the rules required for payments and other consequential transactions.
Memory protocols may eventually let state move across agents without surrendering user control or provenance.
Observability systems record what happened across the entire chain.

The architecture is plausible. It is not yet complete. Important gaps remain around identity federation, revocation, policy negotiation, reputation, liability, audit semantics, observability, version compatibility, dispute resolution, and the boundary between human and agent consent.

An illustration of the current building blocks of the emerging AI agent protocol stack.

What would make this a real Internet of agents?

The original Internet succeeded because its protocols enabled independently operated networks and applications to interconnect without requiring a single company to control the entire system. An Internet of agents would need similar properties:

multiple interoperable implementations;
portable identities and capabilities;
decentralized or federated discovery;
explicit, revocable delegation;
secure operation across organizational boundaries;
meaningful user control;
observable and auditable actions;
graceful version evolution;
resistance to capture by a single platform.

The greatest risk is not that the industry fails to produce enough protocols. It is that it produces too many partially overlapping protocols while leaving identity, authority, and accountability underspecified.

Connecting agents is comparatively easy. Establishing whether an agent should be trusted, what it is authorized to do, and who bears responsibility for its actions is much harder.

That is the transition now underway. We are not merely teaching chatbots to call more APIs. We are beginning to define the technical, economic, and institutional rules through which software agents may act as participants in the digital world. The Internet of agents does not yet exist as a unified system, but its layers are beginning to take shape.

Lucas Müller Notes
