<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  

<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-cui-ai-agent-task-01"
  ipr="trust200902"
  obsoletes=""
  updates=""
  submissionType="IETF"
  xml:lang="en"
  version="3">

  <front>
    <title abbrev="Agent Task Coordination">Task-oriented Coordination Requirements for AI Agent Protocols</title>
    <seriesInfo name="Internet-Draft" value="draft-cui-ai-agent-task-01"/>

    <author initials="Y." surname="Cui" fullname="Yong Cui">
      <organization>Tsinghua University</organization>
      <address>
        <postal>
          <region>Beijing</region>
          <code>100084</code>
          <country>China</country>
        </postal>
        <email>cuiyong@tsinghua.edu.cn</email>
        <uri>http://www.cuiyong.net/</uri>
      </address>
    </author>
    
    <author initials="C." surname="Du" fullname="Chenguang Du">
      <organization>Zhongguancun Laboratory</organization>
      <address>
        <postal>
          <region>Beijing</region>
          <code>100094</code>
          <country>China</country>
        </postal>
        <email>ducg@zgclab.edu.cn</email>
      </address>
    </author>
   
    <date year="2026" month="July" day="3"/>

    <area>General</area>
    <workgroup>Network Working Group</workgroup>
    
    <keyword>AI agent</keyword>
    <keyword>Agent communication</keyword>
    <keyword>Task coordination</keyword>

    <abstract>
      <t>AI agent communication requires intelligent task-level coordination to manage dynamic workloads across large-scale, heterogeneous networking environments. This document proposes general requirements for an agent protocol to enable autonomous task coordination at scale, including dynamic task discovery, negotiation, and context-aware scheduling with real-time adaptability.</t>
    </abstract>
 
  </front>

  <middle>
    

    <section>
      <name>Introduction</name>
      <section>
        <name>Purpose</name>
        <t>With the rapid advancement of AI technologies and their applications, AI agents that use Large Language Models (LLMs) have emerged as a pivotal direction in global technological evolution and market development. Single-agent systems exhibit inherent limitations when addressing complex tasks in dynamic environments. As a result, efficient multi-agent collaboration for complex task completion has garnered increasing attention, and task-oriented coordination constitutes a critical component of standardized multi-agent systems.</t>
        <t>This document examines the requirements and operations for standardizing AI Agent protocols to support task coordination in multi-agent systems.</t>
      </section>
      
      <section>
        <name>Terminology</name>
        <dl newline="true">
          <dt>Task:</dt>
          <dd>As defined in ISO/IEC 22989, a task is the set of actions required to achieve a specific goal. These actions can be physical or cognitive, for instance computing or creating predictions, translations, synthetic data, or artefacts, or navigating through a physical space.</dd>
          <dt>Shared Message Pool:</dt>
          <dd>A pool where agents publish structured messages and subscribe to relevant messages based on their profiles.</dd>
          <dt>Coordinator Agent:</dt>
          <dd>An agent that receives tasks and decomposes or distributes tasks to other agents.</dd>
          <dt>Execution Agent:</dt>
          <dd>An agent responsible for executing tasks distributed by the Coordinator Agent.</dd>
          <dt>Agent Communication Server:</dt>
          <dd>A server that enables agents to communicate and collaborate with each other by providing functions such as session management and message routing.</dd>
          <dt>Normative Language:</dt>
          <dd>The key words "MUST", "REQUIRED", "SHOULD", "MAY", and "OPTIONAL" in this document are to be interpreted as described in <xref target="RFC2119"/>.</dd>
        </dl>
      </section>
    <section>
    <name>Task Coordination Framework</name>
<figure title="Task Coordination Framework" anchor="fig-frame"><artwork><![CDATA[
                                                          +---------+
                                     +------------------->| Agent X |
                                     |2.Task1 distributed +---------+
                                     |                        |     
                                     |                        |  
                                     |                        |
  +---------+  1.Task submitted  +---------+ 3.Task1 completed|
  |   Task  |------------------->|         |<-----------------+      
  | Invoker |<------------------ | Agent A |<-----------------+
  +---------+  4.Task completed  +---------+ 3.Task2 completed|
                                     |                        |     
                                     |                        |  
                                     |                        |
                                     |2.Task2 distributed +---------+
                                     +------------------->| Agent Y |
                                                          +---------+
                                                 
]]></artwork></figure>
    <t>The system operates as follows. When a task invoker submits a task to Agent A (the Coordinator Agent), Agent A SHOULD use a standard discovery mechanism to find agents that have the required capabilities (Agent X and Agent Y in the figure) and assign tasks to them. Upon receiving completion notifications from both agents, Agent A aggregates the results and delivers the final artifact to the task invoker.</t>
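    <t>The distribute-and-aggregate flow in the figure can be sketched as follows. This is a minimal illustration only: the ExecutionAgent type, the discover() and coordinate() functions, and the capability strings are hypothetical names chosen for this example, and the registry lookup merely stands in for whatever discovery mechanism a deployment actually uses.</t>
    <sourcecode type="python"><![CDATA[
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout: this document does not define a discovery
# API, a capability vocabulary, or a result format.


@dataclass
class ExecutionAgent:
    name: str
    capabilities: List[str]
    handler: Callable[[str], str]  # runs a task and returns its result


def discover(registry: List[ExecutionAgent], capability: str) -> ExecutionAgent:
    """Stand-in for discovery: return the first agent with the capability."""
    for agent in registry:
        if capability in agent.capabilities:
            return agent
    raise LookupError(f"no agent offers capability {capability!r}")


def coordinate(registry: List[ExecutionAgent],
               tasks: Dict[str, str]) -> Dict[str, str]:
    """Steps 2 and 3 of the figure: distribute each task, collect results.

    `tasks` maps a required capability to the task text.
    """
    results = {}
    for capability, task in tasks.items():
        agent = discover(registry, capability)
        results[agent.name] = agent.handler(task)
    return results  # step 4: aggregated artifact for the task invoker


registry = [
    ExecutionAgent("agent-x", ["translate"], lambda t: f"translated({t})"),
    ExecutionAgent("agent-y", ["summarize"], lambda t: f"summary({t})"),
]
artifact = coordinate(registry, {"translate": "Task1", "summarize": "Task2"})
print(artifact)
]]></sourcecode>
    <t>The sketch runs the two tasks sequentially for brevity; a real Coordinator Agent would typically dispatch them concurrently and wait for both completion notifications.</t>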
    </section>
    </section>
      
    <section>
      <name>Use Cases</name>
      <t>The following are typical use cases in which multiple agents work together to complete tasks:</t>
      <ul spacing="normal">
        <li>High-throughput tasks: Many tasks have high bandwidth requirements. For example, consider a collaborative framework that coordinates heterogeneous embodied agents (specifically robot dogs and drones) over a wide-area public network. The drone is assigned wide-area surveillance and task delegation, while functionally specialized robot dogs perform ground-level operations such as video surveillance, material transport, and obstacle clearance.</li>
        <li>Low-latency tasks: In collaborative multi-agent systems, control signal transmission tasks impose significantly more stringent latency requirements than routine transfer of model training data. For example, a home robot remotely sends an alarm message to the end user.</li>
        <li>High-reliability tasks: Smart factory scenarios require highly reliable agent task execution and stable, fault-tolerant operation.</li>
      </ul>
      <t>These categories of use cases, which may be further extended, demonstrate collaboration among agents spanning multiple distinct domains to achieve end-to-end task completion. Embodied agents (such as robots and unmanned aerial vehicles) interact with physical environments through embodied interfaces, while virtual agents (such as software applications and personal assistants) provide complementary capabilities. Together they demonstrate the advantages of collaboratively completing complex tasks in various scenarios.</t>
    </section>
      
    <section>
      <name>Necessity</name>
      <section>
        <name>Task Complexity</name>
        <t>As task complexity increases, heterogeneous agents require multiple interaction rounds, precise planning, ordered execution, and efficient context sharing mechanisms to enhance resolution quality and robustness.</t>
      </section>
      <section>
        <name>Resource Optimization</name>
        <t>Through task coordination and resource consumption monitoring, multi-agent systems can dynamically allocate resources such as computing, storage, and bandwidth to optimize resource utilization.</t>
      </section>
      <section>
        <name>Quality of Service</name>
        <t>Task coordination may dynamically prioritize resource allocation based on, for example, task priorities, agent expertise, and Quality of Service requirements. This ensures the timeliness and accuracy of critical tasks, reduces service response latency, and maintains output stability and reliability.</t>
      </section>
      <section>
        <name>Dynamic Adjustment</name>
        <t>Agents may update or adjust a task during the task execution phase, based on the end user's inputs or contextual updates, to better meet the final task requirements.</t>
      </section>
    </section>
    
    <section>
      <name>Protocol Requirements</name>
      <section>
        <name>Task Description</name>
        <t>Precise task descriptions or task templates are REQUIRED to ensure all agents maintain a consistent understanding of the objectives, operational constraints and criteria.</t>
        <t>A well-defined task description:</t>
        <ul spacing="normal">
          <li>Reduces ambiguity: Minimizes misinterpretations and conflicting actions among agents.</li>
          <li>Enables verifiability: Translates abstract goals into executable and measurable plans.</li>
          <li>Improves robustness: Ensures collaboration remains coherent and efficient under dynamic conditions.</li>
        </ul>
        <t>The AI Agent protocol should support a structured task description format. A complete task description may include the following components:</t>
        <ul spacing="normal">
          <li>Metadata: task identifier, task type (e.g., execution, monitoring), task priority, creation time, expected completion time, task initiator, etc.</li>
          <li>Task objective: the objective described in natural language.</li>
          <li>Task execution constraints: the constraints include resource limitations, safety requirements, etc.</li>
          <li>Task artifact specifications: the expected output format and the desired artifact list.</li>
        </ul>
        <t>Task descriptions assigned to different agents MUST follow the minimization principle: each agent receives only the minimal, contextually necessary information required to fulfill its task, which prevents unauthorized access to sensitive information.</t>
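        <t>As an illustration of the components and the minimization principle described in this section, a structured task description might be modeled as shown below. All field names, the minimized_view() helper, and the allow-list approach are assumptions made for this sketch; this document does not define a concrete schema or encoding.</t>
        <sourcecode type="python"><![CDATA[
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class TaskDescription:
    # Metadata (an illustrative subset; names are not defined by this document)
    task_id: str
    task_type: str   # e.g. "execution" or "monitoring"
    priority: int
    initiator: str
    # Task objective, described in natural language
    objective: str
    # Execution constraints, e.g. resource limits or safety requirements
    constraints: Dict[str, str] = field(default_factory=dict)
    # Artifact specification: the expected outputs
    artifacts: List[str] = field(default_factory=list)


def minimized_view(task: TaskDescription, allowed: Set[str]) -> Dict[str, Any]:
    """Keep only the allow-listed fields for one receiving agent.

    One possible reading of the minimization principle: the sender
    decides per agent which fields are contextually necessary.
    """
    return {k: v for k, v in asdict(task).items() if k in allowed}


task = TaskDescription(
    task_id="t-1",
    task_type="execution",
    priority=5,
    initiator="user-42",
    objective="Book a hotel near the venue",
    constraints={"budget": "200 EUR per night"},
    artifacts=["booking confirmation"],
)
view = minimized_view(task, {"task_id", "objective", "constraints", "artifacts"})
print(sorted(view))
]]></sourcecode>
        <t>In this example the receiving agent never sees the initiator or the priority, because neither is needed to carry out the booking.</t>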
      </section>
      <section>
        <name>Task State</name>
        <t>The AI Agent protocol design should support comprehensive state descriptions throughout the task execution lifecycle. Potential task states include submitted, running, suspended (awaiting external input or output from other agents), completed, canceled, rejected, and failed. The protocol should also support management operations such as state queries, retrieval, and the pushing of intermediate results.</t>
        <t>Based on the time needed to complete them, tasks can be categorized into short-term tasks, which require a single request-response interaction, and long-term tasks, which may require multi-round interactions or extended waiting periods. For long-term tasks, the reporting and querying of intermediate states (e.g., progress percentage, checkpoints) should be supported. The Coordinator Agent may dynamically adjust the target of a task according to intermediate results from the Execution Agents and the context information. The AI Agent protocol design should support management of both long-term and short-term tasks.</t>
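        <t>The task states named in this section can be sketched as a small state machine. The state names come from the text above, but the transition table is purely an assumption of this example: this document lists the states and does not specify which transitions are legal.</t>
        <sourcecode type="python"><![CDATA[
from enum import Enum


class TaskState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    SUSPENDED = "suspended"  # awaiting external input or other agents' output
    COMPLETED = "completed"
    CANCELED = "canceled"
    REJECTED = "rejected"
    FAILED = "failed"


# Assumed transition table (not defined by this document). States that do
# not appear as keys are treated as terminal.
TRANSITIONS = {
    TaskState.SUBMITTED: {TaskState.RUNNING, TaskState.REJECTED,
                          TaskState.CANCELED},
    TaskState.RUNNING: {TaskState.SUSPENDED, TaskState.COMPLETED,
                        TaskState.FAILED, TaskState.CANCELED},
    TaskState.SUSPENDED: {TaskState.RUNNING, TaskState.FAILED,
                          TaskState.CANCELED},
}


def is_terminal(state: TaskState) -> bool:
    return state not in TRANSITIONS


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the move is not in the table."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(
            f"illegal transition {current.value} -> {target.value}")
    return target


# A long-term task that pauses once for external input.
state = TaskState.SUBMITTED
for step in (TaskState.RUNNING, TaskState.SUSPENDED,
             TaskState.RUNNING, TaskState.COMPLETED):
    state = transition(state, step)
print(state.value, is_terminal(state))
]]></sourcecode>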
      </section>
      <section>
        <name>Context Sharing</name>
        <t>When delegating tasks to Execution Agents, the Coordinator Agent may include task-relevant context to facilitate task execution, such as the contact information of the end user, information about the task itself, historical preference information known to the Coordinator Agent, and other necessary conversation data. For example, in a trip planning case, this may encompass historically booked flight or hotel preferences, or dynamically perceived context such as recent user dialog. The AI Agent protocol design should consequently support context sharing mechanisms through standardized definitions of context types, length constraints, and encoding formats to enhance the effectiveness of task execution.</t>
        <t>Context sharing can affect the privacy of the user. It is therefore necessary to limit the scope of context sharing, especially for sensitive information such as the name, age, and address of the user.</t>
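        <t>A minimal sketch of how a Coordinator Agent might prepare context before sharing it is given below. The byte limit, the list of sensitive keys, the JSON encoding, and the build_context() helper are all assumptions made for illustration; this document does not fix any of these values.</t>
        <sourcecode type="python"><![CDATA[
import json
from typing import Dict

# Assumed policy values; this document does not define a limit or a field list.
MAX_CONTEXT_BYTES = 4096
SENSITIVE_KEYS = {"name", "age", "address"}


def build_context(raw: Dict[str, str],
                  max_bytes: int = MAX_CONTEXT_BYTES) -> bytes:
    """Drop sensitive keys, encode as UTF-8 JSON, and enforce a length limit."""
    shared = {k: v for k, v in raw.items() if k not in SENSITIVE_KEYS}
    encoded = json.dumps(shared, sort_keys=True).encode("utf-8")
    if len(encoded) > max_bytes:
        raise ValueError(
            f"context is {len(encoded)} bytes, limit is {max_bytes}")
    return encoded


context = build_context({
    "name": "Alice",
    "hotel_preference": "non-smoking, near city center",
    "recent_dialog": "prefers morning flights",
})
print(context.decode("utf-8"))
]]></sourcecode>
        <t>Rejecting oversized context, rather than silently truncating it, is a design choice of this sketch; truncation could cut an encoded value in half and leave the receiver with unparseable data.</t>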
      </section>
      <section>
        <name>Exception Handling</name>
        <t>Exception handling constitutes a critical mechanism for multi-agent collaborative task execution. If an Execution Agent cannot complete an assigned task, for example because it lacks the required skills or is overloaded, the task execution failure may trigger follow-up actions such as releasing the associated connections.</t>
      </section>
    </section>

    <section>
      <name>The Requirements on the Agent Session Protocol</name>
      <t>The AI Agent protocol should consider separating the transmission of task control messages (such as task creation, task update, task status query, and task result notification) from the transmission of real-time multimodal context (such as task-related voice, video, and images). These two types of messages may require different transmission channels. The AI Agent protocol can be built on top of the Agent session protocol and use it to exchange task control messages and both real-time and non-real-time multimodal context between AI agents with low latency.</t>
      <t>The AI Agent session protocol should provide mechanisms beyond simple request/response, including more complex interaction modes such as message multicast, publication/subscription (PUB/SUB), and asynchronous notifications.</t>
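      <t>To make the publication/subscription mode concrete, the following sketch shows an in-memory message pool. The MessagePool class, its method names, and the topic string are hypothetical and exist only for this example. Delivery here is synchronous and in-process, whereas a real session protocol would deliver over the network and asynchronously.</t>
      <sourcecode type="python"><![CDATA[
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str], None]


class MessagePool:
    """In-memory stand-in for publish/subscribe delivery (hypothetical API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: str) -> int:
        """Deliver to every subscriber of `topic`; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


pool = MessagePool()
received: List[str] = []
pool.subscribe("task.status", received.append)
pool.subscribe("task.status", lambda m: received.append(m.upper()))
delivered = pool.publish("task.status", "task t-1 running")
print(delivered, received)
]]></sourcecode>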
    </section>

    <section>
      <name>Task Flow</name>
      <section>
        <name>Task Operation</name>
        <t>After receiving the target agent list from the discovery result, the Coordinator Agent may select one Execution Agent and send the task cooperation message to that agent. The Coordinator Agent may delegate agent identifier lookup to the Agent Communication Server, which is then responsible for routing the message according to the agent identifier.</t>
        <t>The Agent Communication Server MAY also provide task state management according to service requirements, for example, retry/re-entry after task failure.</t>
        <figure title="Agent Communication Flow in the Same Domain" anchor="fig-task-flow-same-domain"><artwork><![CDATA[
+-----+                   +-------------+                   +-----+
|     |-(A)task request-->|             |                   |     |
| AI  |                   |    Agent    |-(B)task request-->| AI  |
|Agent|                   |Communication|                   |Agent|
|  A  |                   |   Server    |<-(C)task response-|  B  |
|     |<-(D)task response-|             |                   |     |
+-----+                   +-------------+                   +-----+
]]></artwork></figure>
        <t><xref target="fig-task-flow-same-domain"/> shows the abstract task cooperation procedure between agent A and agent B in the same domain:</t>
        <dl newline="true">
          <dt>(A)</dt>
          <dd>The AI Agent A sends a task request to AI Agent B via the Agent Communication Server.</dd>
          <dt>(B)</dt>
          <dd>The Agent Communication Server verifies the request message and routes the request to AI Agent B.</dd>
          <dt>(C)</dt>
          <dd>The AI Agent B receives the task request and sends a response to the Agent Communication Server.</dd>
          <dt>(D)</dt>
          <dd>The Agent Communication Server transfers the response to AI Agent A.</dd>
        </dl>
        <figure title="Agent Communication Flow across Domains" anchor="fig-task-flow-cross-domain"><artwork><![CDATA[
+------+                   +-------------+
|      |-(A)task request-->|             |
|  AI  |                   |    Agent    |
| Agent|                   |Communication|
|   A  |                   |   Server X  |
|      |<-(F)task response-|             |
+------+                   +-------------+
                           |       ^
                           |       |
                (B)task request    (E)task response
                           |       |
                           v       |
+------+                   +-------------+
|      |<-(C)task request--|             |
|  AI  |                   |    Agent    |
| Agent|                   |Communication|
|   B  |                   |   Server Y  |
|      |-(D)task response->|             |
+------+                   +-------------+
]]></artwork></figure>
        <t><xref target="fig-task-flow-cross-domain"/> shows the abstract task cooperation procedure between agent A and agent B across domains:</t>
        <dl newline="true">
          <dt>(A)</dt>
          <dd>The AI Agent A sends a task request to AI Agent B via Agent Communication Server X.</dd>
          <dt>(B)</dt>
          <dd>Agent Communication Server X verifies the request, parses agent B's identifier, obtains the routing address of Agent Communication Server Y, establishes a message channel with Server Y, and routes the request to Server Y.</dd>
          <dt>(C)</dt>
          <dd>Agent Communication Server Y receives and verifies the task request message, then routes the request to agent B.</dd>
          <dt>(D)</dt>
          <dd>AI Agent B receives the task request and sends a response to Agent Communication Server Y.</dd>
          <dt>(E)</dt>
          <dd>Agent Communication Server Y transfers the response to Agent Communication Server X.</dd>
          <dt>(F)</dt>
          <dd>Agent Communication Server X transfers the response to AI Agent A.</dd>
        </dl>
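        <t>The routing behavior shown in <xref target="fig-task-flow-same-domain"/> and <xref target="fig-task-flow-cross-domain"/> can be sketched as follows. The sketch assumes agent identifiers of the form "agent-name@domain" and a statically configured peer table; neither is defined by this document. Message verification and channel establishment are omitted.</t>
        <sourcecode type="python"><![CDATA[
from typing import Callable, Dict

# Assumption: agent identifiers look like "agent-name@domain". This document
# does not define an identifier syntax.
Handler = Callable[[str], str]


class AgentCommunicationServer:
    def __init__(self, domain: str) -> None:
        self.domain = domain
        self.local_agents: Dict[str, Handler] = {}
        self.peers: Dict[str, "AgentCommunicationServer"] = {}

    def register(self, agent_id: str, handler: Handler) -> None:
        self.local_agents[agent_id] = handler

    def route(self, target_id: str, request: str) -> str:
        """Deliver locally, or forward to the server of the target domain."""
        domain = target_id.rpartition("@")[2]
        if domain == self.domain:
            handler = self.local_agents.get(target_id)
            if handler is None:
                raise LookupError(f"unknown agent {target_id!r}")
            return handler(request)  # deliver the request, return the response
        peer = self.peers.get(domain)
        if peer is None:
            raise LookupError(f"no route to domain {domain!r}")
        return peer.route(target_id, request)  # cross-domain forwarding


server_x = AgentCommunicationServer("x.example")
server_y = AgentCommunicationServer("y.example")
server_x.peers["y.example"] = server_y
server_y.register("agent-b@y.example", lambda req: f"done: {req}")

# Agent A submits the request to Server X and receives the response from it.
response = server_x.route("agent-b@y.example", "task request")
print(response)
]]></sourcecode>
        <t>When the target agent is registered on the same server, route() delivers the request directly, which corresponds to the same-domain flow.</t>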
      </section>
      <section>
        <name>Task Parameters</name>
        <t>The following parameters are included in the agent task cooperation request:</t>
        <dl newline="true">
          <dt>Message Name</dt>
          <dd>REQUIRED. It specifies the message name.</dd>
          <dt>Agent Identifier</dt>
          <dd>REQUIRED. It specifies the identifier of the Execution Agent.</dd>
          <dt>Task Description</dt>
          <dd>REQUIRED. It specifies the task description that needs to be completed by the Execution Agent, as defined in Section 4.1.</dd>
          <dt>Input</dt>
          <dd>REQUIRED. It specifies the task input parameters; the input can be text, file, image, and similar modalities.</dd>
          <dt>Context</dt>
          <dd>OPTIONAL. It specifies context information as defined in Section 4.3.</dd>
        </dl>
        <t>The following parameters are included in the agent task cooperation response:</t>
        <dl newline="true">
          <dt>Task State</dt>
          <dd>REQUIRED. It specifies the task state as defined in Section 4.2.</dd>
          <dt>Output</dt>
          <dd>REQUIRED. It specifies the output result information of the task; the output can be text, file, image, and similar modalities.</dd>
        </dl>
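        <t>As a rough illustration, a receiver could check the REQUIRED parameters listed above before processing a message. The snake_case key names, the dictionary representation, and the missing_parameters() helper are assumptions of this sketch; this document names the parameters but does not define a wire format.</t>
        <sourcecode type="python"><![CDATA[
from typing import Any, Dict, List, Sequence

# Key names are illustrative; only the parameter names come from the text.
REQUEST_REQUIRED = ("message_name", "agent_identifier",
                    "task_description", "input")
REQUEST_OPTIONAL = ("context",)
RESPONSE_REQUIRED = ("task_state", "output")


def missing_parameters(message: Dict[str, Any],
                       required: Sequence[str]) -> List[str]:
    """Return the REQUIRED parameter names that are absent from `message`."""
    return [name for name in required if name not in message]


request = {
    "message_name": "task_request",
    "agent_identifier": "agent-b",
    "task_description": {"objective": "Summarize the attached report"},
    "input": "report text",
}
incomplete_response = {"task_state": "completed"}

print(missing_parameters(request, REQUEST_REQUIRED))
print(missing_parameters(incomplete_response, RESPONSE_REQUIRED))
]]></sourcecode>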
      </section>
    </section>

    <section>
      <name>Conclusions</name>
      <t>Task-oriented coordination constitutes a critical function for multi-agent collaboration. This document discusses the necessity of introducing task-oriented coordination to address complex tasks, optimize resource utilization, and guarantee service quality. Consequently, it analyzes the requirements imposed by task-oriented coordination on AI Agent protocol design, specifically concerning task descriptions, task states, communication mechanisms, context sharing, and exception handling. Finally, it introduces the task flow and parameters for task cooperation between Agents.</t>
    </section>
    
    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>This memo includes no request to IANA.</t>
    </section>
    
    <section anchor="Security">
      <name>Security Considerations</name>
      <t>When designing task-oriented coordination for AI agent communication, privacy should always be considered.</t>
    </section>
    
  </middle>

  <back>
    <references anchor="sec-combined-references">
      <name>References</name>
      <references anchor="sec-normative-references">
        <name>Normative References</name>
        <reference anchor="RFC2119">
          <front>
            <title>Key words for use in RFCs to Indicate Requirement Levels</title>
            <author fullname="S. Bradner" initials="S." surname="Bradner"/>
            <date month="March" year="1997"/>
            <abstract>
              <t>In many standards track documents several words are used to signify the requirements in the specification. These words are often capitalized. This document defines these words as they should be interpreted in IETF documents. This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.</t>
            </abstract>
          </front>
          <seriesInfo name="BCP" value="14"/>
          <seriesInfo name="RFC" value="2119"/>
          <seriesInfo name="DOI" value="10.17487/RFC2119"/>
        </reference>
      </references>
      <references anchor="sec-informative-references">
        <name>Informative References</name>
        <reference anchor="Multi-Agents">
          <front>
            <title>Large Language Model based Multi-Agents: A Survey of Progress and Challenges</title>
            <author initials="T." surname="Guo">               </author>
            <author initials="X." surname="Chen">              </author>
            <author initials="Y." surname="Wang">              </author>
            <author initials="R." surname="Chang">            </author>
            <author initials="S." surname="Pei">               </author>
            <author initials="N.V." surname="Chawla">          </author>
            <author initials="O." surname="Wiest">             </author>
            <author initials="X." surname="Zhang">             </author>
            <date year="2024"/>
          </front>
        </reference>      
        <reference anchor="MetaGPT">
          <front>
            <title>MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework</title>
            <author initials="S." surname="Hong">               </author>
            <author initials="M." surname="Zhuge">              </author>
            <author initials="J." surname="Chen">               </author>
            <author initials="X." surname="Zheng">              </author>
            <author initials="Y." surname="Cheng">              </author>
            <author initials="C." surname="Zhang">              </author>
            <author initials="J." surname="Wang">               </author>
            <author initials="Z." surname="Wang">               </author>
            <author initials="S." surname="Yau">                </author>
            <author initials="Z." surname="Lin">                </author>
            <author initials="L." surname="Zhou">               </author>
            <author initials="C." surname="Ran">                </author>
            <author initials="L." surname="Xiao">               </author>
            <author initials="C." surname="Wu">                 </author>
            <author initials="J." surname="Schmidhuber">        </author>
            <date year="2023"/>
          </front>
        </reference>    
      </references>
    </references>
    
 </back>
</rfc>
