Tool Failure Crashes Entire ADK Multi-Agent Workflow


### Discussed in https://github.com/google/adk-python/discussions/795

<div type='discussions-op-text'>

<sup>Originally posted by **nikolaidk** May 15, 2025</sup>
Maintainer's comment: we'd like to seek opinions on this topic from the community.

Check out https://github.com/google/adk-python/discussions/795#discussioncomment-13201633 for the poll and share your opinion.

---

Original content

# MCP Tool Failure Crashes Entire ADK Multi-Agent Workflow

When an MCP tool fails during execution (not connection), it propagates as an unhandled exception that crashes the entire ADK agent workflow, stopping all subsequent agents in a SequentialAgent pipeline.

## Environment

- ADK Version: Latest (using `google.adk.agents`, `google.adk.tools.mcp_tool`)
- Python Version: 3.12
- MCP Library Version: Latest compatible with current ADK implementation
- Operating System: Linux 5.15

## Problem Description

While ADK provides good error handling for MCP server connection failures, runtime MCP tool failures (like "Resource not found") propagate as unhandled `McpError` exceptions that crash the entire multi-agent workflow.

## Expected Behavior

- Individual MCP tool failures should not crash the entire agent workflow
- Agents should be able to handle tool failures gracefully and continue execution
- Sequential agents should continue to subsequent agents even if one tool fails
- The framework should provide built-in resilience mechanisms for MCP tool failures

## Actual Behavior

- Single MCP tool failure crashes the entire `SequentialAgent` workflow
- No opportunity for graceful degradation or alternative approaches
- Complete loss of partial results from successful agents
- Workflow stops executing without running subsequent agents

## Steps to Reproduce

1. Create a multi-agent workflow using `SequentialAgent`
2. Include an MCP tool that may fail (e.g., GitHub file access with invalid path)
3. Configure the agent to use the MCP tool
4. Run the workflow with inputs that will cause the MCP tool to fail
## Minimal Reproducible Example

```python
import asyncio
from google.adk.agents import SequentialAgent, LlmAgent
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters

async def create_failing_workflow():
    # Set up GitHub MCP tools (spawns the MCP server over stdio)
    git_tools, git_exit_stack = await MCPToolset.from_server(
        connection_params=StdioServerParameters(
            command='npx',
            args=["-y", "@modelcontextprotocol/server-github"],
            env={"GITHUB_PERSONAL_ACCESS_TOKEN": "your_token"}
        )
    )

    # Create agent that will use the failing MCP tool
    failing_agent = LlmAgent(
        name="FailingAgent",
        model="gemini-2.5-pro-preview-05-06",
        instruction="Try to access a non-existent file from the repository",
        tools=git_tools
    )

    # Placeholder downstream agents. The original report references them
    # without defining them; these definitions are illustrative only.
    other_agent1 = LlmAgent(
        name="OtherAgent1",
        model="gemini-2.5-pro-preview-05-06",
        instruction="Summarize the previous agent's output",
    )
    other_agent2 = LlmAgent(
        name="OtherAgent2",
        model="gemini-2.5-pro-preview-05-06",
        instruction="Write a final report",
    )

    # Create workflow with subsequent agents
    workflow = SequentialAgent(
        name="TestWorkflow",
        sub_agents=[failing_agent, other_agent1, other_agent2]
    )

    # The caller is responsible for closing git_exit_stack when done
    return workflow, git_exit_stack

# Run with input that causes MCP tool to fail
# Result: Entire workflow crashes, other_agent1 and other_agent2 never execute
```
## Error Log

```
mcp.shared.exceptions.McpError: Not Found: Resource not found: Not Found
  File "/home/nmr/.venv/lib/python3.12/site-packages/google/adk/tools/mcp_tool/mcp_tool.py", line 126, in run_async
    raise e
  File "/home/nmr/.venv/lib/python3.12/site-packages/google/adk/tools/mcp_tool/mcp_tool.py", line 122, in run_async
    response = await self.mcp_session.call_tool(self.name, arguments=args)
  File "/home/nmr/.venv/lib/python3.12/site-packages/mcp/client/session.py", line 265, in call_tool
    return await self.send_request(
  File "/home/nmr/.venv/lib/python3.12/site-packages/mcp/shared/session.py", line 273, in send_request
    raise McpError(response_or_error.error)
```
## Current Workarounds

- **Agent Instruction Level:** Explicitly instruct agents to handle tool failures
- **Wrapper Functions:** Create wrapper tools with try/except logic
- **Alternative Agent Patterns:** Use custom agents instead of `SequentialAgent`
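
The wrapper-function workaround could look roughly like the sketch below. It is plain Python with no ADK dependency: `safe_tool` and `get_file_contents` are hypothetical names introduced for illustration, and the shape of the error dictionary is an assumption, not an ADK convention. The idea is only that the exception is caught inside the tool and handed back to the model as data.

```python
import asyncio
import functools
from typing import Any, Awaitable, Callable


def safe_tool(fn: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
    """Wrap an async tool function so exceptions become structured results.

    Hypothetical helper, not part of ADK.
    """

    @functools.wraps(fn)
    async def wrapper(*args: Any, **kwargs: Any) -> Any:
        try:
            return await fn(*args, **kwargs)
        except Exception as exc:  # Broad on purpose: any tool failure becomes data.
            return {
                "status": "error",
                "tool_name": fn.__name__,
                "error_type": type(exc).__name__,
                "error_message": str(exc),
            }

    return wrapper


@safe_tool
async def get_file_contents(path: str) -> dict:
    """Hypothetical stand-in for a tool call that fails at runtime."""
    raise FileNotFoundError(f"Resource not found: {path}")


# The failure comes back as a normal return value instead of an exception.
result = asyncio.run(get_file_contents("docs/missing.md"))
print(result["status"], result["error_type"])
```

Because the wrapped function keeps its name and docstring via `functools.wraps`, it should be usable wherever a plain function tool is accepted, though that has not been verified against a specific ADK release.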
## Suggested Solutions

### 1. Framework-Level Error Handling

Add built-in error handling in `MCPTool.run_async()`:

```python
from mcp.shared.exceptions import McpError

# Proposed change to MCPTool.run_async: return failures as data instead of raising
async def run_async(self, args, tool_context):
    try:
        response = await self.mcp_session.call_tool(self.name, arguments=args)
        return response
    except McpError as e:
        # Convert to tool result with error information
        return {
            "error": True,
            "error_type": "mcp_tool_failure",
            "error_message": str(e),
            "tool_name": self.name,
            "suggestions": ["Try alternative tools", "Check connectivity"]
        }
```
### 2. SequentialAgent Resilience

Modify `SequentialAgent` to continue execution despite sub-agent failures:

```python
# Add option for fault-tolerant execution
workflow = SequentialAgent(
    name="FaultTolerantWorkflow",
    sub_agents=[agent1, agent2, agent3],
    continue_on_failure=True,  # New parameter
    collect_partial_results=True  # New parameter
)
```
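
To make the requested semantics concrete, here is a minimal sketch of continue-on-failure sequencing in plain `asyncio`. It does not use or modify `SequentialAgent`; `run_sequentially`, `RunReport`, and the step functions are hypothetical names, and each "agent" is reduced to an async callable for illustration.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Awaitable, Callable, Dict

Step = Callable[[], Awaitable[Any]]


@dataclass
class RunReport:
    """Partial results and failures collected across all steps."""
    results: Dict[str, Any] = field(default_factory=dict)
    failures: Dict[str, str] = field(default_factory=dict)


async def run_sequentially(steps: Dict[str, Step],
                           continue_on_failure: bool = True) -> RunReport:
    """Run steps in order, recording failures instead of propagating them."""
    report = RunReport()
    for name, step in steps.items():
        try:
            report.results[name] = await step()
        except Exception as exc:
            report.failures[name] = f"{type(exc).__name__}: {exc}"
            if not continue_on_failure:
                break  # Stop at the first failure, but keep earlier results.
    return report


async def failing_step() -> str:
    # Stand-in for an agent whose tool call raises at runtime.
    raise RuntimeError("Resource not found")


async def ok_step() -> str:
    return "done"


report = asyncio.run(run_sequentially({
    "failing_agent": failing_step,
    "other_agent1": ok_step,
    "other_agent2": ok_step,
}))
print(report.results)
print(report.failures)
```

With `continue_on_failure=True`, the two later steps still run and their results are kept alongside the recorded failure. With `False`, the loop stops at the first failure but still returns whatever completed before it.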
### 3. Circuit Breaker Pattern
Implement circuit breaker functionality for MCP tools to prevent cascading failures.
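
As a rough illustration of the pattern (not an existing ADK feature), a circuit breaker counts consecutive failures and, once a threshold is reached, rejects calls immediately until a cool-down has passed. `CircuitBreaker`, `CircuitOpenError`, and `flaky_tool` below are hypothetical names, and the threshold and timeout values are arbitrary.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, List, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed.

    async def call(self, fn: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any) -> Any:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: fall through and allow one trial call.
        try:
            result = await fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial call.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result


async def flaky_tool() -> str:
    # Stand-in for an external tool call that keeps failing.
    raise ConnectionError("server unavailable")


async def demo() -> List[str]:
    breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)
    outcomes = []
    for _ in range(4):
        try:
            await breaker.call(flaky_tool)
        except CircuitOpenError:
            outcomes.append("skipped")
        except ConnectionError:
            outcomes.append("failed")
    return outcomes


outcomes = asyncio.run(demo())
print(outcomes)
```

In this run the first two calls reach the tool and fail, which opens the circuit, and the remaining two are rejected without contacting the tool at all.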

## Impact

- **Severity:** High - Crashes entire workflows
- **Frequency:** Common when using external MCP servers
- **Workaround Complexity:** Medium - Requires manual error handling

## Additional Context

This issue significantly impacts the reliability of production ADK systems using MCP tools. The current behavior makes it difficult to build robust multi-agent systems that can gracefully handle partial failures.

## Related Issues

[Link to any related issues if they exist]

## Feature Request

Consider adding:

- Built-in error handling options for MCP tools
- Fault-tolerant execution modes for multi-agent workflows
- Circuit breaker patterns for external tool integrations
- Better error propagation and handling documentation

Labels: bug, enhancement, mcp-tools, multi-agent, error-handling</div>
