A comprehensive test suite for validating A2A (Agent-to-Agent) Protocol v0.3.0 specification compliance with multi-transport support, progressive validation, and detailed compliance reporting.
The A2A Protocol TCK is a sophisticated validation framework that provides:
- 📋 Categorized Testing: Clear separation of mandatory vs. optional requirements
- 🎯 Capability-Based Validation: Smart test execution based on Agent Card declarations
- 📊 Compliance Reporting: Detailed assessment with actionable recommendations
- 🚀 Progressive Enhancement: Four-tier compliance levels for informed deployment decisions
- 🔄 Multi-Transport Support: Comprehensive testing for JSON-RPC, gRPC, and REST transports
- ✨ A2A v0.3.0 Features: Full support for new authentication schemes, streaming methods, and enhanced security
The TCK transforms A2A specification compliance from guesswork into a clear, structured validation process.
Use the TCK to validate your A2A implementation:
./run_tck.py --sut-url http://localhost:9999 --category all --compliance-report report.json
- 🔍 Check spec changes: util_scripts/check_spec_changes.py
- 📥 Update baseline: util_scripts/update_current_spec.py --version "v1.x"
- 🔴 MANDATORY: Must pass for A2A compliance (JSON-RPC 2.0 + A2A core)
- 🔄 CAPABILITIES: Conditional mandatory based on Agent Card declarations
- 🚀 TRANSPORT EQUIVALENCE: Multi-transport functional equivalence (conditional mandatory)
- 🛡️ QUALITY: Production readiness indicators (optional)
- 🎨 FEATURES: Optional implementation completeness (informational)
- Smart Execution: Tests skip when capabilities not declared, become mandatory when declared
- False Advertising Detection: Catches capabilities declared but not implemented
- Honest Validation: Only tests what's actually claimed to be supported
- 🔴 NON_COMPLIANT: Any mandatory failure (Not A2A Compliant)
- 🟡 MANDATORY: Basic compliance (A2A Core Compliant)
- 🟢 RECOMMENDED: Production-ready (A2A Recommended Compliant)
- 🏆 FULL_FEATURED: Complete implementation (A2A Fully Compliant)
- Weighted compliance scoring
- Specification reference citations
- Actionable fix recommendations
- Deployment readiness guidance
- Python: 3.8+
- uv: Recommended for environment management
- SUT: Running A2A implementation with accessible HTTP/HTTPS endpoint
- Install uv:
  # Install uv (see https://github.com/astral-sh/uv#installation)
  curl -LsSf https://astral.sh/uv/install.sh | sh
  # Or: pipx install uv
  # Or: brew install uv
- Clone and setup:
  git clone https://github.com/maeste/a2a-tck.git
  cd a2a-tck
  # Create virtual environment
  uv venv
  source .venv/bin/activate # Linux/macOS
  # .venv\Scripts\activate # Windows
  # Install dependencies
  uv pip install -e .
- Configure environment (optional):
  # Copy example environment file and customize
  cp .env.example .env
  # Edit .env to set timeout values and other configuration
- Start your A2A implementation (System Under Test):
  # Example using the included Python SUT
  cd python-sut/tck_core_agent
  uv run .
Note: The run_sut.py script requires the PyYAML package. You can install it using uv pip install pyyaml or pip install pyyaml.
To simplify the process of testing various A2A implementations, this TCK includes a utility script run_sut.py. This Python script automates the download (or update), build, and execution of a System Under Test (SUT) based on a configuration file.
SUTs will be cloned or updated into a directory named SUT/ created in the root of this TCK repository.
You need to create a YAML configuration file (e.g., my_sut_config.yaml) to define how your SUT should be handled. A template is available at sut_config_template.yaml.
The configuration file supports the following fields:
- sut_name (string, mandatory): A descriptive name for your SUT. This name will also be used as the directory name for the SUT within the SUT/ folder (e.g., SUT/my_agent).
- github_repo (string, mandatory): The HTTPS or SSH URL of the git repository where the SUT source code is hosted.
- git_ref (string, optional): A specific git branch, tag, or commit hash to checkout after cloning/fetching. If omitted, the repository's default branch will be used.
- prerequisites_script (string, mandatory): Path to the script that handles prerequisite installation and building the SUT. This path is relative to the root of the SUT's cloned repository (e.g., scripts/build.sh or setup/prepare_env.py).
- prerequisites_interpreter (string, optional): The interpreter to use for the prerequisites_script (e.g., bash, python3, powershell.exe). If omitted, the script will be executed directly (e.g., ./scripts/build.sh). Ensure the script is executable and has a valid shebang in this case.
- prerequisites_args (string, optional): A string of arguments to pass to the prerequisites_script (e.g., "--version 1.2 --no-cache").
- run_script (string, mandatory): Path to the script that starts the SUT. This path is relative to the root of the SUT's cloned repository (e.g., scripts/run.sh or app/start_server.py).
- run_interpreter (string, optional): The interpreter to use for the run_script.
- run_args (string, optional): A string of arguments to pass to the run_script (e.g., "--port 8080 --debug").
Example sut_config.yaml:
sut_name: "example_agent"
github_repo: "https://github.com/your_org/example_agent_repo.git"
git_ref: "v1.0.0" # Optional: checkout tag v1.0.0
prerequisites_script: "bin/setup.sh"
prerequisites_interpreter: "bash"
prerequisites_args: "--fast"
run_script: "bin/start.py"
run_interpreter: "python3"
run_args: "--host 0.0.0.0 --port 9000"

- Prerequisites Script: This script is responsible for all steps required to build your SUT and install its dependencies. It should exit with a status code of 0 on success and any non-zero status code on failure. If it fails, run_sut.py will terminate.
- Run Script: This script should start your SUT. Typically, it will launch a server or application that runs in the foreground. The run_sut.py script will wait for this script to terminate (e.g., by Ctrl+C or if the SUT exits itself).
- Directly Executable Scripts: If you omit the *_interpreter for a script, ensure the script file has execute permissions (e.g., chmod +x your_script.sh) and, for shell scripts on Unix-like systems, includes a valid shebang (e.g., #!/bin/bash).
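The mandatory and optional fields listed above can be checked before run_sut.py is invoked. The following is a minimal sketch, not the TCK's own validation: the function name `validate_sut_config` is hypothetical, and it assumes the YAML file has already been loaded into a plain dictionary (for example with PyYAML).

```python
from typing import Any, Dict, List

# Field names follow the configuration list above; the helper itself is hypothetical.
MANDATORY_FIELDS = ("sut_name", "github_repo", "prerequisites_script", "run_script")
OPTIONAL_FIELDS = (
    "git_ref",
    "prerequisites_interpreter",
    "prerequisites_args",
    "run_interpreter",
    "run_args",
)


def validate_sut_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems = []
    # Mandatory fields must be present and be non-empty strings.
    for field in MANDATORY_FIELDS:
        value = config.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty mandatory field: {field}")
    # Optional fields may be absent, but must be strings when given.
    for field in OPTIONAL_FIELDS:
        if field in config and not isinstance(config[field], str):
            problems.append(f"optional field must be a string: {field}")
    # Flag typos such as "git_reff" instead of silently ignoring them.
    known = set(MANDATORY_FIELDS) | set(OPTIONAL_FIELDS)
    for field in config:
        if field not in known:
            problems.append(f"unknown field: {field}")
    return problems
```

Rejecting unknown keys is a design choice of this sketch; the real script may simply ignore them.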
Once you have your SUT configuration file ready, you can run your SUT using:
python run_sut.py path/to/your_sut_config.yaml

For example:

python run_sut.py sut_configs/my_python_agent_config.yaml

This will:
- Clone the SUT from github_repo into SUT/<sut_name>/ (or update if it already exists).
- Checkout the specified git_ref (if any).
- Execute the prerequisites_script within the SUT's directory.
- Execute the run_script within the SUT's directory to start the SUT.
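The four steps above can be pictured as a list of commands derived from the configuration. This is a sketch under stated assumptions, not the actual run_sut.py logic: `script_command` and `plan_sut_run` are hypothetical names, the exact git invocations are a guess, and the sketch only builds the commands without executing them.

```python
import shlex
from typing import Dict, List, Optional


def script_command(script: str, interpreter: Optional[str], args: Optional[str]) -> List[str]:
    """Build the argv for a script, with or without an explicit interpreter."""
    # Without an interpreter the script is executed directly, hence the "./" prefix.
    command = [interpreter, script] if interpreter else [f"./{script}"]
    # The *_args fields are single strings; split them the way a shell would.
    return command + shlex.split(args or "")


def plan_sut_run(config: Dict[str, str], already_cloned: bool = False) -> List[List[str]]:
    """Return the commands for clone/update, checkout, prerequisites, and run."""
    sut_dir = f"SUT/{config['sut_name']}"
    steps = []
    if already_cloned:
        steps.append(["git", "-C", sut_dir, "fetch", "--all", "--tags"])
    else:
        steps.append(["git", "clone", config["github_repo"], sut_dir])
    if config.get("git_ref"):
        steps.append(["git", "-C", sut_dir, "checkout", config["git_ref"]])
    # The two script steps are meant to run with the SUT directory as working directory.
    steps.append(script_command(
        config["prerequisites_script"],
        config.get("prerequisites_interpreter"),
        config.get("prerequisites_args"),
    ))
    steps.append(script_command(
        config["run_script"], config.get("run_interpreter"), config.get("run_args")
    ))
    return steps
```

With the example configuration shown earlier, the last step becomes `["python3", "bin/start.py", "--host", "0.0.0.0", "--port", "9000"]`.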
You can then proceed to run the TCK tests against your SUT.
Before running tests, ensure your A2A implementation meets the SUT Requirements. This includes:
- Streaming Duration: Tasks with message IDs starting with "test-resubscribe-message-id" must run for ≥ 2 × TCK_STREAMING_TIMEOUT seconds
- Environment Variables: Optional support for TCK_STREAMING_TIMEOUT configuration
- Test Patterns: Proper handling of TCK-specific message ID patterns
📖 Read Full SUT Requirements →
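On the SUT side, the streaming-duration requirement could be honored with a small helper like the one below. This is only a sketch: `minimum_task_duration` is a hypothetical name, and the 2.0 second fallback mirrors the default documented for TCK_STREAMING_TIMEOUT in the environment variable table.

```python
import os
from typing import Mapping, Optional

# Prefix taken from the requirement above.
RESUBSCRIBE_PREFIX = "test-resubscribe-message-id"


def streaming_timeout(env: Optional[Mapping[str, str]] = None) -> float:
    """Read TCK_STREAMING_TIMEOUT, falling back to the documented default of 2.0."""
    env = os.environ if env is None else env
    return float(env.get("TCK_STREAMING_TIMEOUT", "2.0"))


def minimum_task_duration(message_id: str, env: Optional[Mapping[str, str]] = None) -> float:
    """Seconds a task must keep running; 0.0 means the TCK imposes no minimum."""
    if message_id.startswith(RESUBSCRIBE_PREFIX):
        return 2.0 * streaming_timeout(env)
    return 0.0
```

An agent could sleep or keep streaming for at least this long before completing the task, so that the resubscribe tests have a live task to attach to.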
./run_tck.py --sut-url http://localhost:9999 --category mandatory
Result: ✅ Pass = A2A compliant, ❌ Fail = NOT A2A compliant

./run_tck.py --sut-url http://localhost:9999 --category capabilities
Result: Ensures declared capabilities actually work (prevents false advertising)

./run_tck.py --sut-url http://localhost:9999 --category transport-equivalence
Result: Ensures functional equivalence across declared transport types (JSON-RPC, gRPC, REST)

./run_tck.py --sut-url http://localhost:9999 --category quality
Result: Identifies issues that may affect production deployment

./run_tck.py --sut-url http://localhost:9999 --category all --compliance-report compliance.json
Result: Complete assessment with compliance level and recommendations
# Get help and understand test categories
./run_tck.py --explain
# Test specific category
./run_tck.py --sut-url URL --category CATEGORY
# Available categories:
# mandatory - A2A compliance validation (MUST pass)
# capabilities - Capability honesty check (conditional mandatory)
# transport-equivalence - Multi-transport functional equivalence (conditional mandatory)
# quality - Production readiness assessment
# features - Optional feature completeness
# all - Complete validation workflow

# Generate detailed compliance report
./run_tck.py --sut-url URL --category all --compliance-report report.json
# Verbose output with detailed logging
./run_tck.py --sut-url URL --category mandatory --verbose
# Generate HTML report (additional)
./run_tck.py --sut-url URL --category all --report
# Skip Agent Card fetching (for non-standard implementations)
./run_tck.py --sut-url URL --category mandatory --skip-agent-card

The TCK supports A2A v0.3.0 multi-transport architecture with advanced transport selection and testing capabilities:
# Test with specific transport strategy
./run_tck.py --sut-url URL --category all --transport-strategy prefer_jsonrpc
# Force a specific transport via strategy
./run_tck.py --sut-url URL --category all --transport-strategy prefer_grpc
# Enable transport equivalence testing (default: enabled)
./run_tck.py --sut-url URL --category all --enable-equivalence-testing
# Test only transport equivalence with specific configuration
./run_tck.py --sut-url URL --category transport-equivalence \
--transport-strategy all_supported
# Strict transport selection (required transports, no fallback)
./run_tck.py --sut-url URL --category all \
--transport-strategy prefer_grpc \
--transports grpc \
--enable-equivalence-testing
# Run per-transport single-client tests for JSON-RPC and gRPC, then equivalence
./run_tck.py --sut-url URL --category all \
--transports jsonrpc,grpc
# With compliance reports (one per transport; filenames get _jsonrpc/_grpc suffixes)
./run_tck.py --sut-url URL --category all \
--transports jsonrpc,grpc \
--compliance-report reports/compliance.json
### gRPC usage
```bash
./run_tck.py --sut-url http://localhost:9999 --category mandatory --transports grpc
```

Two complementary options control transport behavior:
--transports:
Purpose: Restricts which transports are allowed/tested
Effect: Filters available transports before selection
Values: Comma-separated list: jsonrpc,grpc,rest
Default: None (all transports allowed)

--transport-strategy:
Purpose: Defines how to select from available transports
Effect: Controls selection logic after filtering
Values: agent_preferred, prefer_jsonrpc, prefer_grpc, prefer_rest, all_supported
Default: agent_preferred
- --transports filters which transports can be used
- --transport-strategy selects from the filtered list
# Only test JSON-RPC (filter + strategy is irrelevant)
--transports jsonrpc
# Test both gRPC and REST, but prefer gRPC when both available
--transports grpc,rest --transport-strategy prefer_grpc
# Test all agent transports, preferring JSON-RPC
--transport-strategy prefer_jsonrpc
# Force strict gRPC-only testing
--transports grpc --transport-strategy prefer_grpc

Transport Strategy Options:
- agent_preferred (default) - Use agent's preferred transport from Agent Card
- prefer_jsonrpc - Prefer JSON-RPC 2.0 over HTTP transport
- prefer_grpc - Prefer gRPC transport when available
- prefer_rest - Prefer HTTP+JSON/REST transport when available
- all_supported - Test all supported transports

Transport Types:
- jsonrpc - JSON-RPC 2.0 over HTTP (backward compatible)
- grpc - gRPC with Protocol Buffers
- rest - HTTP+JSON/REST transport
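The filter-then-select relationship between the two options can be illustrated as follows. This is one plausible reading, not the TCK's implementation: `select_transports` is a hypothetical function, and returning an ordered list with the preferred transport first is an assumption made to stay consistent with the examples where several transports are tested in one run.

```python
from typing import List, Optional, Sequence

# Strategy and transport names come from the option lists above.
PREFERENCE_BY_STRATEGY = {
    "prefer_jsonrpc": "jsonrpc",
    "prefer_grpc": "grpc",
    "prefer_rest": "rest",
}


def select_transports(
    agent_transports: Sequence[str],
    allowed: Optional[Sequence[str]] = None,
    strategy: str = "agent_preferred",
    agent_preferred: Optional[str] = None,
) -> List[str]:
    """Return the transports to test, with the preferred one first."""
    # Step 1 (--transports): filter. None means every transport is allowed.
    candidates = [t for t in agent_transports if allowed is None or t in allowed]

    # Step 2 (--transport-strategy): choose from what survived the filter.
    if strategy == "all_supported":
        return candidates
    if strategy == "agent_preferred":
        wanted = agent_preferred
    elif strategy in PREFERENCE_BY_STRATEGY:
        wanted = PREFERENCE_BY_STRATEGY[strategy]
    else:
        raise ValueError(f"unknown transport strategy: {strategy}")

    # Stable sort: the wanted transport (if present) moves to the front.
    return sorted(candidates, key=lambda t: t != wanted)
```

For instance, an agent offering all three transports, run with `--transports grpc,rest --transport-strategy prefer_grpc`, would yield `["grpc", "rest"]` in this sketch, and a filter that matches nothing yields an empty list.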
The TCK supports configuration via environment variables and .env files for flexible timeout and behavior customization.
Setting up environment configuration:
# Copy the example file
cp .env.example .env
# Edit the file to customize settings
nano .env # or your preferred editor

Available environment variables:

| Variable | Description | Default | Examples |
|---|---|---|---|
| TCK_STREAMING_TIMEOUT | Base timeout for SSE streaming tests (seconds) | 2.0 | 1.0 (fast), 5.0 (slow), 10.0 (debug) |
The TCK supports additional environment variables for A2A v0.3.0 multi-transport configuration:
| Variable | Description | Default | Examples |
|---|---|---|---|
| A2A_TRANSPORT_STRATEGY | Transport selection strategy | agent_preferred | prefer_jsonrpc, prefer_grpc, all_supported |
| A2A_PREFERRED_TRANSPORT | Preferred transport type | None | jsonrpc, grpc, rest |
| A2A_REQUIRED_TRANSPORTS | Comma-separated required transports (strict) | None | grpc, jsonrpc,rest |
| A2A_ENABLE_EQUIVALENCE_TESTING | Enable transport equivalence testing | true | true, false, 1, 0 |
| A2A_JSONRPC_* | JSON-RPC specific configuration | - | A2A_JSONRPC_TIMEOUT=30 |
| A2A_GRPC_* | gRPC specific configuration | - | A2A_GRPC_MAX_MESSAGE_SIZE=4MB |
| A2A_REST_* | REST specific configuration | - | A2A_REST_TIMEOUT=60 |
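To make the value formats in the table concrete, here is a sketch of how these variables might be parsed. The helper names (`parse_bool`, `parse_transports`, `load_transport_settings`) and the returned dictionary keys are hypothetical; only the variable names, defaults, and accepted values come from the table.

```python
import os
from typing import Any, Dict, List, Mapping, Optional


def parse_bool(value: Optional[str], default: bool) -> bool:
    """Accept the documented spellings: true, false, 1, 0."""
    if value is None:
        return default
    normalized = value.strip().lower()
    if normalized in ("true", "1"):
        return True
    if normalized in ("false", "0"):
        return False
    raise ValueError(f"not a boolean value: {value!r}")


def parse_transports(value: Optional[str]) -> Optional[List[str]]:
    """Split a comma-separated list such as 'jsonrpc,grpc'; None means unset."""
    if not value:
        return None
    return [item.strip().lower() for item in value.split(",") if item.strip()]


def load_transport_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect the multi-transport settings, applying the documented defaults."""
    env = os.environ if env is None else env
    return {
        "strategy": env.get("A2A_TRANSPORT_STRATEGY", "agent_preferred"),
        "preferred": env.get("A2A_PREFERRED_TRANSPORT"),
        "required": parse_transports(env.get("A2A_REQUIRED_TRANSPORTS")),
        "equivalence": parse_bool(env.get("A2A_ENABLE_EQUIVALENCE_TESTING"), True),
    }
```

Raising on an unrecognized boolean is a choice of this sketch; the TCK may be more lenient.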
Timeout behavior:
- Short timeout: TCK_STREAMING_TIMEOUT * 0.5 - Used for basic streaming operations
- Normal timeout: TCK_STREAMING_TIMEOUT * 1.0 - Used for standard SSE client operations
- Async timeout: TCK_STREAMING_TIMEOUT * 1.0 - Used for asyncio.wait_for operations
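The three derived timeouts above reduce to simple arithmetic on the base value. The sketch below shows that arithmetic and one way the async timeout could be applied; `derive_timeouts` and `wait_with_async_timeout` are hypothetical names, not functions from the TCK.

```python
import asyncio
import os
from typing import Any, Awaitable, Dict, Mapping, Optional


def derive_timeouts(env: Optional[Mapping[str, str]] = None) -> Dict[str, float]:
    """Compute the short, normal, and async timeouts from TCK_STREAMING_TIMEOUT."""
    env = os.environ if env is None else env
    base = float(env.get("TCK_STREAMING_TIMEOUT", "2.0"))  # documented default
    return {
        "short": base * 0.5,   # basic streaming operations
        "normal": base * 1.0,  # standard SSE client operations
        "async": base * 1.0,   # asyncio.wait_for operations
    }


async def wait_with_async_timeout(
    awaitable: Awaitable[Any], env: Optional[Mapping[str, str]] = None
) -> Any:
    """Await something, giving up after the async timeout."""
    return await asyncio.wait_for(awaitable, timeout=derive_timeouts(env)["async"])
```

With the default base of 2.0 seconds this gives 1.0, 2.0, and 2.0 seconds; setting TCK_STREAMING_TIMEOUT=5.0 gives 2.5, 5.0, and 5.0.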
Usage examples:
# Use .env file (recommended)
echo "TCK_STREAMING_TIMEOUT=5.0" > .env
./run_tck.py --sut-url URL --category capabilities
# Set directly for single run
TCK_STREAMING_TIMEOUT=1.0 ./run_tck.py --sut-url URL --category capabilities
# Debug with very slow timeouts
TCK_STREAMING_TIMEOUT=30.0 ./run_tck.py --sut-url URL --category capabilities --verbose
# A2A v0.3.0 multi-transport configuration via environment (strict single transport)
A2A_TRANSPORT_STRATEGY=prefer_grpc A2A_REQUIRED_TRANSPORTS=grpc ./run_tck.py --sut-url URL --category all
# Run both JSON-RPC and gRPC per-transport, then equivalence (via env)
A2A_REQUIRED_TRANSPORTS=jsonrpc,grpc ./run_tck.py --sut-url URL --category all
# Complex multi-transport setup in .env file
cat > .env << EOF
TCK_STREAMING_TIMEOUT=3.0
A2A_TRANSPORT_STRATEGY=all_supported
A2A_ENABLE_EQUIVALENCE_TESTING=true
A2A_GRPC_TIMEOUT=30
A2A_JSONRPC_TIMEOUT=15
EOF
./run_tck.py --sut-url URL --category all

When to adjust timeouts:
- Decrease (1.0): Fast CI/CD pipelines, local development
- Increase (5.0+): Slow networks, debugging, resource-constrained environments
- Debug (10.0+): Detailed troubleshooting, step-through debugging
Purpose: Validate core A2A specification requirements
Impact: Failure = NOT A2A compliant
Location: tests/mandatory/
Includes:
- JSON-RPC 2.0 compliance (tests/mandatory/jsonrpc/)
- A2A protocol core methods (tests/mandatory/protocol/)
- Agent Card required fields
- Core message/send functionality
- Task management (get/cancel)

Example Failures:
- test_task_history_length → SDK doesn't implement historyLength parameter
- test_mandatory_fields_present → Agent Card missing required fields
Purpose: Validate declared capabilities work correctly
Impact: Failure = False advertising
Logic: Skip if not declared, mandatory if declared
Location: tests/optional/capabilities/
Capability Validation:
{
  "capabilities": {
    "streaming": true, ← Must pass streaming tests
    "pushNotifications": false ← Push notification tests will skip
  }
}

Includes:
- Streaming support (message/stream, tasks/resubscribe)
- Push notification configuration
- File/data modality support
- Authentication methods
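The "skip if not declared, mandatory if declared" rule can be expressed in a few lines. This is an illustrative sketch only: `capability_test_mode` and `classify_outcome` are hypothetical helpers, and the Agent Card is assumed to be an already-parsed dictionary with a `capabilities` object as in the snippet above.

```python
from typing import Any, Dict


def capability_test_mode(agent_card: Dict[str, Any], capability: str) -> str:
    """Return 'mandatory' when the capability is declared true, otherwise 'skip'."""
    declared = bool(agent_card.get("capabilities", {}).get(capability, False))
    return "mandatory" if declared else "skip"


def classify_outcome(agent_card: Dict[str, Any], capability: str, test_passed: bool) -> str:
    """Interpret a capability test result in light of what the card declares."""
    if capability_test_mode(agent_card, capability) == "skip":
        # Nothing was claimed, so nothing is checked.
        return "skipped"
    # A declared capability that fails its tests is false advertising.
    return "passed" if test_passed else "false_advertising"
```

For the card shown above, streaming tests are mandatory while push notification tests are skipped; a failing streaming test would be reported as false advertising.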
Purpose: Validate A2A v0.3.0 multi-transport functional equivalence
Impact: Conditional mandatory (if multiple transports declared)
Logic: Skip if single transport, mandatory if multiple transports declared
Location: tests/optional/multi_transport/
A2A v0.3.0 Functional Equivalence Requirements (per specification §3.4.1):
{
  "additionalInterfaces": [
    {"url": "...", "transport": "JSONRPC"}, ← Must test equivalence
    {"url": "...", "transport": "GRPC"}, ← if multiple declared
    {"url": "...", "transport": "HTTP+JSON"}
  ]
}

Validates:
- Identical Functionality: Same operations across all transports
- Consistent Behavior: Semantically equivalent results
- Same Error Handling: Consistent error codes (TaskNotFoundError: -32001)
- Equivalent Authentication: Same auth schemes across transports
- Method Mapping Compliance: Correct transport-specific method names
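As a rough picture of what an equivalence check compares, the sketch below looks at the outcome of one operation across transports. It assumes a hypothetical normalized result shape (`{"ok": bool, "error_code": int or None}`) produced by some earlier step that maps each transport's native error to an A2A error code; neither that shape nor the function name `equivalence_problems` comes from the TCK.

```python
from typing import Any, Dict, List


def equivalence_problems(responses: Dict[str, Dict[str, Any]]) -> List[str]:
    """Compare per-transport outcomes of one operation; return any mismatches."""
    problems: List[str] = []
    transports = sorted(responses)
    if len(transports) < 2:
        # A single transport has nothing to be equivalent to.
        return problems
    reference_name = transports[0]
    reference = responses[reference_name]
    for name in transports[1:]:
        other = responses[name]
        # Success/failure and the normalized error code must agree.
        for key in ("ok", "error_code"):
            if other.get(key) != reference.get(key):
                problems.append(
                    f"{key} differs between {reference_name} and {name}: "
                    f"{reference.get(key)!r} vs {other.get(key)!r}"
                )
    return problems
```

For example, if a lookup of a nonexistent task yields error code -32001 on both JSON-RPC and gRPC, the list is empty; if one transport reports no error code, one problem is returned.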
Purpose: Assess implementation robustness
Impact: Never blocks compliance, indicates production issues
Location: tests/optional/quality/
Quality Areas:
- Concurrent request handling
- Edge case robustness
- Unicode/special character support
- Boundary value handling
- Error recovery and resilience
Purpose: Measure optional feature completeness
Impact: Purely informational
Location: tests/optional/features/
Includes:
- Convenience features
- Enhanced error messages
- SDK-specific capabilities
- Optional protocol extensions
- 🔴 NON_COMPLIANT
  - Criteria: Any mandatory test failure
  - Business Impact: Cannot be used for A2A integrations
  - Action: Fix mandatory failures immediately
- 🟡 MANDATORY
  - Criteria: 100% mandatory test pass rate
  - Business Impact: Basic A2A integration support
  - Suitable For: Development and testing environments
  - Next Step: Address capability validation
- 🟢 RECOMMENDED
  - Criteria: Mandatory (100%) + Capability (≥85%) + Quality (≥75%)
  - Business Impact: Production-ready with confidence
  - Suitable For: Staging and careful production deployment
  - Next Step: Enhance feature completeness
- 🏆 FULL_FEATURED
  - Criteria: Capability (≥95%) + Quality (≥90%) + Feature (≥80%)
  - Business Impact: Complete A2A implementation
  - Suitable For: Full production deployment with confidence
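The criteria above amount to a small decision procedure over the four category scores. The following sketch applies the stated thresholds; `compliance_level` is a hypothetical function, and it assumes that FULL_FEATURED also requires a 100% mandatory pass rate, since any mandatory failure is listed as NON_COMPLIANT.

```python
def compliance_level(mandatory: float, capability: float, quality: float, feature: float) -> str:
    """Map category pass rates (percentages) to a compliance level."""
    # Any mandatory failure rules out compliance, whatever the other scores are.
    if mandatory < 100.0:
        return "NON_COMPLIANT"
    # Check the strictest tier first so it is not shadowed by RECOMMENDED.
    if capability >= 95.0 and quality >= 90.0 and feature >= 80.0:
        return "FULL_FEATURED"
    if capability >= 85.0 and quality >= 75.0:
        return "RECOMMENDED"
    return "MANDATORY"
```

Scores of 100 (mandatory), 90 (capability), 75 (quality), and 60 (feature) map to RECOMMENDED under these thresholds.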
When you run with --compliance-report, you get a JSON report containing:
{
"summary": {
"compliance_level": "RECOMMENDED",
"overall_score": 87.5,
"mandatory_score": 100.0,
"capability_score": 90.0,
"quality_score": 75.0,
"feature_score": 60.0
},
"recommendations": [
"✅ Ready for staging deployment",
"⚠️ Address 2 quality issues before production",
"💡 Consider implementing 3 additional features"
],
"next_steps": [
"Fix Unicode handling in task storage",
"Improve concurrent request performance",
"Consider implementing authentication capability"
]
}

#!/bin/bash
# Block deployment if not A2A compliant
./run_tck.py --sut-url "$SUT_URL" --category mandatory
if [ $? -ne 0 ]; then
  echo "❌ NOT A2A compliant - blocking deployment"
  exit 1
fi
echo "✅ A2A compliant - deployment approved"

#!/bin/bash
# Generate compliance report and make environment-specific decisions
./run_tck.py --sut-url "$SUT_URL" --category all --compliance-report compliance.json
COMPLIANCE_LEVEL=$(jq -r '.summary.compliance_level' compliance.json)
case "$COMPLIANCE_LEVEL" in
  "NON_COMPLIANT")
    echo "❌ Not A2A compliant - blocking all deployments"
    exit 1
    ;;
  "MANDATORY")
    echo "🟡 Basic compliance - dev/test only"
    # Use an explicit if: a bare `[[ ... ]] && exit 1` would leave a non-zero
    # status behind whenever the environment is not production.
    if [[ "$ENVIRONMENT" == "production" ]]; then
      exit 1
    fi
    ;;
  "RECOMMENDED")
    echo "🟢 Recommended - staging approved"
    ;;
  "FULL_FEATURED")
    echo "🏆 Full compliance - production approved"
    ;;
esac

Streaming tests skipping:
# Check Agent Card capabilities
curl $SUT_URL/.well-known/agent.json | jq .capabilities
# If streaming: false, tests will skip (this is correct!)

Quality tests failing but compliance achieved:
# This is expected - quality tests don't block compliance
# Address quality issues for production readiness

Tests not discovering:
# Ensure proper installation
pip install -e .
# Check test discovery
pytest --collect-only tests/mandatory/

When debugging specific test failures, you can run individual tests with detailed output:
Run a single test with verbose output and debug information:
# Using run_tck.py with verbose mode (shows print() and logger.info() messages)
python run_tck.py --sut-url http://localhost:9999 --category capabilities --verbose-log
# Run specific test directly with pytest
python -m pytest tests/optional/capabilities/test_streaming_methods.py::test_message_stream_basic \
--sut-url http://localhost:9999 -s -v --log-cli-level=INFO

Run all tests in a specific file:
python -m pytest tests/optional/capabilities/test_streaming_methods.py \
--sut-url http://localhost:9999 -s -v --log-cli-level=INFO

Debug options explained:
- -s: Shows print() statements during test execution
- -v: Verbose test output with detailed test names and outcomes
- --log-cli-level=INFO: Shows logger.info() and other log messages
- --tb=short: Shorter traceback format (default in run_tck.py)
Run with different log levels:
# Show DEBUG level logs (very detailed)
python -m pytest tests/path/to/test.py --sut-url URL -s -v --log-cli-level=DEBUG
# Show only WARNING and ERROR logs
python -m pytest tests/path/to/test.py --sut-url URL -s -v --log-cli-level=WARNING

- SUT Requirements - Essential requirements for A2A implementations to work with the TCK
- SDK Validation Guide - Detailed usage guide for SDK developers
- Specification Update Workflow - Monitor and manage A2A specification changes
- Test Documentation Standards - Standards for test contributors
- Fork the repository
- Follow Test Documentation Standards
- Add tests with proper categorization and specification references
- Submit pull request with clear specification citations
This project is licensed under the MIT License - see the LICENSE file for details.
Just want A2A compliance?
./run_tck.py --sut-url URL --category mandatory

Planning production deployment?
./run_tck.py --sut-url URL --category all --compliance-report report.json

Debugging capability issues?
./run_tck.py --sut-url URL --category capabilities --verbose

Testing A2A v0.3.0 multi-transport implementation?
./run_tck.py --sut-url URL --category transport-equivalence --transport-strategy all_supported

Want comprehensive assessment?
./run_tck.py --sut-url URL --explain # Learn about categories first
./run_tck.py --sut-url URL --category all --compliance-report full_report.json

The A2A TCK transforms specification compliance from confusion into clarity. 🚀