This repository defines a set of Model Context Protocol (MCP) tools for debugging OpenShift nodes and collecting logs. The tools are implemented in pkg/sdkserver/tools.go and can be registered with any MCP server built using the mcp-go SDK.

    import (
        "github.com/harche/crio-mcp-server/pkg/sdkserver"
        "github.com/mark3labs/mcp-go/server"
    )

    s := server.NewMCPServer("demo", "1.0.0")
    sdkserver.RegisterTools(s)

Runs oc debug on a specified node and executes arbitrary shell commands inside the temporary debug pod.
Arguments:
- node_name (string, required) – node to debug
- commands (array of string) – commands executed in the pod (defaults to journalctl --no-pager -u crio)
- collect_files (bool) – when true, files listed in paths are returned as a tarball resource
- paths (array of string) – file or directory paths to copy from the host

When collect_files is enabled, the tool returns an application/tar+gzip archive containing the specified paths.
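
As a rough illustration of how these arguments might map onto an oc debug invocation, the sketch below assembles an argument list. The helper name buildDebugArgs and the use of chroot /host sh -c to reach host binaries are assumptions made for this example, not a description of the code in pkg/sdkserver/tools.go.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultDebugCommand mirrors the documented default for the commands argument.
const defaultDebugCommand = "journalctl --no-pager -u crio"

// buildDebugArgs is a hypothetical helper that assembles arguments for
// `oc debug node/<name>`. Joining commands with "&&" and running them via
// `chroot /host sh -c` is an assumption of this sketch.
func buildDebugArgs(nodeName string, commands []string) []string {
	if len(commands) == 0 {
		commands = []string{defaultDebugCommand}
	}
	script := strings.Join(commands, " && ")
	return []string{"debug", "node/" + nodeName, "--", "chroot", "/host", "sh", "-c", script}
}

func main() {
	// Print the command line that would be executed for a node named worker-0.
	fmt.Println("oc", strings.Join(buildDebugArgs("worker-0", nil), " "))
}
```

Keeping the argument list as a slice (rather than one shell string) avoids quoting problems when it is later handed to os/exec.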
Streams systemd journal and container runtime logs from a node using oc adm node-logs.
Arguments:
- node_name (string, required) – target node
- since (string) – RFC3339 timestamp or relative value accepted by journalctl
- compress (bool) – if true, return logs as a gzip tarball instead of inline text

The compressed data is returned as a blob resource named node-logs-<node>.txt.gz.
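
The compress option can be pictured with a minimal in-memory gzip round trip. This is only a sketch using the Go standard library; the function names gzipLogs, gunzipLogs, and resourceName are hypothetical and the sample log line is made up.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"os"
)

// resourceName follows the node-logs-<node>.txt.gz naming pattern.
func resourceName(node string) string {
	return "node-logs-" + node + ".txt.gz"
}

// gzipLogs compresses raw log text in memory.
func gzipLogs(logs []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(logs); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// gunzipLogs reverses gzipLogs, as a client receiving the blob might do.
func gunzipLogs(blob []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(blob))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	blob, err := gzipLogs([]byte("illustrative log line\n"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s: %d compressed bytes\n", resourceName("worker-0"), len(blob))
}
```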
Runs go tool pprof with the supplied arguments to inspect CPU or memory profiles. Refer to go tool pprof -h for the full set of options.
Arguments:
- args (array of string, required) – command-line arguments passed directly to go tool pprof
Runs oc adm must-gather to capture cluster information. Create a temporary directory and pass it using dest_dir to keep all gathered data in one location. Explore oc adm must-gather -h for the full set of options.
oc adm must-gather can collect almost every artifact that engineers or support need in a single run:
- full YAML for all cluster-scoped and namespaced resources (Deployments, CRDs, Nodes, ClusterOperators, etc.)
- pod and container logs, plus systemd journal slices from each node, to trace runtime crashes or OOMs
- API-server and OAuth audit logs for security or compliance forensics
- kernel, cgroup, and other node sysinfo, plus tuned and kubelet configs, for performance tuning
- optional add-on scripts such as gather_network_logs, which archives iptables/OVN flows and CNI pod logs, and gather_profiling_node, which fetches 30-second CPU and heap pprof dumps from both kubelet and CRI-O for hotspot analysis
- operator-specific data such as storage states or virtualization metrics, through plug-in images

The result is one reproducible tarball containing configuration, logs, network traces, performance profiles, and security audits for thorough offline debugging.
Arguments:
- dest_dir (string) – local directory where the must-gather output is stored
- extra_args (array of string) – additional flags forwarded to oc adm must-gather
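
The temporary-directory advice can be sketched as follows. The helper mustGatherArgs is hypothetical; it assumes dest_dir maps to the --dest-dir flag of oc adm must-gather and that extra_args are appended unchanged. The example passes an add-on script after "--", which is how oc adm must-gather accepts a custom gather command.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// mustGatherArgs is a hypothetical helper that builds the argument list for
// `oc adm must-gather`. Mapping dest_dir to --dest-dir is an assumption about
// how the tool forwards its arguments.
func mustGatherArgs(destDir string, extraArgs []string) []string {
	args := []string{"adm", "must-gather"}
	if destDir != "" {
		args = append(args, "--dest-dir="+destDir)
	}
	return append(args, extraArgs...)
}

func main() {
	// Create a fresh directory so all gathered data lands in one place.
	dir, err := os.MkdirTemp("", "must-gather-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Removed here only because this sketch gathers nothing; keep it in real use.
	defer os.RemoveAll(dir)

	args := mustGatherArgs(dir, []string{"--", "gather_network_logs"})
	fmt.Println("oc", strings.Join(args, " "))
}
```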
These helpers can be integrated into a custom MCP server or used directly with the mcp-go SDK.
Runs sosreport inside a debug pod using toolbox. This captures detailed diagnostics from a node. Provide a Red Hat case ID if available.
Arguments:
- node_name (string, required) – node from which to gather the report
- case_id (string) – optional support case identifier passed to sosreport
Runs crictl inside a debug pod to interact directly with the node's container runtime. Use the -h flag on any subcommand for help.
crictl is the lightweight command-line client from the cri-tools project that
speaks the Kubernetes Container Runtime Interface (CRI) directly. Because it
talks to the node’s container runtime (CRI-O, containerd, etc.) over the local
/var/run/.sock, it works even when kubelet or the API server are
unhealthy. Common commands include crictl ps and crictl pods to list
running containers or sandboxes, crictl inspect/inspectp for JSON-formatted
metadata, crictl logs to read container stdout, crictl exec for a shell,
crictl images and crictl pull to manage images, crictl stats for live CPU
and memory usage, and crictl runp|create|start to launch test sandboxes.
Because it bypasses Kubernetes control-plane layers, crictl is the first tool
engineers reach for when debugging low-level runtime or cgroup issues on an
OpenShift node.
Arguments:
- node_name (string, required) – node on which to run the command
- args (array of string) – arguments forwarded to crictl (defaults to ps)
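
When args is set to something like ps -o json, the output can be post-processed programmatically. The sketch below decodes a small, made-up sample; the field names (containers, id, state, metadata.name) are assumed to follow the CRI ListContainers response that crictl prints, and only the fields used here are modelled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// container models a subset of one entry in `crictl ps -o json` output.
// The field names are an assumption based on the CRI API.
type container struct {
	ID       string `json:"id"`
	State    string `json:"state"`
	Metadata struct {
		Name string `json:"name"`
	} `json:"metadata"`
}

// psOutput is the assumed top-level JSON object.
type psOutput struct {
	Containers []container `json:"containers"`
}

// parsePS decodes the JSON document and returns the container list.
func parsePS(data []byte) ([]container, error) {
	var out psOutput
	if err := json.Unmarshal(data, &out); err != nil {
		return nil, err
	}
	return out.Containers, nil
}

func main() {
	// Illustrative sample only, not captured from a real node.
	sample := []byte(`{"containers":[{"id":"abc123","state":"CONTAINER_RUNNING","metadata":{"name":"etcd"}}]}`)
	containers, err := parsePS(sample)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, c := range containers {
		fmt.Printf("%s %s %s\n", c.ID, c.Metadata.Name, c.State)
	}
}
```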
Drops a debug pod onto a node and walks its unified cgroup-v2 hierarchy. By default it lists memory.current for every pod under /sys/fs/cgroup/kubepods.slice, but you can supply custom commands to inspect other files.
Cgroup files are the ground truth for how the Linux kernel enforces every pod’s CPU, memory, I/O and PIDs limits. Reading cpu.max, memory.max, io.stat, pids.max or pressure-stall metrics straight from /sys/fs/cgroup/kubepods.slice/... lets you verify that the values the kubelet intended actually reached the kernel; spot runaway memory or CPU throttling even when metrics-server is down; correlate CRI-O OOM-kills with mis-configured requests; and confirm that topology-aware features like CPU Manager wrote the right cpuset.cpus mask.
Arguments:
- node_name (string, required) – node whose cgroupfs should be inspected
- commands (array of string) – optional shell commands to run inside the debug pod
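
To make the cpu.max check concrete: in cgroup v2 the file holds two fields, a quota and a period in microseconds, with the literal max meaning no limit. The sketch below converts that content into a CPU limit in cores; parseCPUMax is a hypothetical helper and is not part of this repository.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCPUMax interprets the content of a cgroup v2 cpu.max file, which has
// the form "<quota> <period>" (microseconds) or "max <period>" when unlimited.
// It returns the limit in cores and whether a limit is set at all.
func parseCPUMax(content string) (float64, bool, error) {
	fields := strings.Fields(content)
	if len(fields) != 2 {
		return 0, false, fmt.Errorf("unexpected cpu.max content %q", content)
	}
	period, err := strconv.ParseFloat(fields[1], 64)
	if err != nil || period <= 0 {
		return 0, false, fmt.Errorf("invalid period %q", fields[1])
	}
	if fields[0] == "max" {
		return 0, false, nil // no CPU limit enforced
	}
	quota, err := strconv.ParseFloat(fields[0], 64)
	if err != nil {
		return 0, false, fmt.Errorf("invalid quota %q", fields[0])
	}
	return quota / period, true, nil
}

func main() {
	for _, content := range []string{"200000 100000\n", "max 100000\n"} {
		cores, limited, err := parseCPUMax(content)
		fmt.Println(cores, limited, err)
	}
}
```

Comparing the computed value with the pod's CPU limit is one way to verify that what the kubelet intended actually reached the kernel.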
Runs the gather_network_logs must-gather addon to capture iptables and OVN flows along with CNI pod logs.
Arguments:
- dest_dir (string) – directory where the network logs are stored
Collects 30-second CPU and heap profiles from kubelet and CRI-O using the gather_profiling_node script.
Arguments:
- dest_dir (string) – directory where the profiling output is written
Fetches recent Kubernetes events from all namespaces via oc get events -A.
Runs oc adm top nodes to gather CPU and memory usage for each node.
Executes a PromQL query against the cluster Prometheus service via oc get --raw.
Arguments:
- query (string, required) – the PromQL expression to run
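
PromQL expressions usually contain braces, quotes, and brackets, so the query has to be URL-encoded before it is placed in a raw API path. The sketch below shows one way to do that. The service-proxy prefix in prometheusProxy is an assumption about where the cluster Prometheus is exposed and may differ from what this tool uses; /api/v1/query with a query parameter is the standard Prometheus instant-query endpoint.

```go
package main

import (
	"fmt"
	"net/url"
)

// prometheusProxy is an assumed API-server proxy path to the cluster
// Prometheus service; adjust it to match the actual cluster setup.
const prometheusProxy = "/api/v1/namespaces/openshift-monitoring/services/https:prometheus-k8s:9091/proxy"

// queryPath is a hypothetical helper that builds the raw path for an instant
// query, escaping the PromQL expression so it is safe inside a URL.
func queryPath(query string) string {
	return prometheusProxy + "/api/v1/query?query=" + url.QueryEscape(query)
}

func main() {
	// The printed path could be passed to `oc get --raw`.
	fmt.Println(queryPath(`up{job="kubelet"}`))
}
```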
Retrieves logs from a specific pod similar to oc logs.
Arguments:
- namespace (string, required) – namespace of the pod
- pod_name (string, required) – pod to read logs from
- container (string) – optional container within the pod
- since (string) – optional duration (e.g. 5m) to limit logs
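
A minimal sketch of how these arguments could be turned into an oc logs command line, assuming the optional values map to the -c and --since flags. The helper podLogsArgs is hypothetical; it validates since with time.ParseDuration so that a malformed value fails before any command is run.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// podLogsArgs is a hypothetical helper that builds arguments for `oc logs`.
// Optional values are only added when non-empty.
func podLogsArgs(namespace, podName, container, since string) ([]string, error) {
	if namespace == "" || podName == "" {
		return nil, fmt.Errorf("namespace and pod_name are required")
	}
	args := []string{"logs", "-n", namespace, podName}
	if container != "" {
		args = append(args, "-c", container)
	}
	if since != "" {
		// Reject values such as "yesterday" early; "5m" or "1h30m" pass.
		if _, err := time.ParseDuration(since); err != nil {
			return nil, fmt.Errorf("invalid since value %q: %w", since, err)
		}
		args = append(args, "--since="+since)
	}
	return args, nil
}

func main() {
	args, err := podLogsArgs("openshift-etcd", "etcd-0", "etcd", "5m")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("oc", strings.Join(args, " "))
}
```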
Uses oc debug to print kubelet and CRI-O configuration files from the node.
Arguments:
- node_name (string, required) – node to inspect
Queries the Red Hat Knowledge Base using the Case Management API.
Arguments:
- query (string, required) – search keywords
- rows (number) – number of results to return (default 20)
- offline_token (string, required) – offline access token for authentication
Retrieves CVE details from the Red Hat Security Data API.
Arguments:
- cve_id (string, required) – identifier such as CVE-2025-1234
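
Validating cve_id before making a request avoids sending malformed identifiers upstream. The sketch below checks the CVE-YYYY-NNNN shape and builds a lookup URL. The helper cveURL is hypothetical, and the URL pattern is the one published for the Red Hat Security Data API; whether this tool calls exactly that endpoint is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// cvePattern matches identifiers of the form CVE-<4-digit year>-<4+ digits>.
var cvePattern = regexp.MustCompile(`^CVE-\d{4}-\d{4,}$`)

// cveURL normalizes and validates the identifier, then builds the lookup URL.
// The endpoint used by the actual tool may differ from this assumed pattern.
func cveURL(id string) (string, error) {
	id = strings.ToUpper(strings.TrimSpace(id))
	if !cvePattern.MatchString(id) {
		return "", fmt.Errorf("invalid CVE identifier %q", id)
	}
	return "https://access.redhat.com/hydra/rest/securitydata/cve/" + id + ".json", nil
}

func main() {
	u, err := cveURL("cve-2025-1234")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(u)
}
```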