
Conversation

@chrismld

Summary

This PR transforms the EKS aperf script into a kubectl plugin and adds several enhancements to improve usability and automation.

Changes

1. Kubectl Plugin Support

  • Renamed eks-aperf.sh to kubectl-aperf to follow kubectl plugin naming conventions
  • Users can now install and run as: kubectl aperf [options]
  • Added installation instructions to README-EKS.md

2. Enhanced Script Features

Automatic Taint Handling

  • Automatically detects node taints using kubectl get node and jq
  • Dynamically generates tolerations in the pod spec
  • Enables aperf to run on tainted nodes without manual configuration
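The taint-to-toleration step can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name is an assumption, and it parses the compact JSON objects that `kubectl get node <name> -o jsonpath='{.spec.taints[*]}'` prints (the query the script uses).

```shell
# Convert one taint (as printed by the jsonpath query above) into a
# YAML toleration block. A taint with a value maps to operator
# "Equal"; a value-less taint maps to operator "Exists".
taint_to_toleration() {
  local taint="$1" key value effect
  key=$(echo "$taint" | sed -n 's/.*"key":"\([^"]*\)".*/\1/p')
  value=$(echo "$taint" | sed -n 's/.*"value":"\([^"]*\)".*/\1/p')
  effect=$(echo "$taint" | sed -n 's/.*"effect":"\([^"]*\)".*/\1/p')
  if [ -n "$value" ]; then
    printf -- '- key: "%s"\n  operator: "Equal"\n  value: "%s"\n  effect: "%s"\n' \
      "$key" "$value" "$effect"
  else
    printf -- '- key: "%s"\n  operator: "Exists"\n  effect: "%s"\n' "$key" "$effect"
  fi
}
```

The generated YAML is then appended to the tolerations list in the pod spec, so the aperf pod schedules onto the tainted node without any manual edits.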

Configurable Report Names

  • Added --report-name parameter (default: aperf_record)
  • Allows users to specify custom names for better organization
  • Example: --report-name="loadtest_run1"

Automatic Report Extraction and Browser Opening

  • Automatically extracts the tar.gz report after collection
  • Opens index.html in the default browser (configurable via --open-browser)
  • Cross-platform support for macOS (open) and Linux (xdg-open)
  • Default behavior: --open-browser=true
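The extract-and-open flow can be sketched like this (function names and the fallback message are assumptions; the macOS/Linux branching mirrors the snippet quoted later in this review):

```shell
# Unpack the collected archive and locate the report entry point.
extract_report() {
  local archive="$1" outdir="$2"
  mkdir -p "$outdir"
  tar -xzf "$archive" -C "$outdir"
  find "$outdir" -name index.html | head -n 1
}

# Open the report: "open" on macOS, "xdg-open" on Linux, otherwise
# just print the path for the user.
open_in_browser() {
  if [[ "$OSTYPE" == "darwin"* ]]; then
    open "$1"
  elif [[ "$OSTYPE" == "linux-gnu"* ]] && command -v xdg-open >/dev/null; then
    xdg-open "$1"
  else
    echo "Open $1 in your browser"
  fi
}
```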

Improved Error Handling

  • Added graceful handling for missing metrics-server
  • Shows helpful messages when kubectl top is unavailable
  • Better error messages with debug output
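The metrics-server fallback amounts to probing `kubectl top` first and degrading to a note instead of a raw error. A sketch (the wrapper function is an assumption; the note text matches the snippet quoted later in this review):

```shell
# Show per-pod resource usage if metrics-server is answering;
# otherwise print a friendly note instead of a raw error.
show_top_or_note() {
  if kubectl top pods --all-namespaces >/dev/null 2>&1; then
    kubectl top pods --all-namespaces
  else
    echo "Note: kubectl top not available (metrics-server may not be installed)"
  fi
}
```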

3. Documentation Updates

  • Updated all script references from eks-aperf.sh to kubectl-aperf
  • Added kubectl plugin installation section
  • Added Krew plugin manager documentation (for future distribution)
  • Updated usage examples throughout README-EKS.md

New Parameters

Parameter        Type     Default        Description
--report-name    string   aperf_record   Custom name for the aperf record/report
--open-browser   boolean  true           Automatically open the report in the browser
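The two new flags could slot into the script's existing `--flag) dest="VAR";;` argument parser along these lines (a sketch only; the actual parser in the script may differ in detail):

```shell
REPORT_NAME="aperf_record"   # default per the table above
OPEN_BROWSER="true"          # default per the table above

parse_args() {
  local dest=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --report-name=*)  REPORT_NAME="${1#*=}";;
      --open-browser=*) OPEN_BROWSER="${1#*=}";;
      --report-name)    dest="REPORT_NAME";;
      --open-browser)   dest="OPEN_BROWSER";;
      *) if [ -n "$dest" ]; then printf -v "$dest" '%s' "$1"; dest=""; fi;;
    esac
    shift
  done
}
```

Both `--flag value` and `--flag=value` forms are handled, which keeps backward compatibility with the existing two-token style.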

Testing

Tested on:

✅ macOS (Darwin) with tainted EKS nodes
✅ Automatic browser opening on macOS
✅ Custom report names
✅ Nodes with multiple taints

Breaking Changes

None. The script maintains backward compatibility with all existing parameters.

Future Work

  • Submit to official Krew index for wider distribution
  • Add support for multiple nodes in a single run
  • Add report comparison features

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@chrismld chrismld requested a review from a team as a code owner October 15, 2025 15:59
--namespace) dest="NAMESPACE";;
--aperf_options) dest="APERF_OPTIONS";;
--aperf_image) dest="APERF_IMAGE";;
--cpu-request) dest="CPU_REQUEST";;
Contributor

Any reason why you removed these options?

Author

Because aperf is going to run as a short-lived pod, we don't need the node to guarantee the resources it requests; aperf won't be restricted by that and will still be scheduled onto the node I want. The default values were requesting too many resources anyway; isn't aperf only consuming around ~5% of CPU?

Contributor

Let's change the default CPU_REQUEST and MEMORY_REQUEST values so we don't need to remove the feature.
Do we have better values to set for them?

Author

Sorry to disagree here. When using Karpenter, the nodes you get are sized for your application, so you usually don't have much room left to schedule another pod on them; Karpenter's purpose is to launch nodes with as little waste as possible. Even if we lower the requests, users will hit problems when the node launched is so tightly packed that there's no room guaranteed for another pod, and the kube-scheduler will fail to schedule it. With managed node groups instead of Karpenter, the pod might or might not fit on the node. This is about requested resources, not utilization; that's why I removed this setup. In this case you're treating the aperf pod like any other process on the node, as you would on a server without containers. Once scheduled, the aperf pod will use the resources it needs.

We need to make sure the aperf pod can be scheduled on the node; we don't really need to guarantee resources for it. In a Karpenter setup, if we set resource requests here, we'd have to re-schedule the app being profiled, with affinities to the aperf pod, to make sure both pods land on the same node.

Contributor

I see, but some customers may be worried about the impact APerf can have on a node in a production environment.
To avoid removing this feature, can we consider not setting any request/limit values for CPU and memory by default, and adding them only when the customer explicitly passes these options on the command line?

Author

Sounds good to me. I've pushed a new change to make these values optional.
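The agreed behavior can be sketched as a helper that emits a resources: block only when the user passed the flags (function name and YAML indentation are assumptions, not the script's actual code):

```shell
# Emit a pod-spec resources section only if the user set requests;
# with no flags passed, the pod spec carries no requests at all.
resources_yaml() {
  local cpu="$1" mem="$2"
  if [ -z "$cpu" ] && [ -z "$mem" ]; then
    return 0
  fi
  echo "    resources:"
  echo "      requests:"
  [ -n "$cpu" ] && echo "        cpu: ${cpu}"
  [ -n "$mem" ] && echo "        memory: ${mem}"
  return 0
}
```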


# Get node taints and generate tolerations
echo -e "${BOLD}Checking node taints...${NC}"
TAINTS=$(kubectl get node ${NODE_NAME} -o jsonpath='{.spec.taints[*]}' 2>/dev/null)
Contributor

@salvatoredipietro Oct 27, 2025

We probably need to check that the node exists before checking its taints; otherwise, if the user enters a wrong node name, the script fails without a clear reason. What do you think?
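The suggested pre-check could look like this (a sketch; the function name and error wording are assumptions):

```shell
# Fail fast with a clear message if the node doesn't exist,
# instead of silently getting empty taint output later.
check_node_exists() {
  if ! kubectl get node "$1" >/dev/null 2>&1; then
    echo "Error: node '$1' not found in the current cluster" >&2
    return 1
  fi
}
```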

if [[ "$OSTYPE" == "darwin"* ]]; then
# macOS
open "$INDEX_FILE"
elif [[ "$OSTYPE" == "linux-gnu"* ]]; then
Contributor

Windows support?
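One possible answer to this, treating Git Bash/Cygwin detection via $OSTYPE and the `start` command as assumptions rather than tested behavior:

```shell
# Opens the file and returns 0 on a Windows shell, returns 1 otherwise
# so the caller can fall through to the macOS/Linux branches.
open_on_windows() {
  if [[ "$OSTYPE" == "msys"* || "$OSTYPE" == "cygwin"* ]]; then
    start "" "$1"   # "start" opens the default browser under Git Bash
    return 0
  fi
  return 1
}
```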

kubectl-aperf Outdated
POD_NAME="aperf-pod-${NODE_NAME//[.]/-}"

# Get node taints and generate tolerations
echo -e "${BOLD}Checking node taints...${NC}"
Contributor

This output wraps onto a new line; can we have the result on a single line?

bash ./kubectl-aperf --aperf_image="${APERF_ECRREPO}:latest"  --node="i-087ae37512508cca5"
Checking node taints...
  No taints found on node
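One way to get the requested single-line result is to print the status and the outcome in the same echo (a sketch; BOLD/NC are the script's existing color variables and TAINTS comes from the kubectl query above):

```shell
# Status and result on one line instead of two.
report_taints() {
  if [ -z "$TAINTS" ]; then
    echo -e "${BOLD}Checking node taints...${NC} no taints found on node"
  else
    echo -e "${BOLD}Checking node taints...${NC} generating tolerations"
  fi
}
```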

kubectl-aperf Outdated
grep "$(kubectl get pods --all-namespaces --field-selector spec.nodeName=${NODE_NAME} -o jsonpath='{range .items[*]}{.metadata.name}{" "}{end}' | sed 's/[[:space:]]*$//' | sed 's/[[:space:]]/\\|/g')" /tmp/allpods.out --color=never || echo " No pods found on this node"
rm /tmp/allpods.out 2>/dev/null || true
else
echo " ${YELLOW}Note: kubectl top not available (metrics-server may not be installed)${NC}"
Contributor

The formatting is wrong here; also, can we have it on a single line?

Resource usage for pods on i-087ae375125081ba5:
  \033[0;33mNote: kubectl top not available (metrics-server may not be installed)\033[0m

Author

I removed the new line. Also, I'm not sure why you get the raw color codes; those variables were already defined in the script (lines 19 to 25) and I'm just using them.
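If the raw `\033[0;33m` codes ever show up literally again, one likely cause is an `echo` running without `-e` somewhere in the path (a guess from the symptom, not a confirmed diagnosis). A portable alternative is `printf '%b'`, which expands backslash escapes regardless of the echo implementation:

```shell
# printf '%b' expands escape sequences in its argument, so the
# color variables render even where "echo -e" behaves differently.
note() {
  printf '%b\n' "  ${YELLOW}Note: kubectl top not available (metrics-server may not be installed)${NC}"
}
```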

Author

@chrismld left a comment

@salvatoredipietro thanks for reviewing! I've made a few changes to address your feedback; let me know your thoughts.


