@wussh wussh commented Jun 18, 2025

Description

This pull request adds a new benchmark task to k8s-bench that tests kubectl-ai's ability to create a Kubernetes CronJob resource. This expands test coverage to include more Kubernetes resource types.

Changes

  • Created a new benchmark task in k8s-bench/tasks/create-cronjob/
  • Implemented task.yaml with a prompt to create a CronJob that runs at midnight
  • Added setup.sh, cleanup.sh, and verify.sh scripts for the benchmark
  • Designed verify.sh to be robust and handle different valid implementations
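
As an illustration of what "handle different valid implementations" can mean for the schedule check, here is a minimal sketch, not the PR's actual verify.sh. The helper name `is_midnight_schedule` is hypothetical. `0 0 * * *` and `@daily` are the two forms the PR's script says it accepts; `@midnight` is added here on the assumption that the check should also allow it, since Kubernetes documents it as an alias of `@daily`.

```shell
#!/usr/bin/env bash
# Sketch only: is_midnight_schedule is a hypothetical helper, not code from the PR.

# Return 0 when the given schedule string means "every day at 00:00".
is_midnight_schedule() {
  local schedule
  # Collapse runs of whitespace so "0  0 * * *" still matches.
  schedule=$(printf '%s' "$1" | tr -s '[:space:]' ' ')
  # Trim a single leading and trailing space left over after squeezing.
  schedule="${schedule# }"
  schedule="${schedule% }"
  case "$schedule" in
    # @midnight is an assumption beyond the two forms named in the PR.
    "0 0 * * *" | "@daily" | "@midnight") return 0 ;;
    *) return 1 ;;
  esac
}
```

A verification script could then call `is_midnight_schedule "$SCHEDULE" || exit 1` after reading the schedule from the cluster.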

Benefits

  • Tests kubectl-ai's ability to work with CronJob resources
  • Validates understanding of cron schedule syntax
  • Adds a medium-difficulty benchmark to the existing suite
  • Follows the best practices outlined in the k8s-bench contribution guide

Testing

The benchmark has been prepared according to the format of other benchmark tasks and includes comprehensive verification logic.

Checklist

  • Task follows the structure of existing benchmark tasks
  • Task includes proper setup, cleanup, and verification scripts
  • Verification script handles multiple valid implementation approaches
  • Scripts are executable

@mikebz mikebz requested a review from noahlwest July 30, 2025 23:10
@@ -0,0 +1,3 @@
#!/usr/bin/env bash
# Create the namespace for the test
kubectl create namespace create-cronjob-test
Collaborator

Can we rename the namespace to just create-cronjob? Including words such as test or eval in the name can affect how the model handles the request.


# Verify schedule is set to midnight
# Accept either 0 0 * * * or @daily
SCHEDULE=$(kubectl get cronjob data-backup -n create-cronjob-test -o jsonpath='{.spec.schedule}')
Collaborator

Minor nitpick: would it be reasonable to make a single kubectl get call, save the output, and run the verification checks against that? The intent you have here is clear, so if that optimization would make things less readable or too tricky, feel free to disregard this comment.
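
One way the single-call idea could look is sketched below, as an illustration rather than the PR's code. The function names `fetch_cronjob_fields` and `verify_cronjob` are hypothetical, and checking the container image is an assumption (the full list of checks in verify.sh is not shown in this thread). A single `kubectl get` with a multi-field jsonpath template returns one field per line, and the checks then run against the saved output.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the image check are assumptions, not from the PR.

# Fetch several fields with one API call, printing one field per line.
fetch_cronjob_fields() {
  kubectl get cronjob "$1" -n "$2" \
    -o jsonpath='{.spec.schedule}{"\n"}{.spec.jobTemplate.spec.template.spec.containers[0].image}{"\n"}'
}

# Run all checks against the saved output instead of calling kubectl again.
verify_cronjob() {
  local output schedule image
  output=$(fetch_cronjob_fields "$1" "$2") || return 1
  schedule=$(printf '%s\n' "$output" | sed -n '1p')
  image=$(printf '%s\n' "$output" | sed -n '2p')

  # Accept either 0 0 * * * or @daily, as in the PR's verify.sh comment.
  case "$schedule" in
    "0 0 * * *" | "@daily") ;;
    *) echo "unexpected schedule: $schedule" >&2; return 1 ;;
  esac

  # Assumed extra check: the job template must name a container image.
  if [ -z "$image" ]; then
    echo "no container image set" >&2
    return 1
  fi
}
```

With the names used in the diff, the call would be `verify_cronjob data-backup create-cronjob-test`. The trade-off the reviewer mentions still applies: the jsonpath template is denser than separate `kubectl get` calls, so this is only worth it if the script stays readable.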

@droot (Member) commented Nov 12, 2025

Since we are moving k8s-bench to its own repo, closing this out.

@droot droot closed this Nov 12, 2025