Add a script to plot benchmark results #314

bernhardmgruber · 2026-02-05T09:40:31Z

This script is 100% vibe-coded. I only looked at the result, since I don't understand Python. Here is a generated summary of what this PR proposes:

Add nvbench_plot.py to plot horizontal bar charts from NVBench JSON results using %SOL (global BW utilization).
Support filtering by axis values (-a with optional [pow2]) and benchmark name (-b).
Improve plot presentation with per-benchmark colors, tighter vertical spacing, and a sensible default title.
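
To illustrate the `-a` handling summarized above, here is a minimal sketch of how an argument such as `Elements{io}[pow2]=28` could be split into an axis name and a value. It is not taken from nvbench_plot.py: the helper name `parse_axis_filter` and the choice to return the value as a string are assumptions made for this example only.

```python
import re

# Hypothetical helper for illustration; not the code in nvbench_plot.py.
# Accepts NAME=VALUE or NAME[pow2]=VALUE, with the [pow2] marker before the '='.
_AXIS_RE = re.compile(r"^(?P<name>.+?)(?P<pow2>\[pow2\])?=(?P<value>.+)$")


def parse_axis_filter(arg):
    """Split an -a argument into (axis_name, value_string)."""
    match = _AXIS_RE.match(arg)
    if match is None:
        raise ValueError(f"expected NAME=VALUE, got {arg!r}")
    name = match.group("name")
    value = match.group("value")
    if match.group("pow2"):
        # [pow2] means the given number is an exponent: 28 stands for 2^28.
        value = str(2 ** int(value))
    return name, value


print(parse_axis_filter("Elements{io}[pow2]=28"))  # ('Elements{io}', '268435456')
print(parse_axis_filter("T{ct}=F32"))              # ('T{ct}', 'F32')
```

Returning a string keeps the later comparison against the JSON simple, assuming axis values are compared textually; an integer comparison would work just as well if the values are stored as numbers.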

Example filtering by axis and benchmark name:

python ./python/scripts/nvbench_plot.py -b nstream -b triad -a Elements{io}[pow2]=28 ./babelstream_ublkcp_B200.json

Example with two files, filtering by axis, in dark mode:

python ./python/scripts/nvbench_plot.py -a Elements{io}[pow2]=28 -a T{ct}=F32 ./babelstream_ublkcp_B200.json ./pytorch_ublkcpy_B200.json --dark
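
For readers unfamiliar with how repeatable flags like these are usually declared, the following is a rough sketch of a command line interface that would accept the two invocations above. Only `-a`, `-b`, `--dark` and the positional file arguments come from this PR; the function name `build_parser`, the destination names and the help texts are assumptions, and the real script may be organized differently.

```python
import argparse


def build_parser():
    """Hypothetical CLI definition mirroring the flags used in the examples."""
    parser = argparse.ArgumentParser(
        description="Plot bandwidth utilization from NVBench JSON results"
    )
    # One or more JSON result files.
    parser.add_argument("files", nargs="+", help="NVBench JSON result files")
    # action="append" lets -a and -b be given several times.
    parser.add_argument("-a", dest="axes", action="append", default=[],
                        metavar="NAME=VALUE", help="keep only states with this axis value")
    parser.add_argument("-b", dest="benchmarks", action="append", default=[],
                        metavar="NAME", help="keep only benchmarks with this name")
    parser.add_argument("--dark", action="store_true", help="use a dark background")
    return parser


args = build_parser().parse_args(
    ["-a", "Elements{io}[pow2]=28", "-a", "T{ct}=F32",
     "./babelstream_ublkcp_B200.json", "./pytorch_ublkcpy_B200.json", "--dark"]
)
print(args.axes)   # ['Elements{io}[pow2]=28', 'T{ct}=F32']
print(args.files)  # ['./babelstream_ublkcp_B200.json', './pytorch_ublkcpy_B200.json']
print(args.dark)   # True
```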

Here are the input files for the plots:
pytorch_ublkcpy_B200.json
babelstream_ublkcp_B200.json

Here are the prompts that I used:

1. “Please study the scripts in the directory python/scripts. Then look at the benchmark result babelstream_ublkcp_B200.json. Create a new script nvbench_plot.py in the above directory that can read the benchmark result and plot a horizontal bar chart, where each benchmark state is plotted as a separate box. The box length is taken from nv/cold/bw/global/utilization”
2. “If I run the script like python ./python/scripts/nvbench_plot.py ./babelstream_ublkcp_B200.json it errors, please fix the error”
3. “Let's add a command line parameter -a where I can specify an axis name and value like -a T{ct}=I8. When specified only benchmark states are shown where the axis_value node in the JSON has an axis named T{ct} with a value of I8”
4. “Please add special handling to the -a argument handling where when I enter an like -a Elements{io}[pow2]=28 the [pow2] is parsed before the = and it converts the value after the = (28 in this case) to 2 to the power of that value (2^28 in this case)”
5. “Let's add another command line parameter -b where i can select the benchmark name”
6. “Make the bar color #76b900”
7. “Please do not plot the file name next to the bar charts”
8. “Please include the device name in the bar labels”
9. “Let's remove the device name again”
10. “Please remove the Device=N part from the bar label”
11. “Can you use different colors for the bars per benchmark.”
12. “Add a plot tile "%SOL Bandwidth". If the device of all benchmarks is the same, put the device name in the benchmark title like "%SOL Bandwidth - B200"”
13. “I get the warning

The get_cmap function was deprecated in Matplotlib 3.7 and will be removed in 3.11. Use matplotlib.colormaps[name] or matplotlib.colormaps.get_cmap() or pyplot.get_cmap() instead.

Please fix it”
14. “Please reduce the empty space above the first bar and below the last bar”
15. “Please write a pull request description about what this new script can do”
16. "Add --dark mode with black background and white text."
17. "Ensure title is visible in dark mode."
18. "Remove the x‑axis description label."
19. "If -a is used, remove those axis tokens from labels and add them to the title."
20. “Please read through the nvbench code here and figure out whether it supports -a paramters where we can also specify multiple values”
21. “Please implement this”
22. “If multiple values are specified for an axis, retain the axis value in the box label”
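
The filtering, labeling and title rules requested across these prompts can be sketched roughly as follows. This is an illustration under stated assumptions, not the generated script: each state is modeled as a plain dict with hypothetical keys `benchmark`, `device` and `axis_values` rather than the actual NVBench JSON layout, the function names are made up, and the way pinned axis values are appended to the title in parentheses is a guess, since the prompts do not specify a format.

```python
def state_matches(state, filters):
    """filters maps an axis name to the set of accepted value strings."""
    axes = {a["name"]: str(a["value"]) for a in state["axis_values"]}
    return all(axes.get(name) in accepted for name, accepted in filters.items())


def make_label(state, filters):
    """Bar label: drop axes pinned to one value, keep multi-valued ones."""
    parts = [state["benchmark"]]
    for axis in state["axis_values"]:
        accepted = filters.get(axis["name"])
        if accepted is not None and len(accepted) == 1:
            continue  # this axis is the same for every bar, so it moves to the title
        parts.append(f'{axis["name"]}={axis["value"]}')
    return " ".join(parts)


def make_title(states, filters):
    """'%SOL Bandwidth', plus the device name if all states share one device."""
    title = "%SOL Bandwidth"
    devices = {s["device"] for s in states}
    if len(devices) == 1:
        title += f" - {next(iter(devices))}"
    pinned = [f"{n}={next(iter(v))}" for n, v in filters.items() if len(v) == 1]
    if pinned:
        title += " (" + ", ".join(pinned) + ")"  # assumed format
    return title


# Made-up states for demonstration only.
states = [
    {"benchmark": "triad", "device": "B200",
     "axis_values": [{"name": "T{ct}", "value": "F32"},
                     {"name": "Elements{io}", "value": "268435456"}]},
    {"benchmark": "triad", "device": "B200",
     "axis_values": [{"name": "T{ct}", "value": "F64"},
                     {"name": "Elements{io}", "value": "268435456"}]},
    {"benchmark": "triad", "device": "B200",
     "axis_values": [{"name": "T{ct}", "value": "I8"},
                     {"name": "Elements{io}", "value": "268435456"}]},
]
filters = {"Elements{io}": {"268435456"}, "T{ct}": {"F32", "F64"}}

selected = [s for s in states if state_matches(s, filters)]
print([make_label(s, filters) for s in selected])  # ['triad T{ct}=F32', 'triad T{ct}=F64']
print(make_title(selected, filters))  # %SOL Bandwidth - B200 (Elements{io}=268435456)
```

Because `T{ct}` is given two accepted values, it stays in the bar labels, while the single-valued `Elements{io}` filter is moved into the title, matching the behavior asked for in prompts 19 and 22.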

bernhardmgruber added 3 commits February 5, 2026 10:36

Add a script to plot benchmark results

0be190b

More

ccde9fc

I have no idea what I am doing

ec97590

bernhardmgruber mentioned this pull request Feb 5, 2026

Extend nvbench_compare.py with --plot, axis/benchmark filtering, and dark mode #315

Draft

bernhardmgruber added 2 commits February 5, 2026 14:00

Implement dark mode using style sheets

28ed32b

Feedback from review

d3a0bec
