- In general, run everything from the main folder, e.g. python3 utils/plot_script/foo.py
- Download mnist.csv and place it as MNIST/mnist.csv (e.g. from https://www.kaggle.com/datasets/oddrationale/mnist-in-csv/data)
- Install PAPI (papi7.1.0) to measure hardware counters such as the number of flops, the number of memory movements, ...
- Install STREAM to measure memory bandwidth
- Modify the PCA dimension used on MNIST by editing utils/process_data/pca_mnist.py (OPTIONAL, default = 64)
- Generate the MNIST dataset with PCA-reduced features by running python3 utils/process_data/pca_mnist.py
- Compile and run STREAM to measure the bandwidth Beta needed for the roofline plot: gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=100000000 stream.c -o stream && ./stream
- Edit main.cpp to change the benchmarks (OPTIONAL) and, accordingly, update the data of YOUR machine in implementations/constants.h and utils/plot_script/plot_data.py with information on the system and on the run (e.g. measured bandwidth, cache sizes, ...)
- Prepare the system by fixing the frequency of the cores (single), disabling Turbo Boost Technology or similar, disabling most background processes, and ensuring that enough warmup runs and iterations are used.
- Run a benchmark by first compiling with g++ main.cpp -o main -lpapi -std=c++20 -O3 -march=native (or, for a local installation of PAPI: g++ main.cpp -o main -I/usr/local/papi/include -L/usr/local/papi/lib -lpapi -std=c++20 -O3 -march=native)
- Run ./main and wait for the program to finish
- Visualize the result: python3 utils/plot_script/tsne_plot.py
- Plot a simple benchmark with the theoretical limit: python3 utils/plot_script/performance_plot.py
- Plot a roofline plot of the run: python3 utils/plot_script/roofline_plot.py
- Plot a speedup plot of the run: python3 utils/plot_script/speedup_plot.py
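
As a rough illustration of how the measured bandwidth Beta enters the roofline plot, here is a minimal sketch of the standard roofline bound, min(peak, Beta * intensity). It is not taken from roofline_plot.py; the function names and all numeric values below are hypothetical placeholders, and it assumes STREAM's reported MB/s uses 1 MB = 10^6 bytes.

```python
def stream_to_bytes_per_cycle(stream_mb_per_s, core_freq_hz):
    """Convert a STREAM result (MB/s, assuming 1 MB = 1e6 bytes) to bytes/cycle."""
    return stream_mb_per_s * 1e6 / core_freq_hz


def attainable_performance(intensity, peak, beta):
    """Roofline bound in flops/cycle: min(peak, beta * intensity).

    intensity: operational intensity in flops/byte
    peak:      peak compute performance in flops/cycle
    beta:      memory bandwidth in bytes/cycle
    """
    return min(peak, beta * intensity)


def ridge_point(peak, beta):
    """Intensity (flops/byte) at which the bound switches from memory to compute."""
    return peak / beta


if __name__ == "__main__":
    # Hypothetical machine data; replace with the values measured on YOUR machine.
    peak = 16.0                                      # flops/cycle (assumed)
    beta = stream_to_bytes_per_cycle(20000, 2.5e9)   # assumed 20000 MB/s at a fixed 2.5 GHz

    print(f"beta = {beta:.2f} bytes/cycle, ridge = {ridge_point(peak, beta):.2f} flops/byte")
    for intensity in (0.5, 2.0, 8.0):
        bound = attainable_performance(intensity, peak, beta)
        print(f"I = {intensity:4.1f} flops/byte -> bound = {bound:5.2f} flops/cycle")
```

Converting Beta to bytes/cycle only makes sense if the core frequency is fixed as described in the preparation step; otherwise the cycle count is not comparable across runs.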