Skip to content

sitar-lab/NeuSight

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NeuSight

NeuSight is a framework designed to predict the performance of deep learning training and inference on various GPUs. For more details, please refer to our paper, Forecasting GPU Performance for Deep Learning Training and Inference.

Installation

To install NeuSight as a Python package, run:

git clone https://github.com/sitar-lab/NeuSight.git
cd NeuSight
pip install -e .
  • NeuSight was tested on Python 3.9 and PyTorch version 2.1.0.

Quick Start

We provide two Python scripts for predicting the latency of deep learning execution and training the predictor:

  • scripts/pred.py for making predictions
  • scripts/train.py for training the predictor

The scripts use GPU and deep learning description files in JSON format, and execution hyperparameters such as batch size. See the scripts in scripts/example for examples of how to use these scripts.

Tool Inputs

NeuSight requires two input files to run: a device configuration file and a deep learning model configuration file.

Device Configuration File

The device configuration file specifies the architectural parameters of the prediction target GPU. Example configuration files can be found in data/device_configs. The configuration file includes

  • Device : User-specified name of the device
  • Dev_Mem : Global memory in GB
  • Mem_Bw : Memory bandwidth of global memory in GB/s
  • Num_Sm : Number of SMs
  • Core_Per_SM : Number of CUDA cores per SM
  • Freq : Compute frequency in GHz
  • SingleFLOPs : Peak FP32 performance in GFLOPS/s
  • L2Cache : Size of L2 Cache in MB

Deep Learning Model Configuration File

The deep learning model configuration file specifies the architectural parameters of the target deep learning model. Example configuration files can be found in data/DLmodel_configs. Configuration file are specified in Hugging Face model description format.

Code Structure

/ : NEUSIGHT_ROOT
|-- neusight                     : Source file directory for NeuSight
|   |-- Dataset                  : For collecting and processing dataset for training
|   |-- Model                    : For machine learning based predictor
|   |-- Opgraph                  : For manipulating operator graphs
|   |-- Prediction               : For NeuSight predictor
|   |-- Tracing                  : For tracing ML model graphs
|-- scripts                      : Main scripts for running NeuSight
|   |-- asplos                   : Training dataset and scripts used for ASPLOS 2025 paper
|   |   |-- data                 : Input files used for NeuSight
|   |   |   |-- dataset          : Datasets used for training and tile table (NVIDIA)
|   |   |   |-- dataset_amd      : Datasets used for training and tile table (AMD)
|   |   |   |-- device_configs   : GPU description files
|   |   |   |-- DLmodel_configs  : DL model description files
|   |   |   |-- predictor        : Model configuration and trained parameters for ML predictor
|   |   |-- label                : Measured latencies for ML models evaluated
|   |   |-- results              : Results of NeuSight prediction
|   |   |-- summary              : Summary of NeuSight prediction
|   |-- example                  : Example scripts for running NeuSight

Citation

If you use NeuSight in your research, please cite our paper:

@inproceedings{neusight,
  author = {Lee, Seonho and Phanishayee, Amar and Mahajan, Divya},
  title = {Forecasting GPU Performance for Deep Learning Training and Inference},
  year = {2025},
  isbn = {9798400706981},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3669940.3707265},
  doi = {10.1145/3669940.3707265},
  booktitle = {Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1},
  pages = {493–508},
  numpages = {16},
  keywords = {deep learning, gpu performance forecasting, ml for systems, training and inference},
  location = {Rotterdam, Netherlands},
  series = {ASPLOS '25}
}

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •