10 changes: 2 additions & 8 deletions docs/source/conf.py
@@ -60,9 +60,8 @@
'sphinx.ext.doctest',
'sphinx.ext.mathjax',
'sphinx.ext.viewcode',
#'sphinx.ext.autosectionlabel',
'sphinx_copybutton',
'myst_parser'
'myst_parser',
]

# Add any paths that contain templates here, relative to this directory.
@@ -87,18 +86,14 @@
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_logo = '_static/SevenNet_logo.png'
html_context = {
# ...
'default_mode': 'light'
}
html_context = {'default_mode': 'light'}

html_theme_options = {
'use_edit_page_button': False,
'header_links_before_dropdown': 3,
'navbar_end': ['navbar-icon-links'],
'logo': {
'text': ' Documentation',
#"image_light": "_static/SevenNet_logo.png",
},
'icon_links': [
{
@@ -109,7 +104,6 @@
},
],
'show_nav_level': 4,
#"primary_sidebar_end": ["indices.html", "sidebar-ethical-ads.html"]
}

# -- Options for intersphinx extension ---------------------------------------
68 changes: 59 additions & 9 deletions docs/source/user_guide/accelerator.md
@@ -1,17 +1,29 @@
# Accelerators

CuEquivariance and flashTP provide acceleration for both SevenNet training and inference. For Benchmark results, follow [here](https://arxiv.org/abs/2510.11241)
This document describes the accelerator integrations available in SevenNet and how to install them.

## CuEquivariance
:::{caution}
We do not support CuEquivariance for [LAMMPS: Torch](./lammps_torch.md). You must use [LAMMPS: ML-IAP](./lammps_mliap.md) for CuEquivariance.
:::

[CuEquivariance](https://github.com/NVIDIA/cuEquivariance) and [FlashTP](https://openreview.net/forum?id=wiQe95BPaB) provide acceleration for both SevenNet training and inference. (For benchmark results, see Section 2.7 of the [SevenNet-Omni paper](https://arxiv.org/abs/2510.11241).)

:::{tip}
For small systems, FlashTP with [LAMMPS: Torch](./lammps_torch.md) shows a performance advantage over cuEquivariance with [LAMMPS: ML-IAP](./lammps_mliap.md).
A performance crossover occurs at around 10³ atoms, beyond which cuEquivariance becomes more efficient.

FlashTP with [LAMMPS: Torch](./lammps_torch.md) is generally faster than FlashTP with [LAMMPS: ML-IAP](./lammps_mliap.md).
:::

## [CuEquivariance](https://github.com/NVIDIA/cuEquivariance)

CuEquivariance is an NVIDIA Python library designed to facilitate the construction of high-performance geometric neural networks using segmented polynomials and triangular operations. For more information, refer to [cuEquivariance](https://github.com/NVIDIA/cuEquivariance).
CuEquivariance is an NVIDIA Python library designed to facilitate the construction of high-performance geometric neural networks using segmented polynomials and triangular operations. CuEquivariance accelerates SevenNet during training, as well as during inference with ASE and with LAMMPS via ML-IAP.

### Requirements
- Python >= 3.10
- cuEquivariance >= 0.6.1

Install via:

### Installation
```bash
pip install sevenn[cueq12] # cueq11 for CUDA version 11.*
```
@@ -22,20 +34,58 @@ causing `pynvml.NVMLError_NotSupported: Not Supported`.
Then try a lower cuEquivariance version, such as 0.6.1.
:::

## FlashTP

FlashTP is a high-performance Tensor-Product library for Machine Learning Interatomic Potentials (MLIPs). For more information and the installation guide, refer to [flashTP](https://github.com/SNU-ARC/flashTP).
If `pip install sevenn[cueq12]` fails to install the latest version of SevenNet, try installing the base package instead:
```bash
pip install sevenn
```

If this successfully installs the latest version, the issue is likely related to **cuEquivariance compatibility**.
You can verify this by installing cuEquivariance manually:
```bash
pip install cuequivariance-ops-torch-cu12
pip install cuequivariance-torch
```
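
After the manual install, a quick import check can confirm that the packages load; a minimal sketch (the module and distribution names mirror the pip commands above):

```python
# Verify that the cuEquivariance packages installed above import cleanly
# and report their versions; names mirror the pip commands.
import importlib.metadata

import cuequivariance_torch  # noqa: F401  (import check only)

for dist in ('cuequivariance-torch', 'cuequivariance-ops-torch-cu12'):
    print(dist, importlib.metadata.version(dist))
```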

For more details, see the [cuEquivariance documentation](https://github.com/NVIDIA/cuEquivariance).


## [FlashTP](https://github.com/SNU-ARC/flashTP)

FlashTP, presented in [FlashTP: Fused, Sparsity-Aware Tensor Product for Machine Learning Interatomic Potentials](https://openreview.net/forum?id=wiQe95BPaB), is a high-performance Tensor-Product library for Machine Learning Interatomic Potentials (MLIPs).

FlashTP accelerates SevenNet during both training and inference, achieving up to a ~4× speedup.

### Requirements
- Python >= 3.10
- flashTP >= 0.1.0
- CUDA toolkit >= 12.0

### Installation
Choose `CUDA_ARCH_LIST` to match your GPU(s) (see [compute capability](https://developer.nvidia.com/cuda/gpus)); if you are unsure of the value, the sketch after the commands below shows how to query it.

```bash
git clone https://github.com/SNU-ARC/flashTP.git
cd flashTP
pip install -r requirements.txt
CUDA_ARCH_LIST="80;90" pip install . --no-build-isolation
```
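
PyTorch can report the compute capability of each visible GPU, which maps directly onto a `CUDA_ARCH_LIST` entry (e.g., `(8, 0)` becomes `"80"`); a minimal sketch:

```python
# Print the compute capability of each visible GPU so you can build
# a matching CUDA_ARCH_LIST value, e.g. (8, 0) -> "80".
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f'GPU {i}: {torch.cuda.get_device_name(i)} -> arch {major}{minor}')
```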

:::{note}
During installation of flashTP,
During installation of FlashTP,
`subprocess.CalledProcessError: ninja ... exit status 137`
typically indicates **out-of-memory** during compilation.
typically indicates out-of-memory during compilation.
Try reducing the build parallelism:
```bash
export MAX_JOBS=1
```
:::

For more information, see [FlashTP](https://github.com/SNU-ARC/flashTP).

## Usage
After installation, you can leverage the accelerators with the appropriate flag (`--enable_cueq`) or options in the interfaces below (a usage sketch follows the list):

- [Training](./cli.md#sevenn-train)
- [ASE Calculator](./ase_calculator.md)
- [LAMMPS](./cli.md#sevenn-get-model)
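
For example, a minimal sketch with the ASE calculator; the `enable_cueq` keyword of `SevenNetCalculator` is an assumption here, so check the [ASE Calculator](./ase_calculator.md) page for the exact signature:

```python
# Sketch: cuEquivariance-accelerated inference through ASE.
# The enable_cueq keyword is an assumption; see the ASE Calculator page.
from ase.build import bulk

from sevenn.calculator import SevenNetCalculator

calc = SevenNetCalculator('7net-0', device='cuda', enable_cueq=True)
atoms = bulk('Si', 'diamond', a=5.43)
atoms.calc = calc
print(atoms.get_potential_energy())
```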
4 changes: 2 additions & 2 deletions docs/source/user_guide/cli.md
Original file line number Diff line number Diff line change
@@ -72,7 +72,7 @@ The command is for LAMMPS integration of SevenNet. It deploys a model into a LAM

See {doc}`lammps_torch` or {doc}`lammps_mliap` for installation.

### LAMMPS/Torch
### LAMMPS: PyTorch
The checkpoint can be deployed as LAMMPS potentials. The argument is either the path to the checkpoint or the name of a pretrained potential.

```bash
@@ -92,7 +92,7 @@ sevenn get_model {checkpoint path} -p
This will create a directory with several `deployed_parallel_*.pt` files. The directory path itself is an argument for the LAMMPS script. Please do not modify or remove files in the directory.
These models can be used as LAMMPS potentials to run parallel MD simulations with a GNN potential across multiple GPUs.

### LAMMPS/ML-IAP
### LAMMPS: ML-IAP
```bash
sevenn get_model 7net-0 --use_mliap # For pre-trained models
sevenn get_model {checkpoint path} --use_mliap # For user-trained models
2 changes: 1 addition & 1 deletion docs/source/user_guide/d3.md
@@ -4,7 +4,7 @@
We support a GPU-accelerated implementation of Grimme's D3 dispersion (van der Waals) correction using CUDA, which can be used with ASE and LAMMPS in conjunction with SevenNet. This follows the implementation of [Grimme's D3 method](https://doi.org/10.1063/1.3382344). We have ported the code from the [original Fortran code](https://www.chemie.uni-bonn.de/grimme/de/software/dft-d3). While the D3 method is significantly faster than DFT, existing CPU implementations were slower than SevenNet. To address this, we have adopted CUDA and single-precision (FP32) operations to accelerate the code. A minimal usage sketch follows.
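
Usage through ASE might look like the following minimal sketch; `SevenNetD3Calculator` and its import path are assumptions here, so check the calculator documentation for the exact API:

```python
# Hedged sketch: SevenNet inference with the D3 correction via ASE.
# SevenNetD3Calculator and its import path are assumptions; see the
# calculator documentation for the exact API.
from ase.build import bulk

from sevenn.calculator import SevenNetD3Calculator

calc = SevenNetD3Calculator('7net-0', device='cuda')
atoms = bulk('Si', 'diamond', a=5.43)
atoms.calc = calc
print(atoms.get_potential_energy())
```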

:::{caution}
Currently, the D3 implementaion does not support mulit-GPU or multi-core parallelism.
Currently, the D3 implementation does not support multi-GPU or multi-core parallelism.
:::
:::{caution}
The implementation requires a GPU with a [compute capability](https://developer.nvidia.com/cuda/gpus) of **at least 6.0**.