10 changes: 2 additions & 8 deletions docs/source/conf.py
@@ -60,9 +60,8 @@
'sphinx.ext.doctest',
'sphinx.ext.mathjax',
'sphinx.ext.viewcode',
#'sphinx.ext.autosectionlabel',
'sphinx_copybutton',
'myst_parser'
'myst_parser',
]

# Add any paths that contain templates here, relative to this directory.
@@ -87,18 +86,14 @@
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
html_logo = '_static/SevenNet_logo.png'
html_context = {
# ...
'default_mode': 'light'
}
html_context = {'default_mode': 'light'}

html_theme_options = {
'use_edit_page_button': False,
'header_links_before_dropdown': 3,
'navbar_end': ['navbar-icon-links'],
'logo': {
'text': ' Documentation',
#"image_light": "_static/SevenNet_logo.png",
},
'icon_links': [
{
@@ -109,7 +104,6 @@
},
],
'show_nav_level': 4,
#"primary_sidebar_end": ["indices.html", "sidebar-ethical-ads.html"]
}

# -- Options for intersphinx extension ---------------------------------------
68 changes: 59 additions & 9 deletions docs/source/user_guide/accelerator.md
@@ -1,17 +1,29 @@
# Accelerators

CuEquivariance and flashTP provide acceleration for both SevenNet training and inference. For Benchmark results, follow [here](https://arxiv.org/abs/2510.11241)
This document describes the accelerator integrations available in SevenNet and how to install them.

## CuEquivariance
:::{caution}
We do not support CuEquivariance for [LAMMPS: Torch](./lammps_torch.md). You must use [LAMMPS: ML-IAP](./lammps_mliap.md) for CuEquivariance.
:::

[CuEquivariance](https://github.com/NVIDIA/cuEquivariance) and [FlashTP](https://openreview.net/forum?id=wiQe95BPaB) provide acceleration for both SevenNet training and inference. (For benchmark results, see Section 2.7 of the [SevenNet-Omni paper](https://arxiv.org/abs/2510.11241).)

:::{tip}
For small systems, FlashTP with [LAMMPS: Torch](./lammps_torch.md) shows a performance advantage over cuEquivariance with [LAMMPS: ML-IAP](./lammps_mliap.md).
A performance crossover occurs at around 10³ atoms, beyond which cuEquivariance becomes more efficient.

FlashTP with [LAMMPS: Torch](./lammps_torch.md) is generally faster than FlashTP with [LAMMPS: ML-IAP](./lammps_mliap.md).
:::

## [CuEquivariance](https://github.com/NVIDIA/cuEquivariance)

CuEquivariance is an NVIDIA Python library designed to facilitate the construction of high-performance geometric neural networks using segmented polynomials and triangular operations. For more information, refer to [cuEquivariance](https://github.com/NVIDIA/cuEquivariance).
CuEquivariance is an NVIDIA Python library designed to facilitate the construction of high-performance geometric neural networks using segmented polynomials and triangular operations. CuEquivariance accelerates SevenNet during training, as well as during inference with ASE and with LAMMPS via ML-IAP.

### Requirements
- Python >= 3.10
- cuEquivariance >= 0.6.1

Install via:

### Installation
```bash
pip install sevenn[cueq12] # cueq11 for CUDA version 11.*
```
@@ -22,20 +34,58 @@ causing `pynvml.NVMLError_NotSupported: Not Supported`.
Then try a lower cuEquivariance version, such as 0.6.1.
:::

## FlashTP

FlashTP is a high-performance Tensor-Product library for Machine Learning Interatomic Potentials (MLIPs). For more information and the installation guide, refer to [flashTP](https://github.com/SNU-ARC/flashTP).
If `pip install sevenn[cueq12]` fails to install the latest version of SevenNet, try installing the base package instead:
```bash
pip install sevenn
```

If this successfully installs the latest version, the issue is likely related to **cuEquivariance compatibility**.
You can verify this by installing cuEquivariance manually:
```bash
pip install cuequivariance-ops-torch-cu12
pip install cuequivariance-torch
```
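
After the manual install, a quick import check can confirm that the packages load; a minimal sketch (the module and distribution names mirror the pip commands above):

```python
# Verify that the cuEquivariance packages installed above import cleanly
# and report their versions; names mirror the pip commands.
import importlib.metadata

import cuequivariance_torch  # noqa: F401  (import check only)

for dist in ('cuequivariance-torch', 'cuequivariance-ops-torch-cu12'):
    print(dist, importlib.metadata.version(dist))
```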

For more details, see the [cuEquivariance documentation](https://github.com/NVIDIA/cuEquivariance).


## [FlashTP](https://github.com/SNU-ARC/flashTP)

FlashTP, presented in [FlashTP: Fused, Sparsity-Aware Tensor Product for Machine Learning Interatomic Potentials](https://openreview.net/forum?id=wiQe95BPaB), is a high-performance Tensor-Product library for Machine Learning Interatomic Potentials (MLIPs).

FlashTP accelerates SevenNet during both training and inference, achieving up to a ~4× speedup.

### Requirements
- Python >= 3.10
- flashTP >= 0.1.0
- CUDA toolkit >= 12.0

### Installation
Choose `CUDA_ARCH_LIST` to match your GPU(s) (see [compute capability](https://developer.nvidia.com/cuda/gpus)); if you are unsure of the value, the sketch after the commands below shows how to query it.

```bash
git clone https://github.com/SNU-ARC/flashTP.git
cd flashTP
pip install -r requirements.txt
CUDA_ARCH_LIST="80;90" pip install . --no-build-isolation
```
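
PyTorch can report the compute capability of each visible GPU, which maps directly onto a `CUDA_ARCH_LIST` entry (e.g., `(8, 0)` becomes `"80"`); a minimal sketch:

```python
# Print the compute capability of each visible GPU so you can build
# a matching CUDA_ARCH_LIST value, e.g. (8, 0) -> "80".
import torch

for i in range(torch.cuda.device_count()):
    major, minor = torch.cuda.get_device_capability(i)
    print(f'GPU {i}: {torch.cuda.get_device_name(i)} -> arch {major}{minor}')
```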

:::{note}
During installation of flashTP,
During installation of FlashTP,
`subprocess.CalledProcessError: ninja ... exit status 137`
typically indicates **out-of-memory** during compilation.
typically indicates out-of-memory during compilation.
Try reducing the build parallelism:
```bash
export MAX_JOBS=1
```
:::

For more information, see [FlashTP](https://github.com/SNU-ARC/flashTP).

## Usage
After installation, you can leverage the accelerators with the appropriate flag (`--enable_cueq`) or options in the interfaces below (a usage sketch follows the list):

- [Training](./cli.md#sevenn-train)
- [ASE Calculator](./ase_calculator.md)
- [LAMMPS](./cli.md#sevenn-get-model)
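
For example, a minimal sketch with the ASE calculator; the `enable_cueq` keyword of `SevenNetCalculator` is an assumption here, so check the [ASE Calculator](./ase_calculator.md) page for the exact signature:

```python
# Sketch: cuEquivariance-accelerated inference through ASE.
# The enable_cueq keyword is an assumption; see the ASE Calculator page.
from ase.build import bulk

from sevenn.calculator import SevenNetCalculator

calc = SevenNetCalculator('7net-0', device='cuda', enable_cueq=True)
atoms = bulk('Si', 'diamond', a=5.43)
atoms.calc = calc
print(atoms.get_potential_energy())
```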
4 changes: 2 additions & 2 deletions docs/source/user_guide/cli.md
Original file line number Diff line number Diff line change
@@ -72,7 +72,7 @@ The command is for LAMMPS integration of SevenNet. It deploys a model into a LAM

See {doc}`lammps_torch` or {doc}`lammps_mliap` for installation.

### LAMMPS/Torch
### LAMMPS: PyTorch
The checkpoint can be deployed as LAMMPS potentials. The argument is either the path to the checkpoint or the name of a pretrained potential.

```bash
@@ -92,7 +92,7 @@ sevenn get_model {checkpoint path} -p
This will create a directory with several `deployed_parallel_*.pt` files. The directory path itself is an argument for the LAMMPS script. Please do not modify or remove files in the directory.
These models can be used as LAMMPS potentials to run parallel MD simulations with a GNN potential across multiple GPUs.

### LAMMPS/ML-IAP
### LAMMPS: ML-IAP
```bash
sevenn get_model 7net-0 --use_mliap # For pre-trained models
sevenn get_model {checkpoint path} --use_mliap # For user-trained models
2 changes: 1 addition & 1 deletion docs/source/user_guide/d3.md
@@ -4,7 +4,7 @@
We support a GPU-accelerated implementation of Grimme's D3 dispersion (van der Waals) correction using CUDA, which can be used with ASE and LAMMPS in conjunction with SevenNet. This follows the implementation of [Grimme's D3 method](https://doi.org/10.1063/1.3382344). We have ported the code from the [original Fortran code](https://www.chemie.uni-bonn.de/grimme/de/software/dft-d3). While the D3 method is significantly faster than DFT, existing CPU implementations were slower than SevenNet. To address this, we have adopted CUDA and single-precision (FP32) operations to accelerate the code. A minimal usage sketch follows.
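
Usage through ASE might look like the following minimal sketch; `SevenNetD3Calculator` and its import path are assumptions here, so check the calculator documentation for the exact API:

```python
# Hedged sketch: SevenNet inference with the D3 correction via ASE.
# SevenNetD3Calculator and its import path are assumptions; see the
# calculator documentation for the exact API.
from ase.build import bulk

from sevenn.calculator import SevenNetD3Calculator

calc = SevenNetD3Calculator('7net-0', device='cuda')
atoms = bulk('Si', 'diamond', a=5.43)
atoms.calc = calc
print(atoms.get_potential_energy())
```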

:::{caution}
Currently, the D3 implementaion does not support mulit-GPU or multi-core parallelism.
Currently, the D3 implementation does not support multi-GPU or multi-core parallelism.
:::
:::{caution}
The implementation requires a GPU with a [compute capability](https://developer.nvidia.com/cuda/gpus) of **at least 6.0**.