This repository was archived by the owner on Aug 13, 2025. It is now read-only.
87 changes: 63 additions & 24 deletions docs/manual/olares/settings/gpu-resource.md
@@ -1,45 +1,84 @@
---
outline: [2, 3]
description: Optimize GPU usage in Olares with flexible memory management options. Choose between shared and standalone modes for different resource requirements.
description: Manage and optimize GPU resources in Olares with centralized controls, supporting time-slicing, exclusive access, and VRAM-slicing across single or multi-node setups.
---

# Manage GPU usage
:::info
Only Olares admin can change GPU usage mode. This ensures optimal resource management across the system and prevents conflicts between users' resource needs.
Only the Olares admin can configure the GPU usage mode. This ensures optimal resource management across the system and prevents conflicts between users' resource needs.
:::

Olares offers flexible GPU memory management to support resource-intensive tasks like image generation and large language models. Users can choose between two modes to best suit their needs: **shared mode** and **standalone mode**.
Olares allows you to harness the full power of your GPUs to accelerate demanding tasks such as large language models, image and video generation, and gaming. Whether your GPUs are on a single node or spread across multiple nodes, you can manage them conveniently from one centralized interface.

This guide helps you understand and configure GPU allocation modes to maximize hardware performance.

## GPU usage modes
:::tip
Use shared mode when running multiple lightweight tasks or when you want to ensure fair resource distribution among users. Switch to standalone mode for complex AI models or high-resolution image generation tasks that require dedicated resources.
::: tip NVIDIA GPUs only
Currently, only NVIDIA GPUs are supported.
:::

### Shared mode (default)
## Understand GPU allocation modes

In shared mode, Olares intelligently allocates GPU memory across multiple applications:
Olares supports three GPU allocation modes. Choosing the right mode helps optimize performance based on your needs.

* Applications share up to the maximum GPU memory available on your hardware.
* Tasks are executed in order of request, ensuring fair resource distribution.
* Ideal for users running multiple lightweight GPU tasks simultaneously.
### Time slicing

### Standalone mode
In this mode, the GPU's processing power is shared among multiple applications.

For users requiring dedicated GPU resources, standalone mode can be enabled:
* Acts as a default resource pool. Any application not explicitly assigned to a specific GPU will automatically use a time-slicing GPU if available.

* Applications can request up to the maximum GPU memory available on your hardware exclusively.
* Enhances performance for single, resource-intensive tasks.
* Large memory requests may limit resources available for subsequent applications.
* Suitable for general-purpose use and for running multiple lightweight applications.

:::info
For shared applications, such as SD Web UI (Stable Diffusion) and ComfyUI, GPU memory is managed by the shared application itself, not individual user instances.
This means that the GPU mode settings described here do not directly affect reference applications.
### App exclusive

In this mode, the GPU's entire processing power and memory are dedicated to a single application.

* Best for intensive, performance-critical applications like AI-generated imagery or high-performance gaming servers.
* Large memory demands may limit availability for other tasks.

### Memory slicing
In this mode, GPU memory (VRAM) is partitioned into fixed, dedicated amounts for specific applications.

* Ideal for running multiple GPU-intensive applications simultaneously, each with guaranteed VRAM allocation.
* Prevents memory conflicts between applications running on the same GPU.
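
As a rough illustration of how the three modes relate, the sketch below models them in Python. It is not Olares code: `GpuMode`, `Gpu`, `resolve_gpu`, and the GPU and app names are hypothetical, and the lookup order (explicit assignment first, then the first time-slicing GPU) is an assumption based only on the fallback behavior described for time slicing above.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class GpuMode(Enum):
    TIME_SLICING = "time_slicing"
    APP_EXCLUSIVE = "app_exclusive"
    MEMORY_SLICING = "memory_slicing"


@dataclass
class Gpu:
    name: str
    mode: GpuMode
    # Hypothetical field: apps pinned to this GPU (time slicing), the single
    # exclusive app (app exclusive), or apps holding a VRAM quota (memory slicing).
    assigned_apps: set[str] = field(default_factory=set)


def resolve_gpu(app: str, gpus: list[Gpu]) -> Optional[Gpu]:
    """Return the GPU an app would use under the assumed rules, or None."""
    # Assumption: an explicit assignment always wins.
    for gpu in gpus:
        if app in gpu.assigned_apps:
            return gpu
    # Otherwise fall back to a time-slicing GPU, if one is available.
    for gpu in gpus:
        if gpu.mode is GpuMode.TIME_SLICING:
            return gpu
    return None


gpus = [
    Gpu("gpu-0", GpuMode.APP_EXCLUSIVE, {"comfyui"}),
    Gpu("gpu-1", GpuMode.TIME_SLICING),
]
print(resolve_gpu("comfyui", gpus).name)  # gpu-0 (explicitly assigned)
print(resolve_gpu("ollama", gpus).name)   # gpu-1 (default time-slicing pool)
```

In this model an app with no explicit assignment gets no GPU at all when no time-slicing GPU exists, which mirrors the "if available" condition in the text.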

## View GPU status

To view your GPU status:

1. Navigate to **Settings > GPU**. The GPU list shows each GPU’s model, associated node, total VRAM, and current GPU mode.
2. Click a specific GPU to view its details.

::: tip Note
If your Olares cluster has only one GPU, opening the GPU section takes you directly to the GPU details page. If you have multiple GPUs, you see the GPU list first.
:::

## Configure GPU mode

On the **GPU details** page, select your desired mode from the **GPU mode** dropdown. Depending on your selected mode, different follow-up options apply.

* **Time slicing**:
1. Select this mode from the GPU mode dropdown.
2. In the **Application pinning** section, click **+Add an application** to manually pin an application to this specific GPU in a multi-GPU setup.

:::tip Note
No manual pinning is required if you only have one GPU in your cluster.
:::

* **App exclusive**
1. Select this mode from the GPU mode dropdown.
2. In the **Select exclusive app** dropdown, choose your target application.
3. Click **Confirm**.
![App exclusive](/images/manual/olares/gpu-app-exclusive.png#bordered)

## Change GPU mode for application
1. Open the Settings app from the Dock or Launchpad.
2. Select **System** from the left sidebar, and click **GPU** on the right.
3. In the dropdown **VRAM mode**, select the required GPU usage mode.
* **Memory slicing**
1. Select this mode from the dropdown.
2. In the **Allocate VRAM** section, click **Add application**.
3. Select your target application and assign it a specific amount of VRAM (in GB).
4. Repeat for other applications and click **Confirm**.
![VRAM slicing](/images/manual/olares/gpu-memory-slicing.png#bordered)

::: tip Note
You can't assign an application more VRAM than the GPU's total VRAM.
:::
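
The VRAM limit in the note can be sketched as a simple validation step. This is a minimal illustration, not how Olares implements the check: `validate_vram_plan` and the app names are hypothetical, and the rule that the combined allocations must also fit within the total VRAM is an assumption that the source does not state.

```python
from __future__ import annotations


def validate_vram_plan(total_gb: float, requests_gb: dict[str, float]) -> list[str]:
    """Return a list of problems with a memory-slicing plan (empty if none)."""
    errors = []
    for app, gb in requests_gb.items():
        if gb <= 0:
            errors.append(f"{app}: allocation must be positive")
        elif gb > total_gb:
            # Stated in the docs: one allocation cannot exceed the total VRAM.
            errors.append(f"{app}: {gb} GB exceeds total VRAM of {total_gb} GB")
    # Assumption: the sum of all fixed quotas must also fit on the GPU.
    if sum(requests_gb.values()) > total_gb:
        errors.append("combined allocations exceed total VRAM")
    return errors


# Hypothetical 24 GB GPU shared by two apps.
print(validate_vram_plan(24, {"comfyui": 8, "ollama": 12}))  # []
```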

## Learn more
- [Monitor GPU usage in Olares](../resources-usage.md)
79 changes: 55 additions & 24 deletions docs/zh/manual/olares/settings/gpu-resource.md
@@ -1,45 +1,76 @@
---
outline: [2, 3]
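description: Select and configure GPU resource management modes in Olares, switching between shared mode and standalone mode as needed to optimize resource allocation.
description: Manage and optimize Olares graphics card resources, with support for time-slicing, exclusive, and VRAM-slicing allocation modes in single-node or multi-node environments.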
---

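# Manage GPU usage
# Manage graphics card usage

:::info Note
Only the Olares admin can change the GPU usage mode. This ensures that system-wide resources are managed optimally and avoids conflicts between users' resource needs.
Only the Olares admin can change the graphics card usage mode. This ensures that system-wide resources are managed optimally and avoids conflicts between users' resource needs.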
:::

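Olares provides flexible GPU memory (VRAM) management to support compute-intensive tasks such as image generation and large language models. Users can choose between **shared mode** and **standalone mode** as needed.
Olares gives you powerful, flexible graphics card management so you can unlock the full compute power of your GPUs and accelerate demanding workloads such as large-model inference, image and video generation, and gaming. Whether your graphics cards sit on a single node or are spread across multiple nodes, you can manage them centrally from one interface.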

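## GPU usage modes
:::tip
Use shared mode when running multiple lightweight tasks or when you need to ensure fair resource distribution among users. For complex AI models or high-resolution image generation tasks that need dedicated resources, switch to standalone mode.
This document helps you understand and configure graphics card allocation modes to get the most out of your hardware.

::: tip Note
Currently, only NVIDIA graphics cards are supported.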
:::

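### Shared mode (default)
## Graphics card allocation modes

Olares provides three allocation modes that you can choose from according to your scenario.

### Time slicing mode

In this mode, the GPU's processing power is shared among multiple applications.

- In this mode, the GPU provides the default VRAM resource pool. Applications that have not been assigned an exclusive GPU or dedicated VRAM automatically use a GPU in time slicing mode, if one is available.
- Suitable for general-purpose tasks and for running multiple lightweight applications at the same time.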

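### App exclusive mode

In shared mode, Olares intelligently allocates GPU memory to applications:
In this mode, the GPU's entire compute power and VRAM are dedicated to a single application.

* Applications share up to the maximum GPU memory available on the hardware.
* Tasks are executed in order of request, ensuring fair resource distribution.
* Suitable for users running multiple lightweight GPU tasks at the same time.
- Best for high-performance, resource-intensive applications such as AI image generation or high-performance game servers.
- Large memory usage may limit the running of other tasks.

### Standalone mode
### Memory slicing mode

Users who need dedicated GPU resources can enable standalone mode:
In this mode, GPU memory (VRAM) is divided into fixed quotas that are assigned to specific applications.

* Applications can exclusively use up to the maximum GPU memory available on the hardware.
* Improves performance for a single resource-intensive task.
* Large memory requests may limit the resources available to subsequent applications.
- Suitable for running multiple GPU-intensive applications (such as several AI models) at the same time, each with its own VRAM quota.
- Avoids memory conflicts when multiple applications run on the same GPU.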

:::info
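For shared applications such as SD Web UI (Stable Diffusion) and ComfyUI, GPU memory is managed by the shared application itself rather than by individual user instances.
This means that the GPU mode settings described here do not directly affect these reference applications.
## View graphics card status

1. Go to **Settings > GPU**. The GPU list shows each graphics card's model, node, total VRAM, and current allocation mode.
2. Click a graphics card to open its details page.

::: tip Note
If your Olares cluster has only one GPU, opening the GPU page takes you directly to the details page. If you have multiple GPUs, the GPU list is displayed.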
:::
:::

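## Change the GPU mode for applications
1. Open **Settings**, then select **System** > **GPU**.
2. In the **VRAM mode** dropdown, select the GPU usage mode you need.
## Configure GPU

On the **GPU details** page, select the mode you need from the **GPU mode** dropdown. The configuration options differ slightly between modes:

- **Time slicing mode**:
1. Select this mode from the GPU mode dropdown.
2. If you have multiple GPUs, in the **Application pinning** window, click **+Add application** to manually pin an application to this GPU.
![Time slicing](/images/zh/manual/olares/gpu-time-slicing.png#bordered)
* **App exclusive**:
1. Select this mode from the GPU mode dropdown.
2. In the **Select exclusive app** dropdown, choose the target application.
3. Click **Confirm**.
![App exclusive](/images/zh/manual/olares/gpu-app-exclusive.png#bordered)
- **Memory slicing mode**:
1. Select this mode from the dropdown.
2. In the **Allocate VRAM** window, click **+Add application**.
3. Select the target application and specify the amount of VRAM to allocate to it (in GB).
4. To allocate VRAM to other applications, repeat the steps above, then click **Confirm**.
![Memory slicing](/images/zh/manual/olares/gpu-memory-slicing.png#bordered)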

## Learn more
- [Monitor GPU usage in Olares](../resources-usage.md)
- [Monitor graphics card usage in Olares](../resources-usage.md)