
Commit 7728159

Merge pull request #3832 from alibaba/feature/sync
MNN:Sync: Sync Internal 3.2.3
2 parents 8f175e2 + 88a027a commit 7728159

91 files changed, with 4909 additions and 2148 deletions


.gitignore

Lines changed: 3 additions & 1 deletion
@@ -376,4 +376,6 @@ datasets/*
 
 # qnn 3rdParty
 source/backend/qnn/3rdParty/include
-apps/Android/MnnLlmChat/release_outputs
+project/android/.cxx
+pymnn/android/.cxx/
+pymnn/android/.cxx/abi_configuration_5u53tc49.json

README.md

Lines changed: 0 additions & 2 deletions
@@ -11,8 +11,6 @@
 
 
 ## News 🔥
-- [2025/08/08] Now we support [gpt-oss-20b](./apps/Android/MnnLlmChat/README.md#releases).
-- [2025/08/05] MNN Chat Android is available in [GooglePlay](https://play.google.com/store/apps/details?id=com.alibaba.mnnllm.android.release) !
 - [2025/06/11] New App MNN TaoAvatar released, you can talk with 3DAvatar offline with LLM, ASR, TTS, A2BS and NNR models all run local on your device!! [MNN TaoAvatar](./apps/Android/Mnn3dAvatar/README.md)
 <p align="center">
 <img width="20%" alt="Icon" src="https://meta.alicdn.com/data/mnn/avatar/avatar_demo.gif" style="margin: 0 10px;">

docs/inference/module.md

Lines changed: 15 additions & 0 deletions
@@ -183,6 +183,21 @@ struct Info {
 const Info* getInfo() const;
 ```
 
+### Get device information
+Call the `getDeviceInfo` function to get `Device` information, as in the following code:
+```cpp
+std::string soc_id, dsp_arch;
+bool success = MNN::Express::Executor::RuntimeManager::getDeviceInfo("dsp_arch", MNN_FORWARD_NN, dsp_arch);
+if(success) {
+    MNN_PRINT("Device dsp_arch: %s\n", dsp_arch.c_str());
+}
+
+success = MNN::Express::Executor::RuntimeManager::getDeviceInfo("soc_id", MNN_FORWARD_NN, soc_id);
+if(success) {
+    MNN_PRINT("Device soc_id: %s\n", soc_id.c_str());
+}
+```
+
 ### Run inference
 Call `onForward` to run inference.
 

docs/inference/npu.md

Lines changed: 4 additions & 0 deletions
@@ -58,6 +58,10 @@ adb push ${MNN_ROOT}/source/backend/qnn/3rdParty/lib/hexagon-v${HEXAGON_ARCH}/un
 adb shell "cd /data/local/tmp && LD_LIBRARY_PATH=/data/local/tmp ADSP_LIBRARY_PATH=/data/local/tmp ./MyExe.out"
 ```
 
+### QNN quantization support
+- Weight-only quantization (activations stay floating-point): only int8, channel-wise symmetric quantization of Linear weights is supported.
+- Activation and weight quantization: activations use per-tensor symmetric quantization; weights use int8/int4, channel-wise symmetric quantization.
+
 ## CoreML
 Applies to Mac / iOS / iPad
 
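
To make the weight-only mode in the QNN quantization notes above more concrete, here is a minimal Python sketch of channel-wise symmetric quantization. It is a generic illustration, not MNN's or QNN's implementation: the function names `quantize_channelwise_symmetric` and `dequantize` and the sample weights are hypothetical, and each row of `weight` is assumed to be one output channel.

```python
def quantize_channelwise_symmetric(weight, bits=8):
    """Quantize each row (assumed: one output channel) with its own scale.

    Symmetric means the range is centered on zero, so no zero point is stored.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for int8, 7 for int4
    quantized, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        # Round to the nearest integer code and clamp to [-qmax, qmax].
        q_row = [max(-qmax, min(qmax, round(v / scale))) for v in row]
        quantized.append(q_row)
        scales.append(scale)
    return quantized, scales


def dequantize(quantized, scales):
    """Recover approximate floating-point weights: code * per-channel scale."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


# Hypothetical 2x3 weight matrix: two channels with very different ranges.
weight = [[0.5, -1.0, 0.25], [0.02, 0.01, -0.04]]
q, scales = quantize_channelwise_symmetric(weight)
restored = dequantize(q, scales)
print(q)
print(restored)
```

Because every channel keeps its own scale, the second row with small weights still uses the full integer range instead of being flattened by the larger values in the first row.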

docs/tools/convert.md

Lines changed: 9 additions & 0 deletions
@@ -388,3 +388,12 @@ npu model path:./qnn_smolvlm_model.bin
 
 ./ModuleBasic.out qnn_smolvlm_model.mnn dir 0 0 10
 ```
+### Script for generating QNN models for multiple devices
+tools/script/genQNNModelsFromMNN.py generates QNN models for devices from 8Gen1 to 8Elite.
+```
+// Usage example
+cd mnn_path
+cd build
+python3 ../tools/script/genQNNModelsFromMNN.py --config_path ../source/backend/qnn/convertor/config_example/ --graph_name visual_qnn --qnn_sdk_root_path /mnt/2Tpartition/tianbu/QNN/qairt/2.37.0.250724/ --src_model visual.mnn --executable_path ./MNN2QNNModel
+```
+The QNN model artifacts for 8Gen1 ~ 8Elite devices are then generated in the qnn_models folder.

docs/transformers/llm.md

Lines changed: 13 additions & 0 deletions
@@ -106,17 +106,23 @@ optional arguments:
                         mnn quant bit, 4 or 8, default is 4.
   --quant_block QUANT_BLOCK
                         mnn quant block, 0 means channel-wise, default is 128.
+  --visual_quant_bit VISUAL_QUANT_BIT
+                        mnn visual model quant bit, 4 or 8, default is set in utils/vision.py per ViT model.
+  --visual_quant_block VISUAL_QUANT_BLOCK
+                        mnn visual model quant block, 0 means channel-wise, default is set in utils/vision.py per ViT model.
   --lm_quant_bit LM_QUANT_BIT
                         mnn lm_head quant bit, 4 or 8, default is `quant_bit`.
   --mnnconvert MNNCONVERT
                         local mnnconvert path, if invalid, using pymnn.
   --ppl                 Whether or not to get all logits of input tokens.
   --awq                 Whether or not to use awq quant.
   --sym                 Whether or not to use symmetric quant (without zeropoint), default is False.
+  --visual_sym          Whether or not to use symmetric quant (without zeropoint) for visual model, default is False.
   --seperate_embed      For lm and embed shared model, whether or not to separate embed to avoid quant, default is False, if True, embed weight will be separated into embeddingbf16.bin.
   --lora_split          Whether or not export lora split, default is False.
 ```
 
+
 ### Weight loading
 llmexport.py also supports LLM verification and therefore has many dependencies. When such an environment is not available, MNN-LLM also provides tools that read weights from safetensors or gguf files, which can reduce memory requirements and speed up conversion. Usage is as follows:
 
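
The `--quant_block` and `--sym` options in the help text above can be illustrated with a small Python sketch of block-wise weight quantization. This is a generic illustration under stated assumptions, not the code in llmexport.py: the function names and the min/max scaling scheme are hypothetical, and treating `block=0` as a single block per row simply mirrors the "channel-wise" wording of the help text. The row is assumed to be non-empty.

```python
def quantize_blockwise(row, bits=4, block=128, sym=False):
    """Quantize one non-empty weight row (one output channel) block by block.

    Returns a list of (codes, scale, zero) tuples, one per block.
    """
    size = len(row) if block == 0 else block  # 0: the whole row shares one scale
    blocks = []
    for start in range(0, len(row), size):
        chunk = row[start:start + size]
        if sym:
            # Symmetric: signed codes centered on zero, no zero point stored.
            qmax = 2 ** (bits - 1) - 1
            scale = max(abs(v) for v in chunk) / qmax or 1.0
            zero = 0.0
            codes = [round(v / scale) for v in chunk]
        else:
            # Asymmetric: unsigned codes offset by the block minimum.
            levels = 2 ** bits - 1
            zero = min(chunk)
            scale = (max(chunk) - zero) / levels or 1.0
            codes = [round((v - zero) / scale) for v in chunk]
        blocks.append((codes, scale, zero))
    return blocks


def dequantize_blockwise(blocks):
    """Rebuild approximate weights as code * scale + zero."""
    return [c * scale + zero for codes, scale, zero in blocks for c in codes]


# Hypothetical row whose two halves have very different value ranges.
row = [0.0, 1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 40.0]
per_block = quantize_blockwise(row, bits=4, block=4)
per_channel = quantize_blockwise(row, bits=4, block=0)
print(len(per_block), len(per_channel))
```

Smaller blocks follow local value ranges more closely at the cost of storing more scales; the symmetric variant saves the zero point but spends part of its range on values that may never occur.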

@@ -166,6 +172,7 @@ python3 gguf2mnn.py --gguf ~/third/llama.cpp/build/ggml-model-Q4_K.gguf --mnn_di
 ```
 -DLLM_SUPPORT_VISION=true -DMNN_BUILD_OPENCV=true -DMNN_IMGCODECS=true
 ```
+
 - To enable audio support, add the related build macros
 ```
 -DLLM_SUPPORT_AUDIO=true -DMNN_BUILD_AUDIO=true
@@ -195,6 +202,12 @@ cd project/android
 mkdir build_64
 ../build_64.sh -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true -DMNN_USE_LOGCAT=true
 ```
+On Qualcomm devices, some vision models support the NPU; add the `MNN_QNN` and `MNN_WITH_PLUGIN` macros to enable QNN.
+```
+cd project/android
+mkdir build_64
+../build_64.sh -DMNN_LOW_MEMORY=true -DMNN_CPU_WEIGHT_DEQUANT_GEMM=true -DMNN_BUILD_LLM=true -DMNN_SUPPORT_TRANSFORMER_FUSE=true -DMNN_ARM82=true -DMNN_OPENCL=true -DMNN_QNN=true -DMNN_WITH_PLUGIN=true -DMNN_USE_LOGCAT=true
+```
 
 #### iOS: see transformers/llm/engine/ios/README.md
 ```

express/Executor.cpp

Lines changed: 11 additions & 0 deletions
@@ -281,6 +281,17 @@ bool Executor::RuntimeManager::getInfo(Interpreter::SessionInfoCode code, void*
     return false;
 }
 
+bool Executor::RuntimeManager::getDeviceInfo(const std::string& deviceKey, const MNNForwardType type, std::string& deviceValue) {
+    auto creator = MNNGetExtraRuntimeCreator(type);
+    if (creator != nullptr) {
+        auto res = creator->onGetDeviceInfo(deviceKey, deviceValue);
+        if(res) {
+            return true;
+        }
+    }
+    return false;
+}
+
 Executor::RuntimeManager::RuntimeManager() {
     mInside = new RuntimeAttr;
     mInside->mContent.reset(new RuntimeAttr::Immutable);

include/MNN/expr/Executor.hpp

Lines changed: 1 addition & 0 deletions
@@ -130,6 +130,7 @@ class MNN_PUBLIC Executor {
         void setHint(Interpreter::HintMode mode, int* value, size_t size);
         void setHintPtr(Interpreter::HintMode mode, void* value);
         bool getInfo(Interpreter::SessionInfoCode code, void* ptr);
+        static bool getDeviceInfo(const std::string& deviceKey, const MNNForwardType type, std::string& deviceValue);
         BackendConfig* getBnConfig();
         const RuntimeAttr* getInside() const {
             return mInside;

project/android/CMakeExports.txt

Lines changed: 6 additions & 0 deletions
@@ -25,3 +25,9 @@ set_target_properties( MNNOpenCV
         PROPERTIES IMPORTED_LOCATION
         ${CMAKE_CURRENT_LIST_DIR}/libs/${ANDROID_ABI}/libMNNOpenCV.so
 )
+
+add_library(MNN_LLM SHARED IMPORTED GLOBAL )
+set_target_properties(MNN_LLM
+        PROPERTIES IMPORTED_LOCATION
+        ${CMAKE_CURRENT_LIST_DIR}/libs/${ANDROID_ABI}/libllm.so
+)

project/android/gradle/wrapper/gradle-wrapper.properties

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@ distributionBase=GRADLE_USER_HOME
 distributionPath=wrapper/dists
 zipStoreBase=GRADLE_USER_HOME
 zipStorePath=wrapper/dists
-distributionUrl=http\://mtl-gradle-mirror.oss-cn-hangzhou.aliyuncs.com/gradle-4.6-all.zip
+distributionUrl=http://mtl-gradle-mirror.oss-cn-hangzhou.aliyuncs.com/gradle-6.7.1-all.zip
