posts/2024/importing-yi9b-to-ollama/ #8
I'm a complete beginner in the LLM field and have never worked with machine learning or deep learning before. Recently I started trying to understand LLMs and how to fine-tune them. Along the way I learned that these large models are usually released as base models, and from searching I found that SFT fine-tuning is needed to turn one into a chat model. I have already done that, but I haven't uploaded the model to HuggingFace or any other public platform, because when doing LoRA fine-tuning with LLaMA-Factory I set the learning rate too high and training never converged. How well does it work? Frankly, the results are poor, but I can now chat with it the way I would with ChatGPT, although it sometimes still rambles on to itself like a base model. I noticed that your latest update can already hold a proper conversation. Did you do any further work after that? I'm looking forward to your advice. Here are my scripts:
# SFT fine-tuning, so that the model can handle chat tasks
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--do_train True \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--finetuning_type lora \
--quantization_bit 4 \
--template yi \
--dataset_dir data \
--dataset belle_2m \
--cutoff_len 1024 \
--learning_rate 0.0002 \
--num_train_epochs 3.0 \
--max_samples 20000 \
--per_device_train_batch_size 6 \
--gradient_accumulation_steps 1 \
--lr_scheduler_type cosine \
--max_grad_norm 1.0 \
--logging_steps 5 \
--save_steps 100 \
--warmup_steps 50 \
--neftune_noise_alpha 5 \
--optim adamw_torch \
--packing True \
--report_to none \
--output_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora \
--fp16 True \
--lora_rank 8 \
--lora_alpha 16 \
--lora_dropout 0.1 \
--lora_target q_proj,v_proj \
--plot_loss True
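On the non-convergence mentioned above: my current plan (only a guess, not something I have verified on Yi-9B-200K) is to rerun the exact same SFT command with a smaller learning rate and a longer warmup, leaving everything else unchanged:
# Hypothetical retry: identical to the SFT command above, changing only these two flags
#   --learning_rate 0.00005   (5e-5 instead of 2e-4; an assumed starting point, not a verified value)
#   --warmup_steps 100        (a longer warmup to smooth the first optimizer updates)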
# Try the model from the command line to check that it works
CUDA_VISIBLE_DEVICES=0 python src/cli_demo.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--quantization_bit 4 \
--finetuning_type lora
# Evaluate the model; this failed because the A10 ran out of GPU memory
CUDA_VISIBLE_DEVICES=0 python src/evaluate.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--quantization_bit 4 \
--finetuning_type lora \
--task mmlu \
--split test \
--lang zh \
--n_shot 5 \
--batch_size 4
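A guess (untested) for the out-of-memory failure above: the MMLU evaluation might fit into the A10's memory with a smaller batch size, at the cost of a much slower run, e.g.:
# Hypothetical retry of the evaluation with --batch_size 1 (untested; it may still not fit on an A10)
CUDA_VISIBLE_DEVICES=0 python src/evaluate.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--quantization_bit 4 \
--finetuning_type lora \
--task mmlu \
--split test \
--lang zh \
--n_shot 5 \
--batch_size 1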
# Score the model with a prediction run; this succeeded, results below
CUDA_VISIBLE_DEVICES=0 python src/train_bash.py \
--stage sft \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--finetuning_type lora \
--quantization_bit 4 \
--template yi \
--dataset_dir data \
--dataset alpaca_gpt4_zh \
--cutoff_len 1024 \
--max_samples 2000 \
--per_device_eval_batch_size 16 \
--predict_with_generate True \
--max_new_tokens 128 \
--top_p 0.7 \
--temperature 0.95 \
--output_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--do_predict True
***** predict metrics *****
predict_bleu-4 = 12.0712
predict_rouge-1 = 34.153
predict_rouge-2 = 12.641
predict_rouge-l = 23.7601
predict_runtime = 0:38:24.18
predict_samples_per_second = 0.868
predict_steps_per_second = 0.054
# Merge the LoRA adapter into the base model and export it
# DO NOT use quantized model or quantization_bit when merging lora weights
CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
--model_name_or_path /mnt/workspace/LLaMA-Factory/Yi-9B-200K \
--adapter_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/ \
--template yi \
--finetuning_type lora \
--export_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora/models \
--export_size 4 \
--export_legacy_format False
# GPTQ 4-bit quantization of the merged model; this failed due to insufficient GPU memory
CUDA_VISIBLE_DEVICES=0 python src/export_model.py \
--model_name_or_path saves/Yi-9B/lora/yi-9b-200k-chat-lora/models \
--template yi \
--export_dir saves/Yi-9B/lora/yi-9b-200k-chat-lora-int4/models \
--export_quantization_bit 4 \
--export_quantization_dataset data/c4_demo.json \
--export_size 1 \
--export_legacy_format False
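Since the post this discussion is attached to is about importing Yi-9B into Ollama, an alternative I am considering for the failed GPTQ step is to quantize on the CPU instead: convert the merged model to GGUF with llama.cpp and then import it into Ollama. This is only a sketch under the assumption that llama.cpp and Ollama are installed; the paths, output names, and the Q4_K_M type below are my assumptions, not steps taken from the post.
# Hypothetical CPU-side quantization via llama.cpp, then import into Ollama
# (assumes llama.cpp is checked out and built at /path/to/llama.cpp and Ollama is installed)
python /path/to/llama.cpp/convert-hf-to-gguf.py \
saves/Yi-9B/lora/yi-9b-200k-chat-lora/models \
--outfile yi-9b-200k-chat-f16.gguf
/path/to/llama.cpp/quantize yi-9b-200k-chat-f16.gguf yi-9b-200k-chat-Q4_K_M.gguf Q4_K_M
# Minimal Modelfile for Ollama: a single line "FROM ./yi-9b-200k-chat-Q4_K_M.gguf"
ollama create yi-9b-200k-chat -f Modelfile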
The log of importing Yi-9B LLM model to Ollama library.
https://shinyzhu.com/posts/2024/importing-yi9b-to-ollama/