GraphGen: Enhancing Supervised Fine-Tuning for LLMs with Knowledge-Driven Synthetic Data Generation
Updated Nov 21, 2025 - Python
[NeurIPS 2025 Spotlight] ReasonFlux (long-CoT), ReasonFlux-PRM (process reward model) and ReasonFlux-Coder (code generation)
Code LLMs: data processing for pre-training, fine-tuning, and DPO; state-of-the-art industry processing pipelines
SyGra - Graph-oriented Synthetic data generation Pipeline
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates