Wuxin: Architecture Design and Empirical Study for a Vertical-Domain Large Language Model System

Author:

Affiliation:

1. North Automatic Control Technology Institute, Taiyuan 030006, China; 2. Key Laboratory of CTC & IE (Engineering University of PAP), Ministry of Education, Xi'an 710086, China

Fund projects: National Social Science Fund of China (2022-SKJJ-C-093); Science and Technology Innovation Team Innovation Research Project of the People's Armed Police (ZZKY20222103)


    Abstract:

    In customized application scenarios, there is an urgent need to enhance the language understanding and generation capabilities of large language models (LLMs) in specific vertical domains. We propose a development paradigm for vertical-domain LLM systems, named "Wuxin", which covers a series of development methods for LLM systems spanning architecture, data, model, and training. Wuxin uses human-in-the-loop data augmentation to improve the quality of a military-training-injury question-answering dataset, and employs the gradient low-rank projection (GaLore) strategy for efficient full-parameter fine-tuning of a lightweight base LLM. Experimental results show that the adopted full-parameter fine-tuning method outperforms mainstream LoRA fine-tuning in terms of convergence and accuracy. Furthermore, Wuxin demonstrates clear advantages in understanding professional knowledge of military training injury prevention and treatment, as well as in overcoming hallucinations. These results can serve as a reference for the design and application of vertical-domain question-answering LLM systems.

Cite this article:

ZHU Xinli, GAO Zhiqiang, JI Weitong, LI Shaohua, LI Songjie. Wuxin: Architecture design and empirical study for vertical-domain large language model system[J]. Journal of Data Acquisition and Processing, 2025, 40(3): 637-646.

History
  • Received: 2025-01-26
  • Revised: 2025-03-21
  • Published online: 2025-06-13