ChatGPT大模型技术发展与应用
作者:夏润泽,李丕绩
作者单位:

1.南京航空航天大学计算机科学与技术学院,南京 211106;2.模式分析与机器智能工业和信息化部重点实验室(南京航空航天大学),南京 211106


Large Language Model ChatGPT: Evolution and Application
Author: Xia Runze, Li Piji
Affiliation:

1.College of Computer Science and Technology, Nanjing University of Aeronautics & Astronautics, Nanjing 211106, China;2.MIIT Key Laboratory of Pattern Analysis and Machine Intelligence (Nanjing University of Aeronautics & Astronautics), Nanjing 211106, China

摘要:

通过回顾深度学习、语言模型、语义表示和预训练技术的发展历程,全面解析了ChatGPT的技术渊源和演进路线。在语言模型方面,从早期的N-gram统计方法逐步演进到神经网络语言模型,对机器翻译技术的研究也催生了Transformer的出现,继而又推动了神经网络语言模型的发展。在语义表示和预训练技术发展方面,从早期的TF-IDF、pLSA和LDA等统计方法发展到Word2Vec等基于神经网络的词向量表示,继而发展到ELMo、BERT和GPT-2等预训练语言模型,预训练框架日益成熟,为模型提供了丰富的语义知识。GPT-3的出现揭示了大语言模型的潜力,但依然存在幻觉问题,如生成不可控、知识谬误及逻辑推理能力差等。为了缓解这些问题,ChatGPT通过指令学习、监督微调、基于人类反馈的强化学习等方式在GPT-3.5上进一步与人类进行对齐学习,效果不断提升。ChatGPT等大模型的出现,标志着该领域技术进入新的发展阶段,为人机交互以及通用人工智能的发展开辟了新的可能。

    Abstract:

This paper comprehensively analyzes the technical origins and evolution of ChatGPT by reviewing the development of deep learning, language models, semantic representation and pre-training techniques. In terms of language models, early N-gram statistical methods gradually evolved into neural network language models. Research and advances in machine translation also led to the emergence of the Transformer, which in turn catalyzed the development of neural network language models. Regarding semantic representation and pre-training techniques, there has been an evolution from early statistical methods such as TF-IDF, pLSA and LDA, to neural network-based word vector representations like Word2Vec, and then to pre-trained language models such as ELMo, BERT and GPT-2. Pre-training frameworks have become increasingly sophisticated, providing rich semantic knowledge for models. The emergence of GPT-3 revealed the potential of large language models, but hallucination problems such as uncontrollable generation, knowledge fallacies and poor logical reasoning capability remained. To alleviate these problems, ChatGPT was further aligned with humans on top of GPT-3.5 through instruction learning, supervised fine-tuning, and reinforcement learning from human feedback, continuously improving its capabilities. The emergence of large language models like ChatGPT marks the entry of this field into a new developmental stage, opening up new possibilities for human-computer interaction and general artificial intelligence.
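The abstract traces language modeling from N-gram statistics to neural models. As a minimal illustration of the earlier N-gram stage (a sketch for context, not code from the paper), the following estimates bigram probabilities by counting: the probability of a word is conditioned only on the preceding word, P(w_i | w_{i-1}) = count(w_{i-1}, w_i) / count(w_{i-1}).

```python
from collections import Counter

def train_bigram(corpus):
    """Count unigram and bigram frequencies over tokenized sentences,
    with <s>/</s> as sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sent in corpus:
        tokens = ["<s>"] + sent + ["</s>"]
        unigrams.update(tokens[:-1])                  # contexts w_{i-1}
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # pairs (w_{i-1}, w_i)
    return unigrams, bigrams

def bigram_prob(unigrams, bigrams, prev, word):
    """Maximum-likelihood estimate of P(word | prev)."""
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

# Toy corpus: "the" is followed by "cat" in one of its two occurrences.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
uni, bi = train_bigram(corpus)
print(bigram_prob(uni, bi, "the", "cat"))  # 0.5
```

Such count-based models assign zero probability to unseen pairs (hence the smoothing techniques developed around them), a limitation that neural network language models later addressed by generalizing through distributed representations.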

引用本文

夏润泽, 李丕绩. ChatGPT大模型技术发展与应用[J]. 数据采集与处理, 2023, 38(5): 1017-1034.

历史
  • 收稿日期:2023-08-20
  • 最后修改日期:2023-09-15
  • 在线发布日期: 2023-10-16