Citation: Wang Yaozu, Li Qing, Dai Zhangjie, Xu Yue. Current Status and Trends in Large Language Modeling Research[J]. Chinese Journal of Engineering. DOI: 10.13374/j.issn2095-9389.2023.10.09.003

Current Status and Trends in Large Language Modeling Research

  • Abstract: Over the past two decades, language modeling has become a primary approach to language understanding and generation, and it has drawn wide attention as a key technology underlying downstream tasks in Natural Language Processing (NLP). In recent years, Large Language Models (LLMs) such as ChatGPT have made remarkable progress and have profoundly influenced the transformation and development of artificial intelligence and other fields. Given the rapid development of LLMs, this paper first reviews their evolution in terms of technical architecture and model scale, and then summarizes model training methods, optimization techniques, and evaluation approaches. It next outlines the techniques used to specialize LLMs for vertical domains, analyzes the current state of LLM applications in education, healthcare, finance, industry, and other fields, and discusses their strengths and limitations. The paper also examines the safety and alignment issues that LLMs raise with respect to social ethics, privacy, and security, together with the technical measures that address them. Finally, it looks ahead to future research trends for LLMs, including model scale and efficiency, multimodal processing, and societal impact. By analyzing the current state of research and its future directions, the paper aims to offer researchers insight and inspiration regarding LLMs.
