LawGPT: A Chinese Legal Knowledge-Enhanced Large Language Model

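Abstract

Large language models (LLMs), both proprietary and open-source, have shown remarkable capabilities across a wide range of downstream tasks. On practical Chinese legal tasks, however, these models fall short of real-world requirements. Proprietary models cannot guarantee data privacy for sensitive legal cases, while open-source models perform poorly because they lack legal knowledge. To address this problem, we introduce LawGPT, the first open-source model designed specifically for Chinese legal applications. LawGPT comprises two key components: legal-oriented pre-training and legal supervised fine-tuning. Specifically, we use large-scale Chinese legal documents for legal-oriented pre-training to incorporate legal domain knowledge. To further improve the model's performance on downstream legal tasks, we create a knowledge-driven instruction dataset for legal supervised fine-tuning. Our experimental results show that LawGPT outperforms the open-source LLaMA 7B model. Our code and resources are publicly available at https://github.com/pengxiao-song/LaWGPT and have received 5.7K stars on GitHub.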
Problem Addressed

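LawGPT: an open-source language model for Chinese legal applications. The paper targets the gap that general-purpose LLMs leave on practical Chinese legal tasks: proprietary models cannot guarantee data privacy for sensitive legal cases, and open-source models perform poorly because they lack legal knowledge.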
Key Idea

The paper introduces LawGPT, an open-source language model specifically designed for Chinese legal applications, which incorporates legal domain knowledge through legal-oriented pre-training and legal supervised fine-tuning.
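
The two stages can be illustrated with a minimal sketch of how training examples might be built for each. Everything here is an assumption made for illustration and is not taken from the LawGPT code: the character-level `toy_tokenize` stand-in, the prompt template, the function names, and the choice to mask prompt tokens during fine-tuning (a common convention, with `-100` being the default `ignore_index` of PyTorch's cross-entropy loss).

```python
from dataclasses import dataclass
from typing import List

# Label value that the loss is assumed to skip (PyTorch's default ignore_index).
IGNORE = -100


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one id per character."""
    return [ord(ch) for ch in text]


@dataclass
class Example:
    input_ids: List[int]
    labels: List[int]


def pretraining_example(document: str) -> Example:
    """Stage 1 (legal-oriented pre-training): every token of a raw
    legal document is a next-token prediction target."""
    ids = toy_tokenize(document)
    return Example(input_ids=ids, labels=list(ids))


def sft_example(instruction: str, response: str) -> Example:
    """Stage 2 (legal supervised fine-tuning): only response tokens are
    targets; prompt tokens are masked out (an assumed convention)."""
    prompt = f"Instruction: {instruction}\nResponse: "  # hypothetical template
    prompt_ids = toy_tokenize(prompt)
    response_ids = toy_tokenize(response)
    return Example(
        input_ids=prompt_ids + response_ids,
        labels=[IGNORE] * len(prompt_ids) + response_ids,
    )


if __name__ == "__main__":
    pt = pretraining_example("Article 1 of a statute ...")
    sft = sft_example("What is the limitation period?", "It depends on the claim.")
    supervised = sum(1 for label in sft.labels if label != IGNORE)
    print(len(pt.input_ids), len(sft.input_ids), supervised)
```

The difference between the stages in this sketch is only which tokens contribute to the loss: all of them for raw legal text, and only the answer for instruction data.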
Other Highlights

LawGPT outperforms the open-source LLaMA 7B model on downstream legal tasks. The model employs large-scale Chinese legal documents for pre-training and creates a knowledge-driven instruction dataset for fine-tuning. The code and resources are publicly available on GitHub.
Related Work

Related work applies general-purpose LLMs, such as LLaMA and OpenAI's GPT-3, to legal tasks. The paper presents LawGPT as the first open-source model designed specifically for Chinese legal applications.
