The Dawn of Natural Language to SQL: Are We Fully Ready?

Abstract

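Translating users' natural language questions into SQL queries (NL2SQL) significantly lowers the barrier to accessing relational databases. The emergence of large language models has introduced a new paradigm for NL2SQL tasks and greatly enhanced their capabilities. This raises a critical question, however: are we fully ready to deploy NL2SQL models in production? To address this question, we propose NL2SQL360, a multi-angle NL2SQL evaluation framework that helps researchers design and test new NL2SQL methods. Using NL2SQL360, we conduct a detailed comparison of leading NL2SQL methods across a range of application scenarios, such as different data domains and SQL characteristics, offering valuable insights for selecting the most suitable NL2SQL method for specific needs. In addition, we explore the NL2SQL design space, using NL2SQL360 to automatically identify an optimal NL2SQL solution tailored to user-specific needs. Specifically, NL2SQL360 identifies an effective NL2SQL method, SuperSQL, distinguished on the Spider dataset by the execution accuracy metric. Notably, SuperSQL achieves competitive performance, with execution accuracy of 87% and 62.66% on the Spider and BIRD test sets, respectively.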
Problem Addressed

NL2SQL models are becoming more capable with the emergence of Large Language Models, but the question is whether we are ready to deploy them in production. This paper presents an evaluation framework, NL2SQL360, to facilitate the design and testing of NL2SQL methods across various application scenarios.
Key Idea

The NL2SQL360 framework automates the identification of an optimal NL2SQL solution tailored to user-specific needs. The framework was used to identify an effective NL2SQL method, SuperSQL, which achieved competitive performance on the Spider and BIRD test sets.
Other Highlights

The paper offers valuable insights for selecting the most appropriate NL2SQL methods for specific needs, and explores the NL2SQL design space. The experiments were conducted on various data domains and SQL characteristics. The SuperSQL method achieved execution accuracy of 87% and 62.66% on the Spider and BIRD test sets, respectively. The paper also provides open-source code for NL2SQL360.
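
The execution accuracy figures above are easier to interpret with a concrete picture of the metric. The following is a minimal sketch, not the NL2SQL360 implementation: it assumes a SQLite database, treats a prediction as correct when it returns the same multiset of rows as the gold query, and ignores row order (a simplification, since queries with ORDER BY may need an ordered comparison). The function names, the `singer` table, and the sample queries are hypothetical and chosen only for illustration.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Return the result rows as a multiset, or None if the query fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, pairs):
    """Fraction of (predicted_sql, gold_sql) pairs with matching results.

    Assumption: results are compared as unordered multisets of rows.
    """
    if not pairs:
        return 0.0
    correct = 0
    for predicted_sql, gold_sql in pairs:
        gold = run_query(conn, gold_sql)
        predicted = run_query(conn, predicted_sql)
        # A failing prediction (None) never counts as correct.
        if gold is not None and predicted == gold:
            correct += 1
    return correct / len(pairs)


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER)")
    conn.executemany(
        "INSERT INTO singer VALUES (?, ?, ?)",
        [(1, "Ann", 30), (2, "Bob", 45), (3, "Cy", 52)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    pairs = [
        # Different SQL text, same result rows: counted as correct.
        ("SELECT name FROM singer WHERE age >= 45",
         "SELECT name FROM singer WHERE age > 40"),
        # Different counts: counted as incorrect.
        ("SELECT count(*) FROM singer",
         "SELECT count(*) FROM singer WHERE age > 40"),
    ]
    print(execution_accuracy(conn, pairs))
```

Comparing executed results rather than SQL text means that two syntactically different queries can both be counted as correct, which is why execution accuracy is usually reported separately from exact-match metrics.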
Related Work

Related studies include "Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning" and "SQLNet: Generating Structured Queries from Natural Language Without Reinforcement Learning".

