来自今天的爱可可AI前沿推介
[CL] Natural Language to Code Generation in Interactive Data Science Notebooks
P Yin, W Li, K Xiao, A Rao, Y Wen, K Shi, J Howland, P Bailey, M Catasta, H Michalewski, A Polozov, C Sutton
[Google]
交互式数据科学笔记本中用自然语言生成代码
要点:
-
计算式笔记本,如Jupyter Notebooks,是数据科学家用于数据整理和数据分析任务的交互式计算环境; -
创建了ARCADE,一个用数据科学笔记本中pandas数据分析框架的1082个代码生成问题的基准; -
开发了PaChiNCo,一种针对Python计算笔记本的62B代码语言模型(LM),显著优于开放代码语言模型。
摘要:
计算式笔记本,如Jupyter Notebooks,是数据科学家常用的交互式计算环境,可以执行数据整理和数据分析任务。为了衡量人工智能与程序员配对开发的性能,即根据用户的自然语言(NL)意图自动合成这些任务的代码,本文构建了ARCADE,用数据科学笔记本中的pandas数据分析框架解决1082个代码生成问题的基准。ARCADE具有来自同一笔记本的多轮自然语言到编码问题。其需要一个模型来理解丰富的多模态上下文,例如现有的笔记本单元格及其执行状态以及之前的交互回合。为了在这项具有挑战性的任务上建立强大的基线,本文开发了PaChiNCo,Python计算笔记本的62B代码语言模型(LM),其性能明显优于开放代码语言模型。本文探索了少样本提示策略,通过分步分解和自然语言解释来获得更好的代码,显示了提高模型预测多样性和可解释性的潜力。
Computational notebooks, such as Jupyter notebooks, are interactive computing environments that are ubiquitous among data scientists to perform data wrangling and analytic tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1082 code generation problems using the pandas data analysis framework in data science notebooks. ARCADE features multiple rounds of NL-to-code problems from the same notebook. It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. Finally, we explore few-shot prompting strategies to elicit better code with step-by-step decomposition and NL explanation, showing the potential to improve the diversity and explainability of model predictions.
论文链接:https://arxiv.org/abs/2212.09248



内容中包含的图片若涉及版权问题,请及时与我们联系删除


评论
沙发等你来抢