VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning

简介

Rust是一种编程语言，它结合了内存安全和底层控制，提供类似于C的性能，同时默认情况下保证不存在未定义的行为。Rust日益普及已经引起了在将现有代码库安全正确地转换为Rust方面的研究。现有的工作分为两类：基于规则和基于大型语言模型(LLM)。虽然基于规则的方法理论上可以产生正确的转换，并保持与原始输入输出的等价性，但它们通常会产生使用Rust语言的不安全子集的难以阅读的Rust代码。另一方面，虽然基于LLM的方法通常会产生更易读、更易维护和更安全的代码，但它们并不提供任何关于正确性的保证。在这项工作中，我们提出了VERT，一种可以产生具有正确性正式保证的易读Rust转换的工具。VERT的唯一要求是源语言需要有Web Assembly编译器，这对于大多数主要语言来说都是成立的。VERT首先使用Web Assembly编译器获得一个oracle Rust程序。同时，VERT使用LLM生成一个易读的候选Rust程序。该候选程序会与oracle进行验证，如果验证失败，我们会生成一个新的候选转换，直到验证成功为止。我们通过转换从竞争编程风格基准测试中提取的1,394个程序对VERT进行评估。将Anthropic的Claude-2和VERT结合使用，相比仅使用Claude，Rust转换通过基于属性的测试从31%提高到54%，通过有界模型检查从1%提高到42%。此外，我们还评估了VERT在使用指针的实际C项目中生成非平凡安全Rust的能力。我们的结果提供了关于LLM编写安全Rust的限制的见解。
图表
解决问题

VERT: Verified Efficient Rust Transpiler
关键思路

VERT is a tool that can produce readable Rust transpilations with formal guarantees of correctness by using an LLM to generate a readable candidate Rust program and verifying it against an oracle Rust program obtained from a Web Assembly compiler for the source language.
其它亮点

VERT was evaluated by transpiling a suite of 1,394 programs taken from competitive programming style benchmarks, and by generating non-trivial safe Rust on programs taken from real-world C projects that make significant use of pointers. Combining Anthropic's Claude-2 and VERT increases Rust transpilations passing property-based testing and bounded model-checking compared to using Claude alone.
相关研究

Existing work on safe and correct transpiling of existing code-bases to Rust falls into two categories: rule-based and large language model (LLM)-based. Rule-based approaches can produce correct transpilations that maintain input-output equivalence to the original, but often yield unreadable Rust code that uses unsafe subsets of the Rust language. LLM-based approaches typically produce more readable, maintainable, and safe code, but do not provide any guarantees about correctness.

VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning

评论