Explore tweets tagged as #miniCodeProps
@wellecks
Sean Welleck
1 year
Can LLMs prove that code is correct? New paper: "miniCodeProps: a Minimal Benchmark for Proving Code Properties" https://t.co/emuMuSaGLZ miniCodeProps tests LLMs' ability to prove properties of simple Lean programs. Despite its simplicity, it's challenging! Led by Evan Lohn
5
35
145
@BensenHsu
BensenHsu
1 year
@wellecks The authors evaluated two baseline approaches for automatically proving the properties in "miniCodeProps": a next-step tactic generation method using the LLMStep framework, and a full proof generation approach using GPT-4. They found that these baselines were able to prove only
0
1
1
@Jose_A_Alonso
José A. Alonso
1 year
miniCodeProps: A minimal benchmark for proving code properties. ~ Evan Lohn, Sean Welleck. https://t.co/TGSBqw4rN5 #LLMs #ITP #Lean4 #FunctionalProgramming #Haskell
0
3
7
@ComputerPapers
Software Engineering
1 year
miniCodeProps: a Minimal Benchmark for Proving Code Properties.
0
0
0
@bonybean
Bony Bean
1 year
CMU Researchers Propose miniCodeProps: A Minimal AI Benchmark for Proving Code Properties
0
0
0
@Jose_A_Alonso
José A. Alonso
1 year
miniCodeProps: A minimal benchmark for proving properties of code. ~ Evan Lohn & Sean Welleck. https://t.co/4ak8WEYF2z #ITP #LeanProver #Lean4 #LLMs
0
0
0
@TheGrizztronic
Josh Cason
1 year
"Despite its simplicity, miniCodeProps is challenging for current LLM-based provers, which succeed in proving about 25 percent of the specifications."
0
0
2
@wellecks
Sean Welleck
1 year
5. miniCodeProps: a Minimal Benchmark for Proving Code Properties https://t.co/emuMuSbeBx At Safe Generative AI, Sunday, Exhibition Hall A https://t.co/IdyxVKi5kt
0
0
2