#miniCodeProps X Hashtag

Sean Welleck

@wellecks

1 year

Can LLMs prove that code is correct? New paper: "miniCodeProps: a Minimal Benchmark for Proving Code Properties" https://t.co/emuMuSaGLZ miniCodeProps tests LLMs' ability to prove properties of simple Lean programs. Despite its simplicity, it's challenging! Led by Evan Lohn

BensenHsu

@BensenHsu

1 year

@wellecks The authors evaluated two baseline approaches for automatically proving the properties in "miniCodeProps": a next-step tactic generation method using the LLMStep framework, and a full proof generation approach using GPT-4. They found that these baselines were able to prove only…

Vlad Ruso PhD

@vlruso

1 year

CMU Researchers Propose miniCodeProps: A Minimal AI Benchmark for Proving Code Properties https://t.co/6EiFpMCwOW #AIinCodeVerification #miniCodeProps #TheoremProving #AutomationInTech #FutureOfProgramming #ai #news #llm #ml #research #ainews #innovation #artificialintelligen…

José A. Alonso

@Jose_A_Alonso

1 year

miniCodeProps: A minimal benchmark for proving code properties. ~ Evan Lohn, Sean Welleck. https://t.co/TGSBqw4rN5 #LLMs #ITP #Lean4 #FunctionalProgramming #Haskell

Software Engineering

@ComputerPapers

1 year

miniCodeProps: a Minimal Benchmark for Proving Code Properties.

Bony Bean

@bonybean

1 year

CMU Researchers Propose miniCodeProps: A Minimal AI Benchmark for Proving Code Properties

José A. Alonso

@Jose_A_Alonso

1 year

miniCodeProps: A minimal benchmark for proving properties of code. ~ Evan Lohn & Sean Welleck. https://t.co/4ak8WEYF2z #ITP #LeanProver #Lean4 #LLMs

Josh Cason

@TheGrizztronic

1 year

"Despite its simplicity, miniCodeProps is challenging for current LLM-based provers, which succeed in proving about 25 percent of the specifications."

Sean Welleck

@wellecks

1 year

5. miniCodeProps: a Minimal Benchmark for Proving Code Properties https://t.co/emuMuSbeBx At Safe Generative AI, Sunday, Exhibition Hall A https://t.co/IdyxVKi5kt
