Niels Mündler (@ ICML) @nielstron X Profile

Niels Mündler (@ ICML)

@nielstron

Followers

597

Following

3K

Media

191

Statuses

1K

CS PhD @eth. Language Models, Code, Formal verification. Compiling Python to FP @OpShinDev. Ex-Founder.

Switzerland

Joined November 2012

Niels Mündler (@ ICML)

@nielstron

6 days

And this is why benchmark scores keep increasing but my GitHub issues still require manual labor

0

1

Niels Mündler (@ ICML)

@nielstron

9 days

RT @mbalunovic: We've just released the largest open dataset of expert-annotated LLM proofs! Using this dataset, we did a bunch of experiment…

0

4

0

Niels Mündler (@ ICML)

@nielstron

12 days

RT @ni_jovanovic: There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image ge…

0

54

0

Niels Mündler (@ ICML)

@nielstron

18 days

PLDI'25 is happening now, and I am looking forward to presenting "Type-Constrained Code Generation with Language Models" in 2 hours (4pm Seoul time). If you can't make it, feel free to check out this thread, which summarizes the key points, or chat me up with any questions.

Niels Mündler (@ ICML)

@nielstron

2 months

Excited to present my upcoming PLDI paper at the ICLR Workshops DL4C and VerifAI! Type systems are useful for preventing bugs - so why not leverage them for LLMs? Using constrained decoding, we reduce compiler errors of TypeScript code by over 50%! More details in the 🧵.

0

1

4

Niels Mündler (@ ICML)

@nielstron

1 month

Yudkowsky scream going from ultrasonic to. .

Sakana AI

@SakanaAILabs

1 month

Introducing The Darwin Gödel Machine: AI that improves itself by rewriting its own code. The Darwin Gödel Machine (DGM) is a self-improving agent that can modify its own code. Inspired by evolution, we maintain an expanding lineage of agent variants,

0

1

Niels Mündler (@ ICML)

@nielstron

2 months

I am beyond humbled to see my paper there! Glad that people like it :)

0

2

Niels Mündler (@ ICML)

@nielstron

2 months

so apparently non-big-AI-lab papers *can* trend on Hacker News too.

Even Better Cameron Pfiffer 📎

@cameron_pfiffer

2 months

Suspicious. We put out this tweet thread and then we have the paper on the cover of hn? Correlation, or causation. You decide

1

6

Niels Mündler (@ ICML)

@nielstron

2 months

RT @dottxtai: Very cool paper. @jingxuan_he.

0

5

0

Niels Mündler (@ ICML)

@nielstron

2 months

RT @mark_veroe: If you are at ICLR, come by our posters in the DL4C (Garnet 218-219) and BuildingTrust (Hall 4 #6) workshops. We have now e…

0

3

0

Niels Mündler (@ ICML)

@nielstron

2 months

8/ This work was done in collaboration with my co-lead @jingxuan_he and @MogicianTony @dawnsongtweets @koushik77 @mvechev, jointly between @the_sri_lab and @UCBerkeley. We will present it at the ICLR 25 workshops DL4C and VerifAI in Singapore! Excited to discuss it there :)

0

2

Niels Mündler (@ ICML)

@nielstron

2 months

7/ Most importantly, this method is orthogonal to model scaling - it can be applied on top of any model, however good. We also see that even SOTA models (GPT-4o, o1 and R1) make type errors in more complex tasks and languages such as Rust.

1

0

1

Niels Mündler (@ ICML)

@nielstron

2 months

6/ The work pays off! For Gemma 2 27B, Qwen 2.5 and DeepSeek Coder we see a drop of around 80% in compiler errors.

1

0

1

Niels Mündler (@ ICML)

@nielstron

2 months

5/ Most difficult is deciding whether a partial expression *can* be completed to have a specific type. We implement a search over possible completion types and a heuristic to prune the search space effectively.

1

0

1
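To illustrate the question raised in 5/ (can a partial expression still be completed to a given type?), here is a minimal sketch of a bounded search over completion types. It is an assumption-laden toy, not the paper's algorithm: the `EXTENSIONS` table, the function name `can_reach`, and the `max_steps` bound (standing in for the pruning heuristic) are all hypothetical, and only a handful of TypeScript members are modeled.

```python
from collections import deque

# Hypothetical toy table: for each type, the ways an expression of that type
# can be extended, and the type of the extended expression.
EXTENSIONS = {
    "number": {".toString()": "string", " > 0": "boolean"},
    "string": {".length": "number", ".split(' ')": "string[]"},
    "string[]": {".length": "number"},
    "boolean": {},
}


def can_reach(start: str, target: str, max_steps: int = 3) -> bool:
    """Breadth-first search: can an expression of type `start` be extended
    into one of type `target` using at most `max_steps` extensions?"""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        current, steps = queue.popleft()
        if current == target:
            return True
        if steps == max_steps:
            continue  # assumed pruning: do not explore beyond the step bound
        for next_type in EXTENSIONS.get(current, {}).values():
            if next_type not in seen:
                seen.add(next_type)
                queue.append((next_type, steps + 1))
    return False


# A string expression can still become a boolean (e.g. via `.length > 0`),
# but a boolean expression has no modeled extension that yields a string.
string_to_boolean = can_reach("string", "boolean")
boolean_to_string = can_reach("boolean", "string")
```

In this toy, a decoder would reject a partial expression only when `can_reach` returns `False` for the type the context demands.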

Niels Mündler (@ ICML)

@nielstron

2 months

4/ We introduce prefix automata to enable reasoning over *eventual* type-safety. These automata track all possible ASTs of a partial program as states. They guarantee reachability of an accepting state (i.e., type-safe program) as long as one AST is found.

1

0

1
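The prefix property from 4/ (every reachable state can still reach an accepting state) can be sketched on a toy language. This is a hypothetical illustration, not the paper's construction: the token alphabet (`n` for a number literal, `+`, `(`, `)`), the state encoding, and all function names are assumptions, and the toy tracks only syntax, where the paper's automata also track types.

```python
from typing import FrozenSet, Iterable, Tuple

# State = (number of open parentheses, whether an operand is still required).
State = Tuple[int, bool]
INITIAL: FrozenSet[State] = frozenset({(0, True)})


def successors(state: State, token: str) -> FrozenSet[State]:
    """States reachable from `state` by reading one token (empty if none)."""
    depth, need_operand = state
    if token == "n" and need_operand:
        return frozenset({(depth, False)})
    if token == "(" and need_operand:
        return frozenset({(depth + 1, True)})
    if token == "+" and not need_operand:
        return frozenset({(depth, True)})
    if token == ")" and not need_operand and depth > 0:
        return frozenset({(depth - 1, False)})
    return frozenset()


def run(tokens: Iterable[str]) -> FrozenSet[State]:
    """Track the set of all states (possible parses) after reading `tokens`."""
    states = INITIAL
    for token in tokens:
        states = frozenset(s for st in states for s in successors(st, token))
    return states


def is_viable_prefix(tokens: Iterable[str]) -> bool:
    # Every non-empty state set can be completed: supply a missing operand,
    # then close all open parentheses. So non-empty means "still completable".
    return bool(run(tokens))


def is_complete(tokens: Iterable[str]) -> bool:
    return any(depth == 0 and not need for depth, need in run(tokens))
```

For example, `is_viable_prefix(list("(n+"))` holds even though the program is not yet complete, while `list("n)")` is rejected as soon as the stray `)` is read.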

Niels Mündler (@ ICML)

@nielstron

2 months

3/ We propose leveraging the type system to constrain generation. We prevent sampling tokens from the LLM that violate type-safety. This is hard: appropriate constraints have to be built from scratch for every language. We choose TypeScript as a target due to its popularity.

1

0

1
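The token filtering described in 3/ can be sketched as one greedy decoding step. This is a minimal sketch under stated assumptions: `constrained_greedy_step` and `toy_is_viable` are hypothetical names, the logits are made up, and the viability check is a one-rule stand-in for a real type checker on partial programs.

```python
from typing import Callable, Dict


def constrained_greedy_step(
    prefix: str,
    logits: Dict[str, float],
    is_viable: Callable[[str], bool],
) -> str:
    """Return the highest-scoring token whose addition keeps the prefix viable."""
    allowed = {
        token: score
        for token, score in logits.items()
        if is_viable(prefix + token)
    }
    if not allowed:
        raise ValueError("no token keeps the prefix viable")
    return max(allowed, key=allowed.get)


def toy_is_viable(program: str) -> bool:
    # Hypothetical stand-in for a type checker on partial programs:
    # a `number` variable must not be initialised with a string literal.
    return not program.startswith('let x: number = "')


# Made-up scores: the unconstrained model prefers opening a string literal.
logits = {'"': 2.0, "4": 1.5, "x": 0.1}
unconstrained = max(logits, key=logits.get)
constrained = constrained_greedy_step("let x: number = ", logits, toy_is_viable)
```

Here the unconstrained choice is `"` (which would start a string literal assigned to a `number`), while the constrained step falls back to `4`, the best-scoring token that keeps the program type-safe.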

Niels Mündler (@ ICML)

@nielstron

2 months

2/ Prior work demonstrated using constrained decoding to guarantee syntactic correctness, or enforce simple constraints such as correct SQL column names. But syntax errors make up only 3.5% of all code errors in our experiments, and only 6% of compiler errors.

1

0

1

Niels Mündler (@ ICML)

@nielstron

2 months

1/ For all the details, check out the paper. We provide a detailed explanation of our constraining approach and an extensive evaluation of 6 SOTA open-weight Code-LLMs - which all improve through our method!

1

0

1

Niels Mündler (@ ICML)

@nielstron

2 months

Excited to present my upcoming PLDI paper at the ICLR Workshops DL4C and VerifAI! Type systems are useful for preventing bugs - so why not leverage them for LLMs? Using constrained decoding, we reduce compiler errors of TypeScript code by over 50%! More details in the 🧵.

1

2

5