Niels Mündler (@ ICML)

@nielstron

Followers
597
Following
3K
Media
191
Statuses
1K

CS PhD @eth. Language Models, Code, Formal verification. Compiling Python to FP @OpShinDev. Ex-Founder.

Switzerland
Joined November 2012
@nielstron
Niels Mündler (@ ICML)
6 days
And this is why benchmark scores keep increasing but my GitHub issues still require manual labor
0
0
1
@nielstron
Niels Mündler (@ ICML)
9 days
RT @mbalunovic: We've just released the largest open dataset of expert-annotated LLM proofs! Using this dataset, we did bunch of experiment….
0
4
0
@nielstron
Niels Mündler (@ ICML)
12 days
RT @ni_jovanovic: There's a lot of work now on LLM watermarking. But can we extend this to transformers trained for autoregressive image ge….
0
54
0
@nielstron
Niels Mündler (@ ICML)
18 days
PLDI'25 is happening now and I am looking forward to presenting "Type-Constrained Code Generation with Language Models" in 2 hours (4pm Seoul Time). If you can't make it, feel free to check out this thread, which summarizes the key points, or chat me up with any questions.
@nielstron
Niels Mündler (@ ICML)
2 months
Excited to present my upcoming PLDI paper at the ICML Workshops DL4C and VerifAI! Type systems are useful for preventing bugs - so why not leverage them for LLMs? Using constrained decoding, we reduce compiler errors of TypeScript code by over 50%! More details in the 🧵.
0
1
4
@nielstron
Niels Mündler (@ ICML)
1 month
Yudkowsky scream going from ultrasonic to. .
@SakanaAILabs
Sakana AI
1 month
Introducing The Darwin Gödel Machine: AI that improves itself by rewriting its own code. The Darwin Gödel Machine (DGM) is a self-improving agent that can modify its own code. Inspired by evolution, we maintain an expanding lineage of agent variants,
0
0
1
@nielstron
Niels Mündler (@ ICML)
2 months
I am beyond humbled to see my paper there! Glad that people like it :)
0
0
2
@nielstron
Niels Mündler (@ ICML)
2 months
so apparently non-big-AI-lab papers *can* trend on Hacker News too.
@cameron_pfiffer
Even Better Cameron Pfiffer 📎
2 months
Suspicious. We put out this tweet thread and then we have the paper on the cover of hn? Correlation, or causation. You decide
1
1
6
@nielstron
Niels Mündler (@ ICML)
2 months
RT @dottxtai: Very cool paper. @jingxuan_he.
0
5
0
@nielstron
Niels Mündler (@ ICML)
2 months
RT @mark_veroe: If you are at ICLR, come by our posters in the DL4C (Garnet 218-219) and BuildingTrust (Hall 4 #6) workshops. We have now e….
0
3
0
@nielstron
Niels Mündler (@ ICML)
2 months
8/ This work was done in collaboration with my co-lead @jingxuan_he and @MogicianTony @dawnsongtweets @koushik77 @mvechev, jointly between @the_sri_lab and @UCBerkeley. We will present it at the ICLR 25 workshops DL4C and VerifAI in Singapore! Excited to discuss it there :)
0
0
2
@nielstron
Niels Mündler (@ ICML)
2 months
7/ Most importantly, this method is orthogonal to model scaling - it can be applied on top of any model, however good. We also see that even SOTA GPT-4o, o1 and R1 make type errors in more complex tasks and languages such as Rust.
1
0
1
@nielstron
Niels Mündler (@ ICML)
2 months
6/ The work pays off! For Gemma 2 27B, Qwen 2.5 and DeepSeek Coder we see a drop of around 80% in compiler errors.
1
0
1
@nielstron
Niels Mündler (@ ICML)
2 months
5/ Most difficult is deciding whether a partial expression *can* be completed to have a specific type. We implement a search over possible completion types and a heuristic to prune the search space effectively.
1
0
1
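A minimal sketch of the idea in 5/, under loud assumptions: the type graph `EXTENSIONS` and the function `can_complete_to` are hypothetical names made up for illustration, and the depth bound is a simple stand-in for a pruning heuristic, not the heuristic from the paper. It asks whether an expression of one type can be extended (via member accesses or operators) into an expression of a target type.

```python
from collections import deque
from typing import Dict, List, Tuple

# Hypothetical toy type graph: type -> [(extension syntax, resulting type)].
EXTENSIONS: Dict[str, List[Tuple[str, str]]] = {
    "string": [(".length", "number"), (".toUpperCase()", "string")],
    "number": [(".toString()", "string"), (" > 0", "boolean")],
    "boolean": [],
}


def can_complete_to(start: str, target: str, max_depth: int = 3) -> bool:
    """Breadth-first search over types reachable from `start`.

    `max_depth` bounds how many extensions are tried; it is a placeholder
    for a real pruning heuristic (assumption, not the paper's method).
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        ty, depth = queue.popleft()
        if ty == target:
            return True
        if depth == max_depth:
            continue  # prune: do not extend this expression further
        for _syntax, result in EXTENSIONS.get(ty, []):
            if result not in seen:
                seen.add(result)
                queue.append((result, depth + 1))
    return False
```

In this toy graph a string expression can still become a boolean (for example via `.length` followed by ` > 0`), while a boolean expression can never become a number, so a decoder could rule out the latter early.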
@nielstron
Niels Mündler (@ ICML)
2 months
4/ We introduce prefix automata to enable reasoning over *eventual* type-safety. These automata track all possible ASTs of a partial program as states. They guarantee reachability of an accepting state (i.e., type-safe program) as long as one AST is found.
1
0
1
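The prefix property from 4/ can be illustrated with a toy automaton. This is a sketch under stated assumptions, not the construction from the paper: the class `PrefixAutomaton`, the state names, and the token alphabet are all hypothetical. The automaton tracks a set of candidate states and discards any state from which no accepting state is reachable, so a non-empty state set means the partial program can still be completed.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Transitions = Dict[Tuple[str, str], Set[str]]


class PrefixAutomaton:
    def __init__(self, transitions: Transitions, start: str, accepting: Set[str]):
        self.transitions = transitions
        self.start = start
        self.accepting = accepting
        # States from which an accepting state is reachable (backward fixpoint).
        self.live = set(accepting)
        changed = True
        while changed:
            changed = False
            for (src, _tok), dsts in transitions.items():
                if src not in self.live and dsts & self.live:
                    self.live.add(src)
                    changed = True

    def run(self, tokens: List[str]) -> FrozenSet[str]:
        """Return the live states after reading `tokens` (empty if dead)."""
        states = {self.start} & self.live
        for tok in tokens:
            nxt: Set[str] = set()
            for s in states:
                nxt |= self.transitions.get((s, tok), set())
            states = nxt & self.live  # drop states that can never accept
        return frozenset(states)

    def viable(self, tokens: List[str]) -> bool:
        return bool(self.run(tokens))

    def accepts(self, tokens: List[str]) -> bool:
        return bool(self.run(tokens) & self.accepting)


# Toy language (no parentheses): an expression statement that must have type number.
TOY = PrefixAutomaton(
    transitions={
        ("start", "NUM"): {"num"},
        ("start", "STR"): {"str"},
        ("str", ".length"): {"num"},
        ("str", "+"): {"concat"},  # string concatenation never yields a number here
        ("num", "+"): {"start"},
        ("num", ";"): {"done"},
    },
    start="start",
    accepting={"done"},
)
```

Here `TOY.viable(["STR"])` holds because `.length` can still turn the string into a number, whereas `TOY.viable(["STR", ";"])` does not: the statement has ended with the wrong type.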
@nielstron
Niels Mündler (@ ICML)
2 months
3/ We propose leveraging the type system to constrain generation. We prevent sampling tokens from the LLM that violate type-safety. This is hard: appropriate constraints have to be built from scratch for every language. We choose TypeScript as a target due to its popularity.
1
0
1
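The token-filtering step in 3/ can be sketched as follows. Everything here is an illustrative assumption: `constrained_greedy_step`, `toy_viable`, and the hard-coded scores are hypothetical, and the "type checker" only understands a single declaration of the form `let x: number = <digits>;`. A real system would plug in a full prefix-aware type checker instead.

```python
from typing import Callable, Dict

DECL = "let x: number = "


def toy_viable(prefix: str) -> bool:
    """Toy stand-in for a type checker: can `prefix` still become
    `let x: number = <digits>;`? (Assumption: only this one form exists.)"""
    if not prefix.startswith(DECL):
        return DECL.startswith(prefix)
    rest = prefix[len(DECL):]
    body = rest[:-1] if rest.endswith(";") else rest
    if rest.endswith(";") and not body:
        return False  # a terminated declaration needs at least one digit
    return body == "" or body.isdigit()


def constrained_greedy_step(
    prefix: str,
    scores: Dict[str, float],
    is_viable_prefix: Callable[[str], bool],
) -> str:
    """Pick the highest-scoring token that keeps the prefix viable."""
    allowed = {tok: s for tok, s in scores.items() if is_viable_prefix(prefix + tok)}
    if not allowed:
        raise ValueError("no token keeps the prefix viable")
    return max(allowed, key=allowed.get)


# Hypothetical model scores: the unconstrained argmax is a string literal.
SCORES = {'"hello"': 2.0, "42": 1.5, ";": 0.1}
CHOSEN = constrained_greedy_step(DECL, SCORES, toy_viable)
```

With these made-up scores the model's top choice would assign a string to a `number` variable; the filter removes it and the next-best viable token is chosen instead.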
@nielstron
Niels Mündler (@ ICML)
2 months
2/ Prior work demonstrated using constrained decoding to guarantee syntactic correctness, or enforce simple constraints such as correct SQL column names. But syntax errors make up only 3.5% of all code errors in our experiments, and only 6% of compiler errors.
1
0
1
@nielstron
Niels Mündler (@ ICML)
2 months
1/ For all the details, check out the paper. We provide a detailed explanation of our constraining approach and an extensive evaluation of 6 SOTA open-weight Code-LLMs - which all improve through our method!
1
0
1
@nielstron
Niels Mündler (@ ICML)
2 months
Excited to present my upcoming PLDI paper at the ICML Workshops DL4C and VerifAI! Type systems are useful for preventing bugs - so why not leverage them for LLMs? Using constrained decoding, we reduce compiler errors of TypeScript code by over 50%! More details in the 🧵.
1
2
5