Sebastian Pokutta
@spokutta
Followers
1K
Following
1K
Media
52
Statuses
850
if rising_edge(clk) then: Artificial Intelligence, Optimization, and Machine Learning. Prof @TUBerlin, Vice President @ZuseInstitute else: yield.
Berlin, Germany
Joined June 2009
I used ChatGPT to solve an open problem in convex optimization. *Part III* 1/N https://t.co/6HDAr6y8Z9
7
49
403
When we worked on the Cell to Sentence paper two years ago, it was hard to imagine this result. Back then, we were only training tiny models like the ones you fine here https://t.co/hZ2DrXLsDO By scaling the model to 27b parameters and more data, the same method generated a
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells. With more preclinical and clinical tests,
0
5
34
Our book on conditional gradients and Frank-Wolfe methods with Gábor Braun, @ACarderera, @CyrilleCmbt, Hamed Hassani, @aminkarbasi and Aryan Mokhtari just appeared in the MOS-SIAM Series on Optimization #ml #optimization #ai
https://t.co/VMNrLTv8ZO
epubs.siam.org
0
2
16
New blog post: committing to secrets via hashing - a short, first-principles introduction #crypto #hashing => https://t.co/4iqWEnDGvt
1
0
0
Our paper "Computational Algebra with Attention: Transformer Oracles for Border Basis Algorithms" has been accepted to #NeurIPS2025! I will most likely be at EurIPS in Copenhagen :) arXiv: https://t.co/mWL4Tk4v5l coauthors: @HiroshiKera, N. Pelleriti, Y. Ishihara, @spokutta
0
2
7
12/ TL;DR: While this model is simplified, maybe it is time to rethink acceptance rates as a measure to tackle the chaos in ML/AI publishing. Lowering acceptance rates alone seems the wrong lever. The math is unforgiving: #accepted_papers ≈ arrivals, backlog scales with 1/p.
0
0
3
11/ This might provide an alternative perspective to the current discussion around NeurIPS 2025’s (senior) area chairs having to "recalibrate their pool" (potentially rejecting good papers to hit the target rate): we might simply be kicking the can further down the road 🤯
1
0
3
10/ 👉 True "acceptance rate" is not p but much higher because authors resubmit and pool is backlogged, more like ≈ 1 - (1-p)^T with patience T 👉 Core quantity is pT as 1 - (1-p)^T ≈ 1 - e^{-pT}, i.e., lower p can be traded with higher T (within reasonable ranges)
1
0
0
9/ 👉 Idealized case = no giving up. Acceptances track arrivals, not arbitrary per-call acceptance rate. Lowering p (beyond reasonable point) lets **review load explode**. 👉 Non-idealized case = finite patience. Lower p still hits average papers hardest, while inflating costs.
1
0
0
8/ Little’s Law view: acceptance per call ≈ arrivals N, **independent of p**. In reality p controls the **pool size L**, i.e. the burden on reviewers.
1
0
0
7/ Translation: If you cut acceptance from 30% to 20%, you still accept N papers/call at steady-state. The only difference? Reviewers now face ~ N/0.2 instead of ~ N/0.3. Especially at moderately low p the gradient is brutal. In the complex model: slight effect on final mix 📈
1
0
0
6/ Check this out: Two setups with different accept rates, backlogs but #accepted_papers is the same (for the math + visualization see: https://t.co/rdf4rSUmjW)
1
0
1
5/ No giving up case: * W = 1/p calls avg waiting time * L = N/p papers under review * λ = N #accepted_papers per call 👉 #accepted_papers stays at N **independent of p**. Lower p ≠ more selectivity; just blows up the backlog. Math with finite patience T is not much harder.
2
0
0
4/ The described setup is literally a queue: new ones enter, resubmitted ones just keep sitting in the queue: Little’s Law applies: L = λ W * L = avg papers in system (the “paper pool”) * λ = throughput (# accepted per call) * W = avg waiting time (# calls a paper is in system)
1
0
0
3/ Imagine each call = one “service round.” * N new papers arrive * Each paper has a per-call acceptance hazard p * Authors keep resubmitting until accepted in the post: N can also increase, different p for different classes, also giving up at some point
1
0
0
2/ We simplify in this thread. For the full version with different paper classes (bad, average, great) and simulation see the full post on David's blog: https://t.co/fBatvixCii
damaru2.github.io
Welcome to my academic site.
1
0
0
1/ Why lowering conference acceptance rates doesn’t change #accepted_papers (but explodes reviewer workload). A short queueing story about ML/AI conferences, resubmissions, and Little’s Law #neurips #ml #ai @montreal_ai #icml @MarkSchmidtUBC @roydanroy 🧵
2
2
15
It’s hard to overstate the significance of this. It may end up looking like a “moon‑landing moment” for AI. Just to spell it out as clearly as possible: a next-word prediction machine (because that's really what it is here, no tools no nothing) just produced genuinely creative
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
61
162
1K
My brilliant student @AnayMehrotra and amazing post-doc Alkis Kalavasis just won the Best Paper Award at COLT 2025! Clearly, they’ve cracked the code: do great math, write a killer paper, take home the trophy. Huge congratulations to both of them. Now please don’t forget your
arxiv.org
Most of the widely used estimators of the average treatment effect (ATE) in causal inference rely on the assumptions of unconfoundedness and overlap. Unconfoundedness requires that the observed...
13
11
179
🎉Our paper "Neural Discovery in Mathematics: Do Machines Dream of Colored Planes?" was selected for an ORAL (top 1%) presentation at #ICML25! 🎉 arXiv: https://t.co/VkFjsaIFdx blog: https://t.co/PdqhvtrY6o 🧵: coming soon
1
2
6