
Sumeet Motwani
@sumeetrm
Followers
1K
Following
4K
Media
43
Statuses
295
Research Intern@Microsoft Phi | ML PhD at Oxford, Previously CS at UC Berkeley
Redmond, WA
Joined February 2024
Introducing MALT: Improving Reasoning with Multi-Agent LLM Training 🫡. We present a new multi-agent post-training method that uses credit assigned synthetic data to improve the reasoning capabilities and self-correction rates of a generator, critic, and refinement model working
13
52
308
RT @ryan_kidd44: MATS 9.0 applications are open! Launch your career in AI alignment, governance, and security with our 12-week research pro…
0
53
0
cs 189.
To all undergrads interested in learning about AI: be wary of taking “Intro to AI” as your first AI course. In many programs, the class you actually want first is “Intro to Machine Learning”. AI technology has exploded in the past 15 years thanks to deep neural networks. Yet at
0
0
4
RT @pratyushmaini: 1/Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today @datologyai shares…
0
125
0
Glad to see the focus on Information Theory and Cryptography. Probably one of the most important (yet understudied) areas in AI security
New £15,000,000 available for technical AI alignment and security work. International coalition includes UK AISI, Canadian AISI, Schmidt, AWS, UK ARIA. Likely more £ coming in future. 🚨🚨 Please help make sure all potential good applicants know & apply by 10 Sept. 🚨🚨
0
0
7
RT @VaishShrivas: Test-time scaling w/ GRPO boosts accuracy, but also adds “filler tokens” increasing length w/o real progress. We present…
0
48
0
RT @valentina__py: For the SoLaR workshop @COLM_conf we are soliciting opinion abstracts to encourage new perspectives and opinions on res…
0
12
0
RT @kuchaev: Everything about Llama-Nemotron-Super-V1.5 post-training is now open: Synthetic data: Human data: http…
github.com
Scalable toolkit for efficient model reinforcement - NVIDIA-NeMo/RL
0
49
0
RT @guohao_li: Introducing Eigent - the first multi-agent workforce on your desktop. Eigent is a team of AI agents collaborating to comple…
0
138
0
RT @prfsanjeevarora: Completely misses the point. Nobody is suggesting that solving IMO problems is useful for math research. The point is…
0
38
0
Given the recent IMO results, OAI seems to have figured out reasoning *reliably* with at least 4 Million tokens.
Also this model thinks for a *long* time. o1 thought for seconds. Deep Research for minutes. This one thinks for hours. Importantly, it’s also more efficient with its thinking. And there’s a lot of room to push the test-time compute and efficiency further.
1
0
12
RT @DulhanJay: Come and find me today at #ICML2025 and let's talk about speech decoding from the brain and scaling brain-computer interfa…
0
3
0
RT @JamesAlcorn94: Plenty of brittle + narrow tooling in this wild west era of codegen can be characterized - not unfairly, and with just a h…
0
5
0