MIT NLP

@nlp_mit

Followers 4K · Following 66 · Media 3 · Statuses 88

NLP Group at @MIT_CSAIL! PIs: @yoonrkim @jacobandreas @lateinteraction @pliang279 @david_sontag, Jim Glass, @roger_p_levy

Cambridge, MA
Joined March 2025
@nlp_mit
MIT NLP
9 months
Hello everyone! We are quite a bit late to the Twitter party, but welcome to the MIT NLP Group account! Follow along for the latest research from our labs as we dive deep into language, learning, and logic 🤖📚🧠
27
54
548
@ddvd233
dvd@NeurIPS25
10 days
I will present this work today:
- Oral session 5D in Upper Level Ballroom 6CDEF at 10:20 AM
- Poster session at 11:00 AM at #1803
I'll also be giving out lab swag (keychains / stickers) during the poster session. Feel free to stop by and pick one up!
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
7 months
QoQ-Med: Building Multimodal Clinical Foundation Models with Domain-Aware GRPO Training "we introduce QoQ-Med-7B/32B, the first open generalist clinical foundation model that jointly reasons across medical images, time-series signals, and text reports. QoQ-Med is trained with
1
6
48
@_pschro
Philip Schroeder
14 days
Excited to share our NeurIPS 2025 paper introducing our video reasoning framework, ROVER (Reasoning Over VidEo Recursively), that improves visual understanding of VLMs in embodied settings. ROVER is a recursive framework that enables the model to maintain a compact attention
1
4
3
@ddvd233
dvd@NeurIPS25
15 days
Bought a stable capybara at the Beijing Wildlife Park. Hoping it also blesses my model training with stable convergence (
5
4
77
@thecekbote
Chanakya Ekbote
19 days
Ever wondered how LLMs generalize to entirely new patterns? In our Spotlight paper at #neurips2025, we study this in a fully controlled setting and show the minimal transformer architecture needed to learn induction heads. Paper Link: https://t.co/dFnKwmh3uC 🧵👇
1
17
42
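(Context for the thread above: an induction head is the circuit that, on seeing `... a b ... a`, predicts `b` by copying what followed the earlier occurrence of `a`. Below is a minimal sketch of how this kind of synthetic data is often generated in controlled studies; it illustrates the task family, not the paper's exact setup.)

```python
import random

def make_induction_example(vocab_size=20, seq_len=12, seed=None):
    """Build one synthetic induction-head example.

    The sequence contains a trigger token `a` followed by a payload `b`.
    `a` reappears at the end, so predicting the next token correctly
    requires looking up what followed `a` earlier in the sequence.
    """
    rng = random.Random(seed)
    a, b = rng.sample(range(vocab_size), 2)
    # Keep `a` out of the filler so it appears exactly twice.
    others = [t for t in range(vocab_size) if t != a]
    filler = [rng.choice(others) for _ in range(seq_len - 3)]
    seq = [a, b] + filler + [a]   # ... a b ... a  ->  model should predict b
    return seq, b

seq, target = make_induction_example(seed=0)
print("input :", seq)
print("target:", target)  # correct continuation after the second `a`
```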
@pratyusha_PS
Pratyusha Sharma ✈️ NeurIPS
24 days
📢 Some big (& slightly belated) life updates! 1. I defended my PhD at MIT this summer! 🎓 2. I'm joining NYU as an Assistant Professor starting Fall 2026, with a joint appointment in Courant CS and the Center for Data Science. 🎉 🔬 My lab will focus on empirically studying
104
98
2K
@thecekbote
Chanakya Ekbote
1 month
How do we teach LLMs not just to reason, but to reflect, debug, and improve themselves? We at AWS AI Labs introduce MURPHY 🤖, a multi-turn RL framework that brings self-correction into #RLVR (#GRPO). 🧵👇 Link: https://t.co/3kFjI5mxR5
2
21
31
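(For intuition on the multi-turn self-correction setup named above, here is a rough sketch of the rollout loop such a framework might train over. The `generate`/`verify` interfaces and the turn budget are assumptions for illustration, not MURPHY's actual API.)

```python
def self_correction_rollout(problem, generate, verify, max_turns=3):
    """Sketch of a multi-turn self-correction rollout.

    generate(problem, history) -> candidate solution string
    verify(problem, solution)  -> (passed: bool, feedback: str), e.g. a unit test
    Each failed attempt and its feedback are appended to the history,
    so later attempts can condition on earlier mistakes.
    """
    history = []
    for turn in range(max_turns):
        solution = generate(problem, history)
        passed, feedback = verify(problem, solution)
        history.append((solution, feedback))
        if passed:
            return solution, turn + 1   # solved after `turn + 1` attempts
    return None, max_turns              # unresolved within the turn budget

# Toy demo: a scripted generator that fixes its answer on the second try.
attempts = iter(["2+2=5", "2+2=4"])
gen = lambda p, h: next(attempts)
ver = lambda p, s: (s.endswith("4"), "ok" if s.endswith("4") else "recheck the arithmetic")
print(self_correction_rollout("2+2?", gen, ver))  # ('2+2=4', 2)
```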
@shannonzshen
Shannon Shen
1 month
Today's AI agents are optimized to complete tasks in one shot. But real-world tasks are iterative, with evolving goals that need collaboration with users. We introduce collaborative effort scaling to evaluate how well agents work with people—not just complete tasks 🧵
7
52
276
@zhaofeng_wu
Zhaofeng Wu
1 month
Just arrived in Suzhou to present reWordBench at #EMNLP2025. Come to our talk to hear how SOTA reward models can easily break under minor input transformations, and how to fix it! 🗓️ Wed 11/5 🕒 3:00 PM 📍 Safety & Alignment session
@zhaofeng_wu
Zhaofeng Wu
9 months
Robust reward models are critical for alignment/inference-time algos, auto eval, etc. (e.g. to prevent reward hacking which could render alignment ineffective). ⚠️ But we found that SOTA RMs are brittle 🫧 and easily flip predictions when the inputs are slightly transformed 🍃 🧵
2
8
54
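(To make the brittleness claim concrete: a toy harness for checking whether a meaning-preserving transformation flips a reward model's preference on a chosen/rejected pair. The reward function and transform below are stand-ins, not reWordBench's actual models or transformations.)

```python
def prediction_flips(reward_fn, chosen, rejected, transform):
    """Check whether a meaning-preserving transform flips an RM's preference.

    reward_fn : maps a response string to a scalar reward (placeholder here;
                in practice a trained reward model).
    transform : a meaning-preserving rewrite, e.g. casing or punctuation changes.
    """
    before = reward_fn(chosen) > reward_fn(rejected)
    after = reward_fn(transform(chosen)) > reward_fn(transform(rejected))
    return before != after

# Toy stand-ins: a length-based "reward" and an uppercasing transform.
toy_reward = len
to_upper = str.upper
print(prediction_flips(toy_reward, "a good answer", "bad", to_upper))  # False: len is case-invariant
```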
@ReeceShuttle
Reece Shuttleworth
2 months
🧵 LoRA vs full fine-tuning: same performance ≠ same solution. Our NeurIPS ‘25 paper 🎉 shows that LoRA and full fine-tuning, even when equally well fit, learn structurally different solutions, and that LoRA forgets less and can be made even better (less forgetting) by a simple
18
244
2K
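(Background for the comparison above: LoRA constrains the fine-tuning update to a low-rank product, which is why its solutions can differ structurally from full fine-tuning even at equal fit. A minimal sketch of the layer; the rank and scaling hyperparameters here are illustrative.)

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA-style layer: frozen base weight plus a low-rank update.

    Output is x @ (W + (alpha/r) * B @ A)^T; only A and B are trained,
    so the learned update is confined to a rank-r subspace.
    """
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)               # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: starts at base model
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(64, 64)
print(layer(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```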
@gabe_grand
Gabe Grand
2 months
Do AI agents ask good questions? We built “Collaborative Battleship” to find out—and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost. Paper, code & demos: https://t.co/lV76HRKR3d Here's what we learned about building rational information-seeking
4
34
169
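(The standard mechanism behind Bayesian question-asking of this kind is choosing the question with the highest expected information gain over a hypothesis space. A toy sketch under that framing; this is the textbook quantity, not necessarily the paper's exact objective.)

```python
import math
from collections import defaultdict

def expected_info_gain(question, hypotheses, answer_fn):
    """Expected reduction in entropy from asking `question`.

    hypotheses : dict mapping hypothesis -> prior probability.
    answer_fn  : answer_fn(hypothesis, question) -> deterministic answer.
    """
    def entropy(dist):
        return -sum(p * math.log2(p) for p in dist.values() if p > 0)

    prior_h = entropy(hypotheses)
    by_answer = defaultdict(dict)
    for h, p in hypotheses.items():
        by_answer[answer_fn(h, question)][h] = p
    # Average posterior entropy, weighted by the probability of each answer.
    post_h = 0.0
    for group in by_answer.values():
        p_answer = sum(group.values())
        post = {h: p / p_answer for h, p in group.items()}
        post_h += p_answer * entropy(post)
    return prior_h - post_h

# Toy example: uniform prior over 4 ship locations; ask "is it in row A?"
hyps = {loc: 0.25 for loc in ["A1", "A2", "B1", "B2"]}
print(expected_info_gain("row A?", hyps, lambda h, q: h[0] == "A"))  # 1.0 bit
```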
@alanamarzoev
Alana Renda (Marzoev)
2 months
🚨 New paper up on how LLMs reason under uncertainty! 🎲 Many real world uses of LLMs are characterized by the unknown—not only are the models prompted with partial information, but often even humans don't know the "right answer" to the questions asked. Yet most LLM evals focus
6
23
129
@a1zhang
Alex L Zhang
2 months
Lots of folks have been asking for a gist or simple notebook to try out RLMs. While we work on some more exciting experiments, here's a self-contained, minimal version I quickly put together for people to build on top of. Happy hacking :) https://t.co/Lfj97OEvYX
github.com: Super basic implementation (gist-like) of RLMs with REPL environments (alexzhang13/rlm)
5
54
434
@a1zhang
Alex L Zhang
2 months
What if scaling the context windows of frontier LLMs is much easier than it sounds? We’re excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length,
126
357
3K
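(A crude sketch of the recursive idea: if the prompt exceeds a context budget, split it, solve the pieces recursively, and combine the partial answers. The actual RLM work interacts with prompts through a REPL environment (see the gist linked above), so treat this halving strategy and the `llm_call` interface as assumptions for illustration.)

```python
def recursive_lm(prompt: str, llm_call, max_chars: int = 4000) -> str:
    """Sketch of a recursive inference loop over an oversized prompt.

    llm_call(text) -> str is a placeholder for a real LM API. If the prompt
    fits the budget, answer directly; otherwise split it, solve each half
    recursively, and ask the LM to combine the partial answers.
    """
    if len(prompt) <= max_chars:
        return llm_call(prompt)
    mid = len(prompt) // 2
    left = recursive_lm(prompt[:mid], llm_call, max_chars)
    right = recursive_lm(prompt[mid:], llm_call, max_chars)
    return llm_call(f"Combine these partial answers:\n1) {left}\n2) {right}")

# Toy LM that just reports input length, to show the control flow.
echo = lambda text: f"<answered {len(text)} chars>"
print(recursive_lm("x" * 10_000, echo))
```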
@nlp_mit
MIT NLP
2 months
new paper! how can we get LLMs to model incorrect student thinking?
@alexisjross
Alexis Ross
2 months
Can LLMs reason like a student? 👩🏻‍🎓📚✏️ For educational tools like AI tutors, modeling how students make mistakes is crucial. But current LLMs are much worse at simulating student errors ❌ than performing correct ✅ reasoning. We try to fix that with our method MISTAKE 🤭👇
0
0
13
@anku__rani
Anku
2 months
🗞️ Dialogues with AI Reduce Beliefs in Misinformation but Build No Lasting Discernment Skills ➡️ While interactions with AI have been shown to durably reduce people’s beliefs in false information, it is unclear whether these interactions also teach people the skills to discern
0
4
14
@nlp_mit
MIT NLP
2 months
catch MIT NLP at @COLM_conf day 1! morning:
@gabe_grand is presenting “Self Steering Language Models”
@ben_lipkin is presenting “Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling”
@KaivuHariharan is presenting “Breakpoint:
0
3
13
@gabe_grand
Gabe Grand
2 months
Good morning @COLM_conf! Excited to present our poster on Self-Steering LMs (#50, 11AM-1PM). If you’re thinking about codegen, probabilistic inference, or parallel scaling, stop by for a chat!
0
7
46
@ishapuri101
Isha Puri @NeurIPS
2 months
flying to 🇨🇦 this week for #COLM2025! catch us on friday to hear our talk about RLCR at the SCALR@COLM workshop. reach out to chat about test time compute, rl for interaction, and anything else!
@ishapuri101
Isha Puri @NeurIPS
4 months
It seems GPT‑OSS is very prone to hallucinations … check out our RLCR paper to see how we trained reasoning models to know what they don't know. Website 🌐 and code 💻 out today! https://t.co/YqLu92enIy 🚀
1
3
33
@MehulDamani2
Mehul Damani
2 months
I will be giving a talk on RLCR at the SCALR@COLM workshop on Friday! Come learn how LLMs can be trained to reason about their own uncertainty. Always happy to chat about RL and related ideas (DMs open)!
@MehulDamani2
Mehul Damani
5 months
🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty --
0
5
39
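(The announcement doesn't spell out the reward, so as a hedged guess at what a calibration-aware reward can look like: task correctness combined with a Brier-style penalty on the model's stated confidence, so the optimum is to be right and well-calibrated. This may not be RLCR's exact formulation.)

```python
def calibrated_reward(correct: bool, confidence: float) -> float:
    """Hedged sketch of a calibration-aware RL reward (assumed form, not
    necessarily RLCR's exact reward): correctness minus a Brier penalty
    on the model's verbalized confidence in [0, 1]."""
    c = float(correct)
    return c - (confidence - c) ** 2

# Overconfident wrong answers are punished harder than hedged wrong answers.
print(calibrated_reward(False, 0.95))  # -0.9025
print(calibrated_reward(False, 0.30))  # -0.09
print(calibrated_reward(True, 0.90))   #  0.99
```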
@nlp_mit
MIT NLP
2 months
Exciting new work by @alexisjross @megha_byte on AI + education for code!
@megha_byte
Megha Srivastava
2 months
New preprint on AI + Education! 🍎 “Modeling Student Learning with 3.8M Program Traces” 💻 When students code, their edits tell a story about their reasoning process: exploring, debugging, and tinkering 🧠 What can LMs learn from training on student edit sequences? 📚
0
0
3