
Jessy Lin
@realJessyLin
2K Followers · 700 Following · 36 Media · 300 Statuses
PhD @Berkeley_AI, visiting researcher @AIatMeta. Interactive language agents 🤖 💬
Joined March 2013
I’ll be at #ICLR2025 this week! ✈️ A couple of things I’m excited about lately: 1) Real-time multimodal models: how do we post-train assistants for real-time (and real-world) tasks beyond the chat box? 2) Continual learning and memory: to have models / agents that learn from….
RT @geoffreylitt: # Enough AI copilots! We need AI HUDs. IMO, one of the best critiques of modern AI design comes from a 1992 talk by the r….
RT @ilyasut: The Bitter lesson does not say to not bother with methods research. It says to not bother with methods that are handcrafted d….
RT @nlpxuhui: 💯 Can't wait for the second blog! This could be an important step towards making AI agents more "human-centered". We want….
underrated idea to learn passively about people from everyday computer use - I think the natural extension is learning from *trajectories* of how people prefer to do things, which is hard to get from prompting / static user data otherwise.
What if LLMs could learn your habits and preferences well enough (across any context!) to anticipate your needs? In a new paper, we present the General User Model (GUM): a model of you built from just your everyday computer use. 🧵
Super interesting - imo sparse updates will be an important ingredient for continually learning agents, and it seems this is already a surprising / unintentional side effect of RL.
🚨 Paper Alert: “RL Finetunes Small Subnetworks in Large Language Models”. From DeepSeek V3 Base to DeepSeek R1 Zero, a whopping 86% of parameters were NOT updated during RL training 😮😮 And this isn’t a one-off. The pattern holds across RL algorithms and models. 🧵 A Deep Dive
RT @jyangballin: 40% with just 1 try per task: SWE-agent-LM-32B is the new #1 open source model on SWE-bench Verified. We built it by synt….
RT @SGRodriques: Today, we are launching the first publicly available AI Scientist, via the FutureHouse Platform. Our AI Scientist agents….
RT @hlntnr: New on Rising Tide, I break down 2 factors that will play a huge role in how much AI progress we see over the next couple years….
RT @cassidy_laidlaw: We built an AI assistant that plays Minecraft with you. Start building a house—it figures out what you’re doing and ju….
RT @sanidhya903: 1/ LLM agents can code—but can they ask clarifying questions? 🤖💬 Tired of coding agents wasting time and API credits, only….
RT @boazbaraktcs: Fascinating interviews. I'm not sure humans will ever be "out of the loop" in math. Even if humans have no advantages in….
RT @sea_snell: Can we predict emergent capabilities in GPT-N+1🌌 using only GPT-N model checkpoints, which have random performance on the ta….
+1 to the key idea here - it's def important to iterate on algorithms with clean benchmarks like math+code with known reward functions, but almost every task we care about in the real world has a fuzzy / human-defined reward function. I'm interested to see how we'll end up applying….
i wrote a new essay called "The Problem with Reasoners", where i discuss why i doubt o1-like models will scale beyond narrow domains like math and coding (link below)
With agents, AI search, and chat becoming some of the main ways people interact with the web, I wrote a post about how human interfaces and agent-computer interfaces might co-evolve:
jessylin.com