Yu Su @ysu_nlp X Profile

Yu Su

@ysu_nlp

Followers

12K

Following

4K

Media

137

Statuses

2K

something new | prof. @osunlp | sloan fellow | intelligence and agents | author of Mind2Web, SeeAct, MMMU, HippoRAG, BioCLIP, UGround.

https://t.co/rHNZAqg9s5

Columbus, OH

Joined March 2013

Yu Su

@ysu_nlp

2 months

Computer Use: Modern Moravec's Paradox A new blog post arguing why computer-use agents may be the biggest opportunity and challenge for AGI. https://t.co/6fZfTdx710 Table of Contents > Moravec’s Paradox > Moravec's Paradox in 2025 > Computer use may be the biggest opportunity

9

65

207

Yu Su

@ysu_nlp

2 days

I don't necessarily agree with everything Microsoft is doing in the AI space, but man, Satya is such a fantastic CEO: sharp, grounded, long-termist, and still super easy to be around. I should have held my MSFT for longer.

Dwarkesh Patel

@dwarkesh_sp

2 days

.@satyanadella gave me and @dylan522p an exclusive tour of Fairwater 2, the most powerful AI datacenter in the world. We then chatted through Satya's vision for Microsoft in a world with AGI. 0:00:00 - Fairwater 2 0:04:15 - Business models for AGI 0:13:42 - Copilot 0:20:56 -

1

2

17

Ming Jin (hiring Fall'26 PhDs)

@MingJin80233626

2 days

Excited for this week's AI Agent Frontier Seminar! We're thrilled to host Dr. Huan Sun (@hhsun1 ) from @OhioState. She'll discuss a critical topic: "Advancing the Capability and Safety of Computer-Use Agents Together." 🗓️ Friday, Nov 14 ⏰ 9 AM PT / 12 PM ET All are welcome!

0

2

13

Yu Su

@ysu_nlp

3 days

In my ICLR SAC batch (225 papers), only 18 papers (8%) have an avg initial rating >= 6 (borderline accept). Maybe my batch is not the most representative, or maybe it's reviewer fatigue from the exploding # of submissions. Silver lining is, don't give up on rebuttal.

11

12

238

Huan Sun

@hhsun1

4 days

🚀 Worried about faculty openings? Ohio State @OhioState is to hire 100 new faculty with AI expertise over the next five years! 🤖🎓 The new hires will join one of three AI Faculty Cohorts: 🧠 Foundational AI — Elevating the theoretical, mathematical, and algorithmic

8

42

211

Yu Su

@ysu_nlp

4 days

Super fun to serve on LEAP's academic advisory board and observe the forecasting. If the development of AI can live up to the median forecasts, it will already be a big deal.

Forecasting Research Institute

@Research_FRI

5 days

Today, we are launching the most rigorous ongoing source of expert forecasts on the future of AI: the Longitudinal Expert AI Panel (LEAP). We’ve assembled a panel of 339 top experts across computer science, AI industry, economics, and AI policy. Roughly every month—for the next

0

1

15

Percy Liang

@percyliang

17 days

⛵Marin 32B Base (mantis) is done training! It is the best open-source base model (beating OLMo 2 32B Base) and it’s even close to the best comparably-sized open-weight base models, Gemma 3 27B PT and Qwen 2.5 32B Base. Ranking across 19 benchmarks:

19

83

558

Yueqi Song @ EMNLP2025

@yueqi_song

17 days

We just built and released the largest dataset for supervised fine-tuning of agentic LMs, 1.27M trajectories (~36B tokens)! Up until now, large-scale SFT for agents has been rare - not for lack of data, but because of fragmentation across heterogeneous formats, tools, and interfaces.

arxiv.org

Public research results on large-scale supervised finetuning of AI agents remain relatively rare, since the collection of agent training data presents unique challenges. In this work, we argue...

27

173

1K

Hanane Nour Moussa

@HananeNMoussa

18 days

📢 As AI becomes increasingly explored for research idea generation, how can we rigorously evaluate the ideas it generates before committing time and resources to them? We introduce ScholarEval, a literature grounded framework for research idea evaluation across disciplines 👇!

4

42

141

Yu Su

@ysu_nlp

20 days

The GOAT

Diyi Yang

@Diyi_Yang

20 days

Stanford NLP 25th Anniversary🤩🤩🤩

2

4

117

Yu Su

@ysu_nlp

23 days

This is a huge mistake and loss for Meta

Yuandong Tian

@tydsh

23 days

Several of my team members + myself are impacted by this layoff today. Welcome to connect :)

8

4

288

Yu Su

@ysu_nlp

24 days

Genuine question: how is @OpenAI Atlas different from @perplexity_ai Comet? At first glance the set of AI features looks quite similar

1

0

10

Sayash Kapoor

@sayashk

25 days

I am on the faculty job market this year! I am seeking tenure-track faculty positions to drive my research agenda on rigorous AI evaluation for science and policy. I am applying broadly across disciplines, and would be grateful to hear of relevant positions. Materials: 🧵

10

71

420

Yu Su

@ysu_nlp

27 days

I've given 30+ talks on agents in the past 2 years, and I always end my talk with this slide. We are just at the dawn of a long journey on agents. General agents need the same broad set of cognitive competencies as humans and more. It doesn't necessarily have to be constructed

Yu Su

@ysu_nlp

28 days

> @karpathy is right on the limitations of current agents > 2025 is year 0 of the decade of agents: agents started to bring significant marginal value in narrow domains (eg, coding, customer service), but still far from general human-level competency > agents need a wholesale

6

13

131

Yu Su

@ysu_nlp

28 days

R.I.P.

Tsinghua University

@Tsinghua_Uni

28 days

Prof. Chen Ning Yang, a world-renowned physicist, Nobel Laureate in Physics, Academician of the Chinese Academy of Sciences, Professor at Tsinghua University, and Honorary Director of the Institute for Advanced Study at Tsinghua University, passed away in Beijing due to illness

0

14

Yu Su

@ysu_nlp

28 days

> @karpathy is right on the limitations of current agents > 2025 is year 0 of the decade of agents: agents started to bring significant marginal value in narrow domains (eg, coding, customer service), but still far from general human-level competency > agents need a wholesale

NIK

@ns123abc

28 days

BREAKING: Andrej Karpathy calls out Sam Altman Altman: >"We are now confident we know how to build AGI" >"2025 is the year of AI agents" Karpathy: >"I was triggered by that over-prediction" >"More accurately, it's the decade of agents" >"There's SO much work to be done" Hmm

3

5

69

Yu Su

@ysu_nlp

28 days

Excited to introduce a new agent learning paradigm called Early Experience, as a reward-free mid-training stage for large-scale agent training. A fantastic collaboration between Meta Superintelligence Lab and @osunlp led by the amazing @KaiZhang_CS. Built on insights from our

arxiv.org

Language agents based on large language models (LLMs) have demonstrated great promise in automating web-based tasks. Recent work has shown that incorporating advanced planning algorithms, e.g.,...

Jason Weston

@jaseweston

29 days

🌀Agent Learning via Early Experience🌀 📝: https://t.co/VsqQHTTrBN - SFT for agents is sparse; RL on long-horizons is hard We provide new mid-training signals that work: 1) Implicit next state world modeling task 2) Self-reflection on alternate states - Strong improvements over

5

29

175

Kai Zhang

@KaiZhang_CS

28 days

Introducing early experience: using future states resulting from agent’s own action as scalable supervision to train itself - without reward🧠! 1️⃣Reward-free: can train directly in real-world environments. 2️⃣Better RL warm-start: when continued with RL, leads to higher final

Jason Weston

@jaseweston

29 days

🌀Agent Learning via Early Experience🌀 📝: https://t.co/VsqQHTTrBN - SFT for agents is sparse; RL on long-horizons is hard We provide new mid-training signals that work: 1) Implicit next state world modeling task 2) Self-reflection on alternate states - Strong improvements over

2

28

109

Dawn Song

@dawnsongtweets

29 days

Really excited to announce AgentX–AgentBeats Competition 🚀 💰 $1 Million+ in prizes, cloud credits, and API resources, a global challenge hosted by @BerkeleyRDI , building on the Agentic AI MOOC community of 32K+ learners, bringing together builders, researchers, engineers, and

11

28

84

Yu Su

@ysu_nlp

1 month

The most comprehensive agent evaluation to date

Sayash Kapoor

@sayashk

1 month

📣New paper: Rigorous AI agent evaluation is much harder than it seems. For the last year, we have been working on infrastructure for fair agent evaluations on challenging benchmarks. Today, we release a paper that condenses our insights from 20,000+ agent rollouts on 9

1

13

131

Rohan Paul

@rohanpaul_ai

1 month

New @GoogleResearch paper shows agents learn software skills by watching tutorials, converting them into action steps, and boosting task performance. So it converts free videos into reliable supervision at scale. A vision model, inverse dynamics, predicts the action between 2

14

82

441