Zora Wang
@ZhiruoW
Followers: 2K · Following: 1K · Media: 41 · Statuses: 323
PhD student @LTIatCMU | fun 👩🏻‍💻 🐈 💃 🪴 🎶
Pittsburgh, PA
Joined August 2021
Agents are joining us at work -- coding, writing, design. But how do they actually work, especially compared to humans? Their workflows tell a different story: They code everything, slow down human flows, and deliver low-quality work fast. Yet when teamed with humans, they shine
We use LLMs for everyday tasks—research, writing, coding, decision-making. They remember our conversations, adapt to our needs and preferences. Naturally, we trust them more with repeated use. But this growing trust might be masking a hidden risk: what if their beliefs are
New eval! Code duels for LMs ⚔️ Current evals test LMs on *tasks*: "fix this bug," "write a test" But we code to achieve *goals*: maximize revenue, cut costs, win users Meet CodeClash: LMs compete via their codebases across multi-round tournaments to achieve high-level goals
Ever used a top-scoring coding agent that still couldn't help with your actual task? How can we transform agents from "interns who never learn" to "workers who improve over time"? Read about our new approach: ToM-SWE 🪄
Hoping your coding agents could understand you and adapt to your preferences? Meet TOM-SWE, our new framework for coding agents that don’t just write code, but model the user's mind persistently (ranging from general preferences to small details) arxiv: https://t.co/uznLAjgWKr
This paper makes an important point. AI agents don’t naturally follow human workflows — they often default to coding everything, even when humans would reason, sketch, or explore first. This can make them fast, but also awkward in tasks that rely on interpretation or judgment.
Agent trajectories are often left underexplored even though they showcase rich behaviors. Really appreciate this work offering a thoughtful snapshot of human–agent comparisons; it reads like a compact and detailed GDPEval.
Analyzing agent trajectories has become a new canonical way for evaluation. Yet most trajectories operate at the action level - dense but semantically shallow. We define 𝙬𝙤𝙧𝙠𝙛𝙡𝙤𝙬 as a higher-level abstraction and release a toolkit to induce it from raw actions for both
We recorded a bunch of people actually working on their computers (!) and then compared agent performance to actual human workflows. Awesome paper led by @ZhiruoW :)
Check out our new work on examining how human and AI workers perform tasks! We get human and AI workers to perform the same tasks, extract workflows, and get insights about where agents do well and where they still have a ways to go.
@ZhiruoW's research compares AI agents vs humans across real work tasks (data analysis, engineering, design, writing). Key findings: 👉 Agents are 88% faster & 90-96% cheaper 👉 BUT produce lower quality work, often fabricate data to mask limitations 👉 Agents code everything,
Check out our: 📄 Paper: https://t.co/R0v58gxwq3 🧩 Workflow induction tool: https://t.co/CjwTb3sZhg With the fantastic team @EchoShao8899 @oshaikh13 @dan_fried @gneubig @Diyi_Yang at @LTIatCMU & @stanfordnlp
github.com
A toolkit to induce interpretable workflows from raw computer-use activities. - zorazrw/workflow-induction-toolkit
As agents keep climbing up that "autonomy slider", we'll need: - stronger visual understanding - better action calibration, and - workflow-inspired agent designs
Quality? Still shaky. Agents fabricate data, misuse tools -- but finish tasks 88% faster and at a fraction of the cost. Maybe the question isn’t agents vs. humans, but: how do we team humans & agents?
AI doesn’t always speed us up. 🧠 Augmentation → 24.3% faster, minimal disruption. ⚙️ Automation → 17.7% slower, workflows reshaped by verification & debugging.
Our findings reveal a striking divide: Agents approach all work programmatically -- even open-ended, visual tasks like design. Humans rely on UI-centric, perceptual interaction. Their workflows may look similar at a high level, but low-level processes are worlds apart.
We induce workflows to: - uniformly represent diverse work activities - enable systematic comparison across workers
We directly compared humans and AI agents. Across computer-use jobs. Via 5 essential work skills: data analysis, engineering, computation, writing, and design.
Very excited to see Zora and her work recognized! Check out her research on skill learning for LLM agents, memory, and programmatic approaches to real-world tasks (even beyond code).
🎉Thrilled to be named a Google PhD Fellow! Feeling incredibly grateful for my advisors, collaborators, and friends who’ve supported and inspired me along the way 🫶 Looking forward to the journey ahead!
Congrats to @ZhiruoW on getting chosen as a Google PhD fellow! If you haven't checked out Zora's work on agents that learn from experience this is a good excuse to do so 😄
🎉 We're excited to announce the 2025 Google PhD Fellows! @GoogleOrg is providing over $10 million to support 255 PhD students across 35 countries, fostering the next generation of research talent to strengthen the global scientific landscape. Read more: https://t.co/0Pvuv6hsgP