Kawin Ethayarajh

@ethayarajh

Followers
4K
Following
2K
Media
76
Statuses
1K

Assistant Professor @UChicago @ChicagoBooth. PhD @StanfordAILab @stanfordnlp.

Palo Alto, California
Joined March 2019
@ethayarajh
Kawin Ethayarajh
7 days
📢 Belated update, but I'm thrilled to share that I've joined @UChicago @ChicagoBooth as an Assistant Professor in the newly created Applied AI group! I'll continue to work on behavior-bound machine learning: understanding how AI shapes, is shaped, and should be shaped by the
56
27
610
@ethayarajh
Kawin Ethayarajh
4 days
RT @YejinChoinka: Honored to be back on TIME100 AI for 2025 — alongside my longtime heroes @drfeifei and @BarzilayRegina! The recognitio…
0
35
0
@ethayarajh
Kawin Ethayarajh
4 days
RT @PeterHndrsn: I'm starting to get emails about PhDs for next year. I'm always looking for great people to join! For next year, I'm look…
0
28
0
@ethayarajh
Kawin Ethayarajh
5 days
RT @Diyi_Yang: Introducing ✨Generative Interfaces where LLMs respond to users by proactively generating UIs that enable adaptive interactio…
0
39
0
@ethayarajh
Kawin Ethayarajh
6 days
RT @Muennighoff: Can AI solve open problems in math, physics, coding, medical sciences & beyond? We collected unsolved questions (UQ) & te…
0
244
0
@ethayarajh
Kawin Ethayarajh
7 days
2019 - 2024 was such a bizarre time to be doing a PhD in an NLP group, having first entered the field when word embeddings were just becoming a thing; it felt like I aged 20 years in 5. I’m immensely grateful for having had the support of my advisor @jurafsky,
1
0
14
@ethayarajh
Kawin Ethayarajh
7 days
Some questions I'll be working on:
- When we think about personalizing LLMs, we only think about changing the data, even though the objective itself encodes a utility function and different people have different utility functions (. What does
arxiv.org
Kahneman & Tversky's $\textit{prospect theory}$ tells us that humans perceive random variables in a biased but well-defined manner (1992); for example, humans are famously loss-averse. We show...
1
0
15
@ethayarajh
Kawin Ethayarajh
11 days
RT @JeffDean: AI efficiency is important. Today, Google is sharing a technical paper detailing our comprehensive methodology for measuring…
0
840
0
@ethayarajh
Kawin Ethayarajh
15 days
h/t @m_sendhil and @tedsumers for the recommendation!
0
0
0
@ethayarajh
Kawin Ethayarajh
15 days
4. Rats systematically test out hypotheses one by one to find one that works; LLMs use reasoning traces to explore possible solutions, retreating and branching off as needed. "When this was done, Krech found that the individual rat went through a succession of systematic
1
0
2
@ethayarajh
Kawin Ethayarajh
15 days
3. Shocks need to be verifiable for rats to best learn; RL for LLMs works best when the reward is verifiable. "But the particular finding which I am interested in now appeared as a result of a modification of this standard procedure. Hudson noticed that the animals,
1
0
1
@ethayarajh
Kawin Ethayarajh
15 days
2. Rats need to discover the instruction for reinforcement to work; LLMs need instruction-tuning before RL. "To sum up, in *visual discrimination* experiments the better the learning, the more the VTE's. But this seems contrary to what we would perhaps have expected. We
1
0
1
@ethayarajh
Kawin Ethayarajh
15 days
Reading Tolman's "Cognitive Maps in Rats and Men" (1948) and it's remarkable how well findings in rats explain the contemporary pre-training post-training split in LLMs. 1. Rats do latent learning even when there is no reward, and this primes them to learn rapidly when rewards
1
0
12
@ethayarajh
Kawin Ethayarajh
17 days
> Right now, there is no established RL algorithm to give a model verbal feedback and have it update its weights. There is also no established algorithm for a model to reflect on a previous failed execution and update its own weights. I think there are? This doesn't seem like
@AlexGDimakis
Alex Dimakis
18 days
Imagine you're trying to teach a human how to do a task, say install Windows XP in a virtual machine. The human walks into a room and sees a document (prompt) that you have written, that describes exactly what they are supposed to do. There is also a computer ready for their
2
1
18
@ethayarajh
Kawin Ethayarajh
17 days
RT @oshaikh13: If you thought referencing past chats was cool, we built an MCP that lets Claude use *anything you see or do on your compute….
0
32
0
@ethayarajh
Kawin Ethayarajh
23 days
The problem is that the token economics aren’t sustainable: 1. If you have too many models (e.g., 4o, o3, _and_ all the GPT-5 variants), then the cost of inference explodes: amortizing over very large batches is necessary to make any one model worth hosting. But a large minority
@Yuchenj_UW
Yuchen Jin
23 days
ChatGPT users are canceling subscriptions. GPT-5 as the main source of intelligence does not work. Intelligence isn’t one-size-fits-all; user needs and preferences vary wildly. We want more models. We want more control.
2
2
26
@ethayarajh
Kawin Ethayarajh
24 days
RT @sayashk: How does GPT-5 compare against Claude Opus 4.1 on agentic tasks? Since their release, we have been evaluating these models o…
0
70
0
@ethayarajh
Kawin Ethayarajh
25 days
RT @nabeelqu: The 'vibe shift' on here is everyone realizing they will still have jobs in 2030.
0
439
0
@ethayarajh
Kawin Ethayarajh
1 month
RT @CAAI_Booth: Joining @ChicagoBooth's efforts to invest in #AI as it shapes business, markets, and institutions are three new Applied AI…
chicagobooth.edu
Booth welcomes three new professors working at the intersection of AI, technology, business, and society.
0
3
0
@ethayarajh
Kawin Ethayarajh
2 months
RT @charles_irl: In a new blog post for @modal_labs, I argue against the prevailing denomination of LLM services in terms of dollars per to…
0
4
0