Zizhao Chen
@ch272h
Followers: 90 · Following: 2K · Media: 8 · Statuses: 35
陈梓昭 (Zizhao Chen) · undergrad @uoftengineering, PhD-ing @cornell_cs, and is actually elsewhere
Joined July 2014
Pushed a big update to LM-class (v2025.2) -- this second version makes for a much more mature resource. Many refinements of the lecture slides + significant improvements to the assignments. Many thanks to @ch272h, @HuaYilun, and @shankarpad8 for their work on the assignments.
🚨Modeling Abstention via Selective Help-seeking. LLMs learn to use search tools to answer questions they would otherwise hallucinate on. But can this also teach them what they know vs. what they don't? @momergul_ introduces MASH, which trains LLMs for search and gets abstention for free!
ToddlerBot 2.0 is released🥳! Now Toddy can also do cartwheels🤸! We have added so many features since our first release in February; see https://t.co/t9U7qysMe9 for more details. Thread🧵(1/n)
Why scale data annotators at high cost when you can scale users for free?
The talk for our work on Retrospective Learning from Interactions, which will be at ACL (once I figure out how to squeeze it shorter). Gist: autonomous post-training from conversational signals for LLM bootstrapping ... look ma, no annotations! 🙌📈🚀 https://t.co/lYkJaukxUt
Time to democratize humanoid robots! Introducing ToddlerBot, a low-cost ($6K), open-source humanoid for robotics and AI research. Watch two ToddlerBots seamlessly chain their loco-manipulation skills to collaborate in tidying up after a toy session. https://t.co/tIrAUCbzNz
NeurIPS acknowledges that the cultural generalization made by the keynote speaker today reinforces implicit biases by making generalizations about Chinese scholars. This is not what NeurIPS stands for. NeurIPS is dedicated to being a safe space for all of us. We want to address
Mitigating racial bias in LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference, @NeurIPSConf. We have ethics reviews for authors, but missed them for invited speakers? 😡
And a disclaimer - this is absolutely not affiliated with NeurIPS.
I’m not sure what conclusion I can draw from this poll. Credit goes to everyone who participated in this mini poll. Thank you - you made my day!
The most common follow-up was: it depends on your definition of intelligence, to which I replied “by your definition of intelligence.”
Extra comments: “Very stupid” “Language models? Definitely!” “It’s not a yes/no question” “Yes… if they saw that in training data” “Not true intelligence” “AIs have no heart” “Some are intelligent and some aren’t. Just like humans” “I don’t have money to test it out”
So I was volunteering today. I randomly prompted folks with this question when they collected their NeurIPS thermos: Do you think AIs today are intelligent? Answer with yes or no. Here is the breakdown: Yes: 57, No: 62, Total: 119. Pretty close!
Title: Retrospective Learning from Interactions Website: https://t.co/VuwXVEC6SI Paper: https://t.co/qCiX3pthvx Demo: https://t.co/IIril7neTP With @momergul_, Vivian Chen, Gloria Geng, @anne_youw and @yoavartzi 6/7
Learning from human-AI deployment interactions - the sky’s the limit! Initially, MTurk workers said: “Painful” “This one was heading for total disaster” By the end: “Almost perfect” “Excellent bot that understood every description, even tricky ones, on the first attempt” 5/7
We experiment with an abstract multi-turn generalization of reference games. After 6 rounds of grounded continual learning, the human-bot game success rate improves from 31% to 82%📈 - an absolute improvement of 51 points, all without any external human annotations! 🚀 4/7
How do we decode the reward? Implicit feedback occupies a general and easy-to-reason-about subspace of language → We prompt the same LLM that does the task (really bad at it early on) with a task-independent prompt → the LLM bootstraps itself 3/7
Our recipe for learning requires no annotation and no interaction overhead: 🎮 Interact: deploy the LLM to interact with humans 💭 Retrospect: the LLM asks itself “Was my response good given what came after in the interaction?” to decode rewards 🤑 Learn and repeat 2/7
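The interact → retrospect → learn recipe above can be sketched as a toy loop. Everything here (the `query_llm` stub, the retrospection prompt wording, and the keyword-based verdict) is a hypothetical stand-in for illustration, not the paper's actual prompts or training code.

```python
# Toy sketch of retrospective learning: decode a reward for each of the
# model's responses from the user's *next* turn, then keep the positively
# rewarded turns as self-labeled training data. All names are hypothetical.

RETROSPECT_PROMPT = (
    "Here is one of your earlier responses and what the user said next.\n"
    "Response: {response}\nUser follow-up: {follow_up}\n"
    "Was the response good given what came after? Answer GOOD or BAD."
)

def query_llm(prompt: str) -> str:
    """Stub LLM: placeholder logic that treats a thankful follow-up as
    positive. A real system would call the same model that did the task."""
    return "GOOD" if "thanks" in prompt.lower() else "BAD"

def retrospect(turns):
    """Decode a binary reward for each (response, follow_up) pair."""
    rewards = []
    for response, follow_up in turns:
        verdict = query_llm(
            RETROSPECT_PROMPT.format(response=response, follow_up=follow_up)
        )
        rewards.append(1 if verdict.strip().upper().startswith("GOOD") else 0)
    return rewards

# One round: deploy, retrospect, then keep positively rewarded turns
# as training data (the "learn and repeat" step).
interaction = [
    ("Here is the red square.", "thanks, that's it!"),
    ("Done.", "no, I meant the other one"),
]
rewards = retrospect(interaction)
train_set = [turn for turn, r in zip(interaction, rewards) if r == 1]
```

The key design point the tweet describes is that the reward decoder is the same model being trained, queried with a task-independent prompt, so no external annotation enters the loop.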
me: let’s start with a meme @yoavartzi: how about the paper’s fig1? 🙅 me: lesson learned. no memes 😭 A paper 🧵 on continually learning from naturally occurring interaction signals, such as in the hypothetical conversation above https://t.co/qCiX3pthvx 1/7