
Taylor W. Killian
@tw_killian
Followers
2K
Following
79K
Media
680
Statuses
6K
Senior Research Scientist @MBZUAI, interested in Decision Making & Generalization // @BYU '13; @Harvard '17; @UofT '24
Bay Area, CA
Joined September 2015
Day 1 with @mbzuai as we start building out a new non-profit, open research lab in the Bay Area. Lots to learn and my head is spinning but I’m excited to get started pushing the boundaries of what we know about reasoning and decision making under uncertainty.
15
8
214
RT @ChengZhoujun: A thoughtful work on the effects of SFT and RL from a transferability perspective. It's also encouraging to see more work…
0
7
0
RT @ccui9: Research in RL for LLMs is the true key to work-life balance. You just go touch grass while waiting for training to finish. Af…
0
1
0
RT @rohanpaul_ai: LLM reasoning with reinforcement learning focuses on limited domains, hindering general applicability. This paper develo…
0
10
0
RT @alexgshaw: Excited to team up with @andykonwinski on Laude Institute, his next endeavor to normalize bringing research breakthroughs in…
0
1
0
RT @ChengZhoujun: Thanks for the suggestion! We actually tried Llama-3.1-8B, but found high "instruction-following costs" - the base model…
0
5
0
This is a fascinating opportunity to work at the forefront of the emergence of autonomous behaviors with one of the best in the field (both technically and as a person). If you're motivated to get RL into the real world, this is a great chance to do so!
We now know RL agents can zero-shot crush driving benchmarks. Can we put them on a car and replace the planning stack? We're hiring a postdoc at NYU to find out! Email me if interested and please help us get the word out.
1
0
3
RT @ZhitingHu: 🔥Reinforcement learning for LLM reasoning is emerging—but many questions remain🧐🧐. ❓ Does RL teach new reasoning, or just el…
0
25
0
This was the first milestone of our very productive collaboration between @mbzuai @llm360 and the fantastic group of students led by @ChengZhoujun @Ber18791531 and @LtyLeoii22. We've got a lot of RL cooking in the background, more to share in the future!
🤯What we know about RL for reasoning might not hold outside math and code? We revisit established findings on RL for LLM reasoning on six domains (Math, Code, Science, Logic, Simulation, Tabular) and found that previous conclusions drawn on math and code are surprisingly
1
2
28
RT @ChengZhoujun: 🤯What we know about RL for reasoning might not hold outside math and code? We revisit established findings on RL for LLM…
0
55
0
RT @_akhaliq: Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
0
52
0
RT @iScienceLuvr: Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective. "We introduce GURU, a curated RL rea…
0
46
0
RT @Mike_A_Merrill: I love how counterintuitive rigorous empirical research can be. We found that the best models (R1) aren't necessarily…
0
3
0
RT @llm360: KV-caching is great, but will it work for Diffusion Language Models. @zhihanyang_ and team showed how to make it work with 65…
0
7
0
RT @zhihanyang_: 📢Thrilled to share our new paper: Esoteric Language Models (Eso-LMs). > 🔀Fuses autoregressive (AR) and masked diffusion (M…
0
62
0
RT @EugeneVinitsky: There are 6 billion people in the world. There are 300 million Americans. Statistically, more talent is outside the US…
0
2
0
RT @EugeneVinitsky: Attempting to destroy the entire scientific apparatus because you were mad about some university protests.
0
2
0
RT @rupspace: With the recent results on Qwen+RL, I hope more people take seriously what some of us have been harping on: - Open source >>>…
0
1
0