Ryan Teehan
@rteehas
Followers: 288 | Following: 5K | Media: 13 | Statuses: 241
PhD Student @nyuniversity | prev. @carperai, @stabilityai | prev. @uchicago @TTIC_Connect
Joined May 2022
We will be presenting our work on steering and scaling diffusion models at #ICML2025! Bonus:
- FK steering can beat grad guidance?! 🤯
- Boltz 🧬 adapted FK steering to BioML!!!
https://t.co/DhOqOfbi27
East Ex. Hall A-B #E-1308 (Tue 15 Jul, 11 a.m.–1:30 p.m. PDT)
Got a diffusion model? What if there were a way to:
- Get SOTA text-to-image prompt fidelity, with no extra training!
- Steer continuous and discrete (e.g. text) diffusions
- Beat larger models using less compute
- Outperform fine-tuning
- And keep your stats friends happy!?
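For a hedged sense of the kind of inference-time steering the tweet refers to, here is a generic particle-resampling (SMC-style) loop in Python, not the paper's FK-steering algorithm; denoise_step, reward, and the array inputs are assumed placeholders.

import numpy as np

def steer_diffusion(denoise_step, reward, x_init, n_steps, n_particles=8, temperature=1.0):
    """Generic inference-time steering of a diffusion sampler: run several
    particles in parallel, weight intermediate states by a reward, resample.
    `denoise_step(x, t)` and `reward(x)` are assumed user-supplied callables."""
    particles = [x_init.copy() for _ in range(n_particles)]
    for t in reversed(range(n_steps)):
        # Advance every particle one denoising step.
        particles = [denoise_step(x, t) for x in particles]
        # Turn rewards on the intermediate states into resampling weights.
        scores = np.array([reward(x) for x in particles], dtype=float)
        weights = np.exp((scores - scores.max()) / temperature)
        weights /= weights.sum()
        # Resample: high-reward particles get duplicated, low-reward ones dropped.
        idx = np.random.choice(n_particles, size=n_particles, p=weights)
        particles = [particles[i].copy() for i in idx]
    # Return the best particle under the reward.
    return max(particles, key=reward)

Because all the extra work happens at sampling time, this style of steering needs no gradient access to the reward and no retraining of the base model, which is what makes the "no extra training" framing possible.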
This is poetry, not prose, but Auden conveys more, and does so more beautifully, in these two lines than the model does in the entirety of that completion
You’d think paying by the token would reward word economy, and yet there are so many long phrases which say so little. It’s almost like literary sleight of hand. The phrases are constructed as if to trick you into thinking there’s more there than there really is
No offense, but this is not great lol. It feels like there’s a dearth of literary taste among AI researchers
we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right. PROMPT: Please write a metafictional literary short story
@lcastricato do you know if there’s been any serious study of this?
If you’re doing RL on reasoning chains, does the “style” of the reasoning matter? Do you learn a meaningfully different policy if you train on, say, chains in the style of Serre’s work vs. Bourbaki vs. Grothendieck (ignoring differences in subject area)?
It shocked me that, with the right scaling method, you can outperform finetuning or get SOTA with a 4x smaller model. Inference-time compute with LLMs is having a moment right now, and now diffusion model users can finally join in on the fun.
Got a diffusion model? What if there were a way to:
- Get SOTA text-to-image prompt fidelity, with no extra training!
- Steer continuous and discrete (e.g. text) diffusions
- Beat larger models using less compute
- Outperform fine-tuning
- And keep your stats friends happy!?
Come check out our poster this Saturday!
Excited to announce that our work on LLM Daily Oracle news eval will be presented at the NeurIPS 2024 Adaptive Foundation Model workshop on Saturday. @ameliadai_ @rteehas @agentic_ai_lab
Looking for a language model benchmark that will never be out of date? Curious how well language models can reason and generalize temporally? Check out the fantastic work done by our MS student @ameliadai_ (who will be applying to PhD programs this cycle, btw)
Will LLMs ever get outdated? Can LLMs predict the future? Today, we release Daily Oracle, a daily news QA benchmark testing LLMs' temporal generalization and forecasting capability. 🧵
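For a sense of what such a benchmark measures, here is a hypothetical evaluation loop over dated QA items that splits accuracy around an assumed knowledge cutoff; the item schema, model interface, and cutoff date are illustrative assumptions, not Daily Oracle's actual format.

from datetime import date

def evaluate_temporal_qa(model, qa_items, cutoff=date(2023, 12, 31)):
    """Score a model on dated QA items, split into questions from before and
    after an assumed knowledge cutoff, to surface temporal degradation.
    `model(question) -> answer string` and the item fields are assumptions."""
    buckets = {"pre_cutoff": [], "post_cutoff": []}
    for item in qa_items:  # e.g. {"date": date(2024, 6, 1), "question": "...", "answer": "..."}
        key = "pre_cutoff" if item["date"] <= cutoff else "post_cutoff"
        prediction = model(item["question"])
        buckets[key].append(prediction.strip().lower() == item["answer"].strip().lower())
    return {k: (sum(v) / len(v) if v else None) for k, v in buckets.items()}

Because the benchmark is built from each day's news, the post-cutoff bucket keeps growing, which is what keeps the evaluation from going stale.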
Squirrels 🔜
Can neuro-inspired ANN architectures be useful for motor control in quadruped robots? We translate neural circuits in the limbs and spinal cord of mammals into an ANN architecture controlling quadruped locomotion. w/ @venkyp2000, @LerrelPinto, @neurograce
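Purely as an illustration of the modular idea (per-limb controllers coupled through a shared "spinal" module), and not the paper's architecture, here is a toy PyTorch sketch with assumed dimensions.

import torch
import torch.nn as nn

class LimbModule(nn.Module):
    """Local controller for a single leg, a loose analogue of limb-level circuits."""
    def __init__(self, obs_dim, spinal_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + spinal_dim, 64), nn.Tanh(),
                                 nn.Linear(64, act_dim))

    def forward(self, obs, spinal_state):
        return self.net(torch.cat([obs, spinal_state], dim=-1))

class QuadrupedController(nn.Module):
    """Four limb modules coupled through a shared recurrent 'spinal' module."""
    def __init__(self, obs_dim=12, spinal_dim=32, act_dim=3):
        super().__init__()
        self.spinal = nn.GRUCell(4 * obs_dim, spinal_dim)
        self.limbs = nn.ModuleList([LimbModule(obs_dim, spinal_dim, act_dim) for _ in range(4)])

    def forward(self, limb_obs, spinal_state):
        # limb_obs: (batch, 4, obs_dim) proprioceptive input, one row per leg.
        spinal_state = self.spinal(limb_obs.flatten(1), spinal_state)
        actions = [limb(limb_obs[:, i], spinal_state) for i, limb in enumerate(self.limbs)]
        return torch.stack(actions, dim=1), spinal_state

The design choice the sketch is meant to convey is the factorization: each leg gets its own small policy, and coordination between legs flows through a shared recurrent state rather than a single monolithic network.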
For more, stop by our poster this morning! https://t.co/dXEpku6ayx
https://t.co/QTQDutu9n6 (arxiv.org): Current language models are unable to quickly learn new concepts on the fly, often requiring a more involved finetuning process to learn robustly. Prompting in-context is not robust to context...
On Slang Identification, a task consisting of 80 hand-selected recent Twitter slang terms and 120 slang terms sampled from a larger archive, we again outperform In-Context Learning and improve with additional few-shot examples.
CoLLEGe outperforms In-Context Learning by a wide margin on difficult verbal reasoning tasks from a GRE verbal reasoning prep book, and by an even wider margin when definitions are provided as few-shot examples.
Our method can generate plausible definitions for new concepts with only a few examples.
To train, we construct few-shot examples from The Pile and use an objective which mimics pretraining, allowing CoLLEGe to transfer to tasks like GRE Verbal Reasoning, Slang Identification, and Definition Generation without any additional finetuning.
We propose a method to generate embeddings for new concepts on the fly, given few-shot support sequences that use those concepts. These embeddings augment the embeddings of a pretrained LLM, enabling it to model query sequences containing those concepts.
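A minimal sketch of the general recipe, assuming a HuggingFace-style causal LM and tokenizer; the mean-pooling encoder, helper name, and schema here are stand-ins for illustration, not the CoLLEGe architecture.

import torch

def add_new_concept_embedding(model, tokenizer, concept_token, support_sequences, encoder):
    """Create an embedding for a brand-new token from a few support sequences
    and append it to a pretrained LM's input embedding table. `encoder(texts)`
    returning a (num_seqs, hidden_dim) tensor that matches the LM's embedding
    size is an assumed stand-in for the paper's embedding-generation module."""
    support_vecs = encoder(support_sequences)   # (num_seqs, hidden_dim)
    concept_vec = support_vecs.mean(dim=0)      # simple mean pooling as a placeholder

    # Register the new token and grow the embedding matrix by one row.
    tokenizer.add_tokens([concept_token])
    model.resize_token_embeddings(len(tokenizer))
    new_id = tokenizer.convert_tokens_to_ids(concept_token)
    with torch.no_grad():
        model.get_input_embeddings().weight[new_id] = concept_vec
    return new_id

The frozen LM can then process query sequences containing the new token as usual; training the embedding generator with a pretraining-style language-modeling loss on such query sequences, as the thread describes, is what lets it transfer to tasks like GRE Verbal Reasoning, Slang Identification, and Definition Generation without additional finetuning.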