Steven Kolawole

@_stevenkolawole

Followers
2K
Following
7K
Media
64
Statuses
2K

My people! ❤️ Low-budget philosopher. ML Efficiency. PhD-ing @SCSatCMU. @ml_collective's poster child. Big brother to 3 amazing sisters.

Pittsburgh, PA
Joined December 2018
@nsaphra
Naomi Saphra
2 months
I’m recruiting PhD students for 2026! If you are interested in robustness, training dynamics, interpretability for scientific understanding, or the science of LLM analysis, you should apply. BU is building a huge LLM analysis/interp group and you’ll be joining on the ground floor.
@nsaphra
Naomi Saphra
9 months
Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ @najoungkim @amuuueller. Looking for my first students, so apply and reach out!
17
126
668
@chhaviyadav_
Chhavi Yadav
2 months
🚀 Federated Learning (FL) promises collaboration without data sharing. While Cross-Device FL is a success and deployed widely in industry, we don’t see Cross-Silo FL (collaboration between organizations) taking off despite huge demand and interest. Why could this be the case? 🤔
1
12
23
@freakish_prince
Prince Mireku
2 months
It's a beautiful thing looking back to when I joined MLC last year. All I had was a scrambled mind and an interest in research. One step at a time; not an expert yet, but now I have direction and can confidently design and execute research projects thanks to @ml_collective.
@_stevenkolawole
Steven Kolawole
2 months
On today’s MLC-Ng Sunday Specials call, an old-timer returned after a research role at Max Planck (undergrad at UNILAG); another casually mentioned his new role at Google. Joining these calls weekly is humbling: seeing so many talented folks doing big things from small places.
0
2
4
@josh_salako
Josh Salako
2 months
Thanks to this wonderful community, I got selected for the Google DeepMind scholarship for a Master's in Mathematical Science (AI for Science) at the African Institute for Mathematical Sciences, South Africa. Cheers to the next chapter🎉
@_stevenkolawole
Steven Kolawole
2 months
On today’s MLC-Ng Sunday Specials call, an old-timer returned after a research role at Max Planck (undergrad at UNILAG); another casually mentioned his new role at Google. Joining these calls weekly is humbling: seeing so many talented folks doing big things from small places.
11
20
142
@_stevenkolawole
Steven Kolawole
2 months
On today’s MLC-Ng Sunday Specials call, an old-timer returned after a research role at Max Planck (undergrad at UNILAG); another casually mentioned his new role at Google. Joining these calls weekly is humbling: seeing so many talented folks doing big things from small places.
8
17
109
@justinskycak
Justin Skycak
2 months
A general phenomenon that can sneak up on you when you’re at, say, the 99th percentile of a skill: At first, you’re exceptional enough that you receive praise from virtually everyone, and you may never go head-to-head with someone who can beat you. That is, until you join some
@justinskycak
Justin Skycak
2 months
Last year I had a conversation with someone who majored in physics at UChicago. He initially started in math & thought he was prepared having taken AP Calculus BC, but he got smacked in the face by the level of abstraction and proof-writing ability that was assumed. He couldn't
3
13
242
@Jayyanginspires
Jay Yang
2 months
Sleep hits harder when you’re exhausted. Food tastes better when you’ve gone hungry. Water tastes sweeter when you’ve been grinding. Music hits deeper when you’ve been in silence. Deprivation sharpens pleasure. Don’t run from suffering. That’s what gives life its color.
117
2K
13K
@justinskycak
Justin Skycak
2 months
Initially you think you're training hard & smart enough to max out your potential. Then you get to high enough level to realize that, while a few people there do have more inborn talent than you, many of them have no inborn edge over you -- but they're way ahead because they
@justinskycak
Justin Skycak
2 months
Last year I had a conversation with someone who majored in physics at UChicago. He initially started in math & thought he was prepared having taken AP Calculus BC, but he got smacked in the face by the level of abstraction and proof-writing ability that was assumed. He couldn't
9
23
433
@wongmjane
Jane Manchun Wong
3 months
Crazy how the MIT License started as an open source project and popped off so hard they built a university about it 🤯
66
184
8K
@_stevenkolawole
Steven Kolawole
3 months
arXiv:
arxiv.org
Cascade systems route computational requests to smaller models when possible and defer to larger models only when necessary, offering a promising approach to balance cost and quality in LLM...
@_stevenkolawole
Steven Kolawole
3 months
Excited to announce that "Semantic Agreement Enables Efficient Open-Ended LLM Cascades" got accepted to EMNLP 2025 Industry Track! Thread 🧵
0
0
5
@_stevenkolawole
Steven Kolawole
3 months
0
0
3
@ChimaobiOkite
Chimaobi Okite
3 months
Our paper (my first 😌) “Benchmarking and Improving LLM Robustness for Personalized Generation” was accepted to #EMNLP2025 findings🎉 🧑‍💻Data and Code: https://t.co/ULz29xqxz0 Paper: https://t.co/q22FPPpOaY 🧵
arxiv.org
Recent years have witnessed a growing interest in personalizing the responses of large language models (LLMs). While existing evaluations primarily focus on whether a response aligns with a user's...
1
6
45
@_stevenkolawole
Steven Kolawole
3 months
Grateful for this collaboration with @duncansoiffer and @gingsmith, extending model routing/cascading to open-ended generation, using meaning-level consensus to further bridge the theory-practice gap! 📄 Paper: TBA (arXiv is still processing)
1
0
2
@_stevenkolawole
Steven Kolawole
3 months
This work started from a simple observation in my previous work on ABC (https://t.co/r0NE5INY0Q): deployment reality != academic assumptions. Most cascade research assumes white-box access & homogeneous models, but that's not how LLMs are actually deployed in practice.
arxiv.org
Adaptive inference schemes reduce the cost of machine learning inference by assigning smaller models to easier examples, attempting to avoid invocation of larger models when possible. In this work...
1
0
1
@_stevenkolawole
Steven Kolawole
3 months
Why this matters for production: Black-box APIs, constant model updates, heterogeneous serving, & cost pressure are realities that existing cascade methods struggle to handle. Semantic cascades address all these constraints simultaneously: no retraining when models update.
1
0
2
@_stevenkolawole
Steven Kolawole
3 months
This breaks conventional ensemble wisdom. Results across translation, summarization, and QA (500M-70B parameters):
- Match 70B quality at 40% computational cost
- Up to 60% latency reduction
- Works across any model family
- Deployable today
1
0
3
@_stevenkolawole
Steven Kolawole
3 months
But here's the counterintuitive twist: *weaker models actually improve ensemble decisions* through disagreement signals. A 500M model disagreeing with 8B models provides valuable uncertainty info. Even when the weak model is wrong, it helps identify when to defer to larger models.
1
0
2
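(Aside: a toy Python sketch of how that disagreement signal could drive deferral. The semantically_agree predicate is a hypothetical stand-in for whatever equivalence check a system uses, e.g. embedding- or NLI-based; this illustrates the idea, not the paper's exact method.)

from typing import Callable

def defer_on_disagreement(
    candidates: list[str],
    semantically_agree: Callable[[str, str], bool],
) -> bool:
    """Return True (defer to a larger model) when any pair of candidate
    answers fails the semantic-agreement check; even a weak model's
    dissent counts as an uncertainty signal."""
    for i in range(len(candidates)):
        for j in range(i + 1, len(candidates)):
            if not semantically_agree(candidates[i], candidates[j]):
                return True   # disagreement -> uncertain -> escalate
    return False              # consensus -> accept the cheap answer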
@_stevenkolawole
Steven Kolawole
3 months
Our solution: When models say "It's sunny today" vs "The weather is bright," they agree semantically despite different words. That meaning-level consensus predicts reliability better than individual confidence scores, with no training required.
1
0
1
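(Aside: a minimal sketch of one way to compute that meaning-level consensus, scoring candidate answers by mean pairwise cosine similarity of sentence embeddings. The embedding model and the use of cosine similarity are illustrative assumptions, not necessarily the measure used in the paper.)

import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def semantic_agreement(answers: list[str]) -> float:
    """Mean pairwise cosine similarity between candidate answers."""
    emb = encoder.encode(answers, normalize_embeddings=True)  # unit vectors
    sims = emb @ emb.T                           # cosine similarities
    pairs = np.triu_indices(len(answers), k=1)   # each unordered pair once
    return float(sims[pairs].mean())

# Paraphrases score high despite different wording:
# semantic_agreement(["It's sunny today", "The weather is bright"])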
@_stevenkolawole
Steven Kolawole
3 months
Quick primer: LLM cascading means routing queries to small, cheap models when possible, and only using large, expensive models when necessary. Think "escalation system": start small, escalate when needed. Goal: maintain quality while cutting costs.
1
0
1
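(Aside: a minimal Python sketch of that escalation idea, assuming black-box API access as described in this thread. call_model and should_defer are hypothetical placeholders; the thread's proposal bases deferral on semantic agreement between candidate answers rather than token confidence.)

def call_model(model_name: str, prompt: str) -> str:
    """Placeholder for a black-box API call to a hosted LLM."""
    raise NotImplementedError

def should_defer(draft_answer: str, prompt: str) -> bool:
    """Placeholder deferral rule deciding whether the cheap draft is unreliable."""
    raise NotImplementedError

def cascade(prompt: str, small: str = "small-llm", large: str = "large-llm") -> str:
    draft = call_model(small, prompt)     # cheap first pass
    if should_defer(draft, prompt):       # escalate only when necessary
        return call_model(large, prompt)  # defer to the expensive model
    return draft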
@_stevenkolawole
Steven Kolawole
3 months
Open-ended generation breaks traditional cascading. Multiple valid answers exist, so "correct vs incorrect" deferral rules don't work. Token-level confidence optimizes for next-token prediction, not semantic quality. GPT/Claude APIs hide those scores anyway. We needed something different.
1
0
1