Joey Profile
Joey

@joeyism101

Followers: 108 · Following: 11K · Media: 28 · Statuses: 1K

Senior ML Engineer

🍁
Joined December 2014
@akshay_pachaar
Akshay 🚀
3 days
Did Stanford just kill LLM fine-tuning? A new Stanford paper, Agentic Context Engineering (ACE), shows something wild: you can make models smarter without changing a single weight. Here's how it works: instead of retraining the model, ACE evolves the context
41
141
785
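The mechanism described above amounts to a loop over a frozen model: generate, reflect, and fold the lessons back into an ever-growing context. Below is a minimal Python sketch of that idea; the function names and the simple dedup-merge are hypothetical placeholders, not the paper's actual API.

```python
# A minimal sketch of ACE's "evolve the context, not the weights" idea.
# Everything here is a hypothetical illustration, not the paper's code;
# call_llm() stands in for whatever frozen model you query.

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion call to a frozen model."""
    raise NotImplementedError("plug in your model client here")

def merge_into_playbook(playbook: list[str], lessons: list[str]) -> list[str]:
    """Append new lessons, dropping exact duplicates (real systems prune/edit)."""
    seen = set(playbook)
    return playbook + [l for l in lessons if l not in seen]

def evolve_context(tasks: list[str]) -> list[str]:
    playbook: list[str] = []                      # the evolving context
    for task in tasks:
        answer = call_llm("\n".join(playbook) + f"\n\nTask: {task}")
        critique = call_llm(f"Task: {task}\nAnswer: {answer}\n"
                            "List short, reusable lessons, one per line:")
        lessons = [line.strip() for line in critique.splitlines() if line.strip()]
        playbook = merge_into_playbook(playbook, lessons)  # weights never change
    return playbook
```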
@rohanpaul_ai
Rohan Paul
4 days
New paper from @Google is a major memory breakthrough for AI agents. ReasoningBank helps an AI agent improve during use by learning from its wins and mistakes. To succeed in real-world settings, LLM agents must stop making the same mistakes. ReasoningBank memory framework
55
293
2K
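As described, the framework distills strategies from both successful and failed attempts and retrieves the relevant ones for new tasks. A toy sketch under that assumption; naive keyword overlap stands in for whatever retrieval the paper actually uses.

```python
# Hypothetical sketch of a ReasoningBank-style memory loop: store distilled
# strategies from wins and mistakes, retrieve the most relevant for a new task.

from dataclasses import dataclass

@dataclass
class MemoryItem:
    task: str
    strategy: str     # a distilled lesson, e.g. "verify the page loaded before clicking"
    succeeded: bool

class ReasoningMemory:
    def __init__(self) -> None:
        self.items: list[MemoryItem] = []

    def add(self, task: str, strategy: str, succeeded: bool) -> None:
        self.items.append(MemoryItem(task, strategy, succeeded))

    def retrieve(self, task: str, k: int = 3) -> list[str]:
        """Return the k strategies whose source task shares the most words."""
        words = set(task.lower().split())
        scored = sorted(
            self.items,
            key=lambda m: len(words & set(m.task.lower().split())),
            reverse=True,
        )
        return [m.strategy for m in scored[:k]]
```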
@AdoraHQ_
Adora
6 days
You're paying more. Getting less. And have no idea what's working. Adora launches today at @advertisingweek NYC—giving enterprise marketers transparency and control over their performance marketing. #AWNY
0
1
1
@GordonWetzstein
Gordon Wetzstein
1 month
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
22
98
590
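The single-pass, minutes-long generation rests on each query attending to a selected subset of the history rather than all of it. A rough numpy illustration of content-based chunk selection; the mean-key scoring rule here is a simplification for illustration, not the paper's exact routing.

```python
# Rough sketch of sparse, content-routed attention: each query attends only to
# the top-k most relevant chunks of the history instead of the full sequence.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sparse_chunk_attention(q, k_chunks, v_chunks, top_k=2):
    """q: (d,) query; k_chunks/v_chunks: lists of (chunk_len, d) arrays."""
    # Score each history chunk by similarity between the query and its mean key.
    scores = np.array([q @ kc.mean(axis=0) for kc in k_chunks])
    keep = np.argsort(scores)[-top_k:]                  # indices of selected chunks
    k_sel = np.concatenate([k_chunks[i] for i in keep])
    v_sel = np.concatenate([v_chunks[i] for i in keep])
    attn = softmax(q @ k_sel.T / np.sqrt(q.shape[-1]))  # attend within selection only
    return attn @ v_sel
```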
@prime_cai
Shengqu Cai
23 days
Some random thoughts I've been having about video world model/long video generation since working on Mixture of Contexts (whose title could also be "Learnable Sparse Attention for Long Video Generation"): 🚹Semi-long Post Alert🚹 1. Learnable sparse attention is still underrated
@GordonWetzstein
Gordon Wetzstein
1 month
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
6
38
214
@VraserX
VraserX e/acc
1 month
LLMs just learned how to explain their own thoughts. Not only do they generate answers, they can now describe the internal processes that led to those answers, and get better at it with training. We're officially entering the era of self-interpretable AI. Models aren't just
116
264
1K
@rohanpaul_ai
Rohan Paul
1 month
One of the best papers of the recent week. The big takeaway: scaling up model size doesn't just make models smarter in terms of knowledge, it makes them last longer on multi-step tasks, which is what really matters for agents. Shows that small models can usually do one step
21
115
686
@rohanpaul_ai
Rohan Paul
2 months
Another paper claiming a really BIG result. The first method to achieve 99.9% on AIME 2025 with open-source models! đŸ€Ż DeepConf uses a model's own token confidence to keep only its strongest reasoning, with GPT-OSS-120B, while cutting tokens by up to 84.7% compared to standard
20
153
824
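The core idea as stated, filtering sampled reasoning traces by the model's own confidence before voting, can be sketched roughly as below. Mean token log-probability stands in here for the paper's confidence measures, so treat this as an assumption-laden illustration rather than the method itself.

```python
# Hedged sketch of confidence-filtered voting: score each reasoning trace by
# the model's own token confidence, keep only the most confident traces, and
# majority-vote over their final answers.

from collections import Counter

def trace_confidence(token_logprobs: list[float]) -> float:
    """Higher (closer to 0) mean log-prob means the model was more certain."""
    return sum(token_logprobs) / len(token_logprobs)

def confidence_vote(traces: list[dict], keep_frac: float = 0.1) -> str:
    """traces: [{'answer': str, 'token_logprobs': [float, ...]}, ...]"""
    ranked = sorted(traces, key=lambda t: trace_confidence(t["token_logprobs"]),
                    reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_frac))]
    votes = Counter(t["answer"] for t in kept)
    return votes.most_common(1)[0][0]
```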
@cloneofsimo
Simo Ryu
2 months
Nice work, concurrent to ASFT, that tries to diffuse in pixel space by decoding coordinates instead. We may be near the death of latent diffusion
5
28
284
@goyal__pramod
Pramod Goyal
3 months
I never knew how beautifully connected Softmax and Cross-entropy were till I read this.
9
105
1K
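For reference, the connection being praised is the standard derivation that the gradient of cross-entropy through a softmax collapses to p − y:

```latex
% Softmax over logits z, cross-entropy against a one-hot target y:
p_i = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad
L = -\sum_i y_i \log p_i, \qquad
\frac{\partial p_i}{\partial z_k} = p_i\,(\delta_{ik} - p_k)
% Chain rule through the logits:
\frac{\partial L}{\partial z_k}
  = -\sum_i \frac{y_i}{p_i}\, p_i\,(\delta_{ik} - p_k)
  = -y_k + p_k \sum_i y_i
  = p_k - y_k
% i.e. the combined softmax + cross-entropy gradient is simply p - y.
```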
@prathamgrv
pdawg
3 months
found a cool yt channel where someone dumbs down complex ML papers. absolute gold.
16
222
4K
@FEijkelboom
Floor Eijkelboom
3 months
Flow Matching (FM) is one of the hottest ideas in generative AI - and it’s everywhere at #ICML2025. But what is it? And why is it so elegant? đŸ€” This thread is an animated, intuitive intro into (Variational) Flow Matching - no dense math required. Let's dive in! đŸ§”đŸ‘‡
111
258
2K
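For anyone skimming without opening the thread, the core training objective of (conditional) flow matching with a linear interpolation path is the following; the notation is generic, not the thread's.

```latex
% Conditional Flow Matching with a linear interpolation (rectified-flow style) path:
x_t = (1-t)\,x_0 + t\,x_1, \qquad
x_0 \sim p_{\mathrm{noise}}, \quad x_1 \sim p_{\mathrm{data}}, \quad t \sim \mathcal{U}[0,1]
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{t,\,x_0,\,x_1}\left\| v_\theta(x_t, t) - (x_1 - x_0) \right\|^2
% Sampling then integrates dx/dt = v_theta(x, t) from t = 0 to t = 1.
```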
@Hesamation
ℏΔsam
4 months
This is a solid 29-video playlist on how to build DeepSeek from scratch. It covers theory and code, from the very foundations to advanced topics: self-attention, multi-head [latent] attention, GQA, how DeepSeek rewrote quantization, etc. One video a day and you'll finish in a month.
6
199
1K
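One of the components that playlist covers, grouped-query attention, fits in a few lines of numpy: many query heads share a smaller set of key/value heads, which shrinks the KV cache without reducing the number of query heads. An illustrative sketch only, not DeepSeek's implementation (and no causal mask, for brevity).

```python
# Illustrative grouped-query attention (GQA): n_q query heads share n_kv
# key/value heads, with each KV head serving a consecutive group of Q heads.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gqa(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d); n_q_heads % n_kv_heads == 0."""
    n_q, n_kv = q.shape[0], k.shape[0]
    group = n_q // n_kv
    k = np.repeat(k, group, axis=0)   # each KV head is reused by `group` query heads
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])   # (n_q, seq, seq)
    return softmax(scores) @ v                                  # (n_q, seq, d)
```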
@_akhaliq
AK
4 months
Text-Aware Image Restoration with Diffusion Models
3
34
222
@jyo_pari
Jyo Pari
4 months
What if an LLM could update its own weights? Meet SEAL🩭: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model’s downstream performance as reward.
131
524
3K
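The loop described in the tweet, propose a self-edit, fine-tune a copy on it, and reward the policy by the updated model's downstream score, looks roughly like the sketch below. Every function is a hypothetical placeholder, not the paper's code.

```python
# Hypothetical sketch of a SEAL-style step: the model writes its own training
# data ("self-edit"), a copy is fine-tuned on it, and the copy's downstream
# score becomes the reward used to improve the self-edit policy.

def propose_self_edit(model, new_input: str) -> str:
    """Ask the model to write training examples / notes about this new input."""
    raise NotImplementedError

def finetune(model, self_edit: str):
    """Return a copy of the model fine-tuned on the self-edit text."""
    raise NotImplementedError

def downstream_score(model, eval_set) -> float:
    """Evaluate the updated copy on held-out questions about the new input."""
    raise NotImplementedError

def seal_step(model, new_input: str, eval_set) -> float:
    self_edit = propose_self_edit(model, new_input)
    updated = finetune(model, self_edit)
    reward = downstream_score(updated, eval_set)
    # An outer RL loop then reinforces the kinds of self-edits that earned
    # high reward, so the model gets better at teaching itself.
    return reward
```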
@rohanpaul_ai
Rohan Paul
4 months
Google open-sources its DeepSearch stack: get started building full-stack agents with Gemini 2.5 and LangGraph. 📝 Overview: the project has a React frontend and a FastAPI backend built on LangGraph. The agent turns user input into search queries with Gemini, fetches web results
10
122
890
@rohanpaul_ai
Rohan Paul
4 months
An AI agent upgraded its own tools and doubled its bug-fix score. Darwin-style search plus Gödel-style self-reference cracked coding tasks. Pass rate jumps from 20% to 50% on SWE-bench-Verified. Darwin Gödel Machine (DGM) is a coding agent that rewrites its own code, tests
6
54
251
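The Darwin-style part, an archive of agent variants that rewrite their own code and are kept if they still run and score on a coding benchmark, can be sketched as below. The names and the uniform-random selection rule are illustrative assumptions, not the actual implementation.

```python
# Loose sketch of an archive-based self-modification loop in the spirit of DGM.

import random

def dgm_loop(initial_agent_code: str, iterations: int, evaluate, self_modify):
    """evaluate(code) -> float benchmark score; self_modify(code) -> new code."""
    archive = [(initial_agent_code, evaluate(initial_agent_code))]
    for _ in range(iterations):
        parent_code, _ = random.choice(archive)   # open-ended: sample the archive,
        child_code = self_modify(parent_code)     # not just the current best
        try:
            score = evaluate(child_code)
        except Exception:
            continue                              # broken self-edits are discarded
        archive.append((child_code, score))       # keep weaker-but-novel agents too
    return max(archive, key=lambda item: item[1])
```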
@akshay_pachaar
Akshay 🚀
5 months
KV caching in LLMs, clearly explained (with visuals):
14
166
2K
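Since the thread is about KV caching, here is a bare-bones numpy illustration of the idea: keys and values for past tokens are computed once and cached, so each decoding step only projects the newest token instead of reprocessing the whole prefix.

```python
# Minimal KV-cache sketch for one attention head during autoregressive decoding.

import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def decode_step(x_new, W_q, W_k, W_v, cache):
    """x_new: (d_model,) embedding of the newest token; cache: {'k': [...], 'v': [...]}."""
    q = x_new @ W_q                       # query only for the new token
    cache["k"].append(x_new @ W_k)        # reuse all previous K/V, add one new row
    cache["v"].append(x_new @ W_v)
    K = np.stack(cache["k"])              # (t, d_head)
    V = np.stack(cache["v"])
    attn = softmax(q @ K.T / np.sqrt(q.shape[-1]))
    return attn @ V                       # attention output for the new token

# Start with cache = {"k": [], "v": []}; each step costs O(t) instead of O(t^2).
```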
@omarsar0
elvis
5 months
LLMs Get Lost in Multi-turn Conversation. The cat is out of the bag. Pay attention, devs. This is one of the most common issues when building with LLMs today. Glad there is now a paper to share insights. Here are my notes:
98
632
4K
@cloneofsimo
Simo Ryu
6 months
10B-parameter DiT trained on 80M images, all owned by @freepik. Model commercially usable, raw model without distillation, open-sourced. Proud to present our first model-training project with our client @freepik: "F-Lite", from @FAL
@ivanprado
IvĂĄn de Prado
6 months
🚀Excited to announce F Lite: a new open-source text-to-image model by @freepik and @FAL! The first at this scale that’s both open-source and trained exclusively on licensed, high-quality data.đŸ§”
20
66
500