
Joey
@joeyism101
Followers: 108 · Following: 11K · Media: 28 · Statuses: 1K
Senior ML Engineer
Joined December 2014
Did Stanford just kill LLM fine-tuning? This new paper from Stanford, called Agentic Context Engineering (ACE), proves something wild: you can make models smarter without changing a single weight. Here's how it works: Instead of retraining the model, ACE evolves the context
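The loop this tweet describes (evolving the context rather than the weights) can be pictured with a small sketch. Everything below is an illustrative assumption, not the paper's code: generate_answer and critique are hypothetical stand-ins for LLM calls, and the "playbook" is the evolving context.

# Illustrative sketch of ACE-style context evolution (hypothetical stubs, weights stay frozen).

def generate_answer(task: str, playbook: str) -> str:
    # Stand-in for an LLM call that conditions on the evolving playbook.
    return f"answer({task})"

def critique(task: str, answer: str) -> str:
    # Stand-in for a reflection step that extracts a reusable lesson.
    return f"lesson learned on {task}"

def evolve_context(tasks, playbook=""):
    """No gradient updates; only the context (playbook) changes between tasks."""
    for task in tasks:
        answer = generate_answer(task, playbook)
        lesson = critique(task, answer)
        playbook += f"\n- {lesson}"   # accumulate strategies instead of updating weights
    return playbook

print(evolve_context(["task A", "task B"]))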
New paper from @Google is a major memory breakthrough for AI agents. ReasoningBank helps an AI agent improve during use by learning from its wins and mistakes. To succeed in real-world settings, LLM agents must stop making the same mistakes. ReasoningBank memory framework
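A minimal sketch of the idea of an experience memory that stores both wins and mistakes and retrieves the most relevant ones for a new task. The class and the keyword-overlap retrieval below are my assumptions, not the ReasoningBank implementation.

# Toy success/failure memory for an agent (hypothetical, not the paper's code).
from dataclasses import dataclass, field

@dataclass
class MemoryItem:
    task: str
    strategy: str
    succeeded: bool

@dataclass
class ReasoningMemory:
    items: list = field(default_factory=list)

    def add(self, task, strategy, succeeded):
        # Store both wins and mistakes; failures become "what to avoid" notes.
        self.items.append(MemoryItem(task, strategy, succeeded))

    def retrieve(self, task, k=3):
        # Toy relevance score: word overlap with past task descriptions.
        words = set(task.lower().split())
        scored = sorted(
            self.items,
            key=lambda m: len(words & set(m.task.lower().split())),
            reverse=True,
        )
        return scored[:k]

mem = ReasoningMemory()
mem.add("book a flight", "check dates before paying", succeeded=True)
mem.add("book a hotel", "skipped the cancellation policy", succeeded=False)
for m in mem.retrieve("book a cheap flight"):
    print(("DO: " if m.succeeded else "AVOID: ") + m.strategy)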
You're paying more. Getting less. And have no idea what's working. Adora launches today at @advertisingweek NYC, giving enterprise marketers transparency and control over their performance marketing. #AWNY
How do we generate videos on the scale of minutes, without drifting or forgetting about the historical context? We introduce Mixture of Contexts. Every minute-long video below is the direct output of our model in a single pass, with no post-processing, stitching, or editing. 1/4
Some random thoughts I've been having about video world models / long video generation since working on Mixture of Contexts (whose title could also be "Learnable Sparse Attention for Long Video Generation"). Semi-long post alert: 1. Learnable sparse attention is still underrated
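A toy version of what "learnable sparse attention over a long context" can look like: route each query to a few relevant context chunks and attend only inside those. The shapes, the mean-pooled chunk key, and the top-k routing are my assumptions for illustration, not the Mixture of Contexts implementation.

# Toy top-k chunk routing for attention over a long context (shapes assumed, not the MoC code).
import torch
import torch.nn.functional as F

def chunked_sparse_attention(q, k, v, chunk_size=16, top_k=2):
    # q: (Tq, d); k, v: (Tk, d) with Tk divisible by chunk_size for simplicity.
    d = q.shape[-1]
    k_chunks = k.view(-1, chunk_size, d)          # (C, chunk, d)
    v_chunks = v.view(-1, chunk_size, d)
    chunk_keys = k_chunks.mean(dim=1)             # (C, d): one summary key per chunk
    scores = q @ chunk_keys.T                     # (Tq, C) query-to-chunk relevance
    idx = scores.topk(top_k, dim=-1).indices      # which chunks each query attends to
    out = torch.zeros_like(q)
    for i in range(q.shape[0]):
        ks = k_chunks[idx[i]].reshape(-1, d)      # gather only the selected chunks
        vs = v_chunks[idx[i]].reshape(-1, d)
        attn = F.softmax(q[i] @ ks.T / d ** 0.5, dim=-1)
        out[i] = attn @ vs
    return out

q = torch.randn(4, 32); k = torch.randn(64, 32); v = torch.randn(64, 32)
print(chunked_sparse_attention(q, k, v).shape)    # torch.Size([4, 32])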
LLMs just learned how to explain their own thoughts. Not only do they generate answers, they can now describe the internal processes that led to those answers... and get better at it with training. We're officially entering the era of self-interpretable AI. Models aren't just
One of the best papers of recent weeks. The big takeaway: scaling up model size doesn't just make models smarter in terms of knowledge, it makes them last longer on multi-step tasks, which is what really matters for agents. Shows that small models can usually do one step
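The "last longer on multi-step tasks" point is essentially per-step reliability compounding. A quick back-of-envelope check with made-up numbers (not figures from the paper):

# Back-of-envelope: per-step accuracy compounds over a long task (illustrative numbers only).
for per_step in (0.90, 0.99, 0.999):
    for steps in (10, 100):
        print(f"per-step {per_step:.3f}, {steps:3d} steps -> task success {per_step ** steps:.2f}")
# A 0.99 per-step model still fails most 100-step tasks (~0.37); 0.999 mostly survives (~0.90).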
Another paper claiming a really BIG result: the first method to achieve 99.9% on AIME 2025 with open-source models! DeepConf uses a model's own token confidence to keep only its strongest reasoning, hitting that score with GPT-OSS-120B while cutting tokens by up to 84.7% compared to standard
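The general recipe the tweet points at, filter sampled reasoning traces by the model's own token confidence and then vote, can be sketched in a few lines. The mean-logprob score and the keep fraction below are my assumptions, not DeepConf's exact recipe.

# Sketch: keep only high-confidence reasoning traces, then majority-vote over the survivors.
from collections import Counter

def confidence(trace_logprobs):
    # Mean token log-probability as a proxy for how confident the model was in a trace.
    return sum(trace_logprobs) / len(trace_logprobs)

def vote_with_confidence(traces, keep_frac=0.5):
    # traces: list of (final_answer, token_logprobs) from independent samples.
    ranked = sorted(traces, key=lambda t: confidence(t[1]), reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_frac))]
    return Counter(ans for ans, _ in kept).most_common(1)[0][0]

samples = [
    ("42", [-0.1, -0.2, -0.1]),   # confident trace
    ("42", [-0.3, -0.2, -0.4]),
    ("17", [-2.5, -3.0, -2.8]),   # low-confidence trace, gets filtered out
]
print(vote_with_confidence(samples))  # 42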
Nice work, concurrent to ASFT, that tries to diffuse in pixel space by decoding its coordinates instead. We may be near the death of latent diffusion.
I never knew how beautifully connected Softmax and Cross-entropy were till I read this.
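The connection being admired here: the gradient of cross-entropy composed with softmax collapses to probabilities minus the one-hot target. A quick numeric check:

# d/dz CE(softmax(z), y) = softmax(z) - onehot(y): verify the closed form numerically.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
y = 1                                   # true class index
p = softmax(z)
analytic = p - np.eye(3)[y]             # the famous closed-form gradient

eps = 1e-6
numeric = np.zeros_like(z)
for i in range(3):
    zp, zm = z.copy(), z.copy()
    zp[i] += eps; zm[i] -= eps
    # CE(z) = -log softmax(z)[y]; central finite difference along z_i
    numeric[i] = (-np.log(softmax(zp)[y]) + np.log(softmax(zm)[y])) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))  # True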
found a cool yt channel where someone dumbs down complex ML papers. absolute gold.
Flow Matching (FM) is one of the hottest ideas in generative AI - and it's everywhere at #ICML2025. But what is it? And why is it so elegant? This thread is an animated, intuitive intro to (Variational) Flow Matching - no dense math required. Let's dive in!
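A bare-bones version of the training objective behind threads like this: sample noise x0 and data x1, interpolate along a straight line, and regress a network onto the constant velocity x1 - x0. This is the conditional / rectified-flow flavor of flow matching, a sketch rather than the thread's exact variant.

# Minimal conditional flow matching: regress v_theta(x_t, t) onto x1 - x0, then Euler-integrate.
import torch, torch.nn as nn

dim = 2
vel_net = nn.Sequential(nn.Linear(dim + 1, 64), nn.ReLU(), nn.Linear(64, dim))
opt = torch.optim.Adam(vel_net.parameters(), lr=1e-3)

def sample_data(n):
    # Toy target distribution: two Gaussian blobs.
    centers = torch.tensor([[2.0, 2.0], [-2.0, -2.0]])
    return centers[torch.randint(0, 2, (n,))] + 0.2 * torch.randn(n, dim)

for step in range(2000):
    x1 = sample_data(256)                  # data sample
    x0 = torch.randn(256, dim)             # noise sample
    t = torch.rand(256, 1)
    xt = (1 - t) * x0 + t * x1             # straight-line probability path
    target_v = x1 - x0                     # constant velocity along that path
    pred_v = vel_net(torch.cat([xt, t], dim=-1))
    loss = ((pred_v - target_v) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Sampling: integrate dx/dt = v_theta(x, t) from t=0 (noise) to t=1 with Euler steps.
x = torch.randn(5, dim)
for i in range(50):
    t = torch.full((5, 1), i / 50)
    x = x + (1 / 50) * vel_net(torch.cat([x, t], dim=-1))
print(x)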
This is a solid 29-video playlist on how to build DeepSeek from scratch. It covers theory and code, from the very foundations to advanced topics: self-attention, multi-head [latent] attention, GQA, how DeepSeek rewrote quantization, etc. One video a day and you'll finish in a month.
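One of the pieces that playlist covers, grouped-query attention, fits in a few lines: several query heads share each KV head. The shapes below are assumptions for illustration, not DeepSeek's code.

# Grouped-query attention sketch: 8 query heads share 2 KV heads.
import torch
import torch.nn.functional as F

B, T, n_q_heads, n_kv_heads, d_head = 1, 6, 8, 2, 16
q = torch.randn(B, n_q_heads, T, d_head)
k = torch.randn(B, n_kv_heads, T, d_head)
v = torch.randn(B, n_kv_heads, T, d_head)

# Each group of n_q_heads // n_kv_heads query heads reuses the same K/V head.
group = n_q_heads // n_kv_heads
k = k.repeat_interleave(group, dim=1)      # (B, n_q_heads, T, d_head)
v = v.repeat_interleave(group, dim=1)

scores = q @ k.transpose(-2, -1) / d_head ** 0.5
out = F.softmax(scores, dim=-1) @ v
print(out.shape)                            # torch.Size([1, 8, 6, 16])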
What if an LLM could update its own weights? Meet SEAL: a framework where LLMs generate their own training data (self-edits) to update their weights in response to new inputs. Self-editing is learned via RL, using the updated model's downstream performance as the reward.
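A very rough picture of the outer loop, with every function a placeholder stub of my own; the real SEAL trains the self-edit policy with RL and performs actual finetuning, which this toy does not.

# Hypothetical outer loop for SEAL-style self-edits (all functions are placeholder stubs).
import random

def propose_self_edit(model, passage):
    # Stand-in for the model writing its own training data (e.g. restated facts, QA pairs).
    return [f"synthetic example derived from: {passage}"]

def finetune(model, examples):
    # Stand-in for a lightweight weight update on the self-generated data.
    return model + len(examples)            # the "model" is just a number in this toy

def downstream_score(model):
    # Stand-in for evaluation on held-out questions about the new input.
    return model + random.random()

model = 0
for passage in ["new document A", "new document B"]:
    edits = propose_self_edit(model, passage)
    updated = finetune(model, edits)
    reward = downstream_score(updated) - downstream_score(model)
    # In SEAL this reward trains the self-edit policy with RL; here we just keep good updates.
    if reward > 0:
        model = updated
print(model)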
Google open-sources its DeepSearch stack. Get started with building fullstack agents using Gemini 2.5 and LangGraph. Overview: the project has a React frontend and a FastAPI backend built on LangGraph. The agent turns user input into search queries with Gemini, fetches web results
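The rough shape of that agent graph in LangGraph. The node bodies below are placeholders of mine, not the project's actual Gemini and search calls; only the graph wiring uses the real LangGraph API (requires `pip install langgraph`).

# Sketch of a query -> search -> answer agent graph (placeholder node bodies).
from typing import TypedDict, List
from langgraph.graph import StateGraph, END

class AgentState(TypedDict):
    question: str
    queries: List[str]
    results: List[str]
    answer: str

def generate_queries(state: AgentState):
    # Real project: Gemini turns the user question into search queries.
    return {"queries": [f"search: {state['question']}"]}

def web_search(state: AgentState):
    # Real project: fetch web results for each query.
    return {"results": [f"result for {q}" for q in state["queries"]]}

def write_answer(state: AgentState):
    # Real project: Gemini synthesizes an answer from the gathered results.
    return {"answer": " / ".join(state["results"])}

graph = StateGraph(AgentState)
graph.add_node("generate_queries", generate_queries)
graph.add_node("web_search", web_search)
graph.add_node("write_answer", write_answer)
graph.set_entry_point("generate_queries")
graph.add_edge("generate_queries", "web_search")
graph.add_edge("web_search", "write_answer")
graph.add_edge("write_answer", END)

app = graph.compile()
print(app.invoke({"question": "what is langgraph?"}))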
An AI agent upgraded its own tools and doubled its bug-fix score. Darwin-style search plus Gödel-style self-reference cracked coding tasks: pass rate jumps from 20% to 50% on SWE-bench-Verified. Darwin Gödel Machine (DGM) is a coding agent that rewrites its own code, tests
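The "Darwin-style search" part, keep an archive of agent variants, mutate them, benchmark, and keep what works, looks roughly like this toy loop. mutate and benchmark_score are stand-ins of mine, not the DGM code.

# Toy Darwin-style archive search over agent variants (stand-in functions).
import random

def mutate(agent_code: str) -> str:
    # Stand-in for "the agent rewrites its own code" (e.g. adds a tool or edits a prompt).
    return agent_code + f" + patch{random.randint(0, 99)}"

def benchmark_score(agent_code: str) -> float:
    # Stand-in for running the variant on coding tasks and measuring pass rate.
    return random.random()

archive = {"base-agent": benchmark_score("base-agent")}
for _ in range(20):
    parent = random.choice(list(archive))       # sample from the whole archive, not just the best
    child = mutate(parent)
    archive[child] = benchmark_score(child)

best = max(archive, key=archive.get)
print(best, round(archive[best], 2))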
KV caching in LLMs, clearly explained (with visuals):
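The idea in one loop: at each decode step, compute K/V only for the new token, append them to a cache, and attend over the cached history instead of recomputing it. A single-head, no-batch sketch with assumed dimensions:

# KV caching: reuse past keys/values so each new token attends without recomputation.
import torch
import torch.nn.functional as F

d = 16
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
cache_k, cache_v = [], []

def decode_step(x_new):
    # x_new: (1, d) embedding of the newly generated token.
    q = x_new @ Wq
    cache_k.append(x_new @ Wk)       # only the NEW token's K/V are computed...
    cache_v.append(x_new @ Wv)
    K = torch.cat(cache_k)           # ...past ones are read straight from the cache
    V = torch.cat(cache_v)
    attn = F.softmax(q @ K.T / d ** 0.5, dim=-1)
    return attn @ V

for t in range(5):                   # 5 decode steps; the cache grows by one entry per step
    out = decode_step(torch.randn(1, d))
print(out.shape, len(cache_k))       # torch.Size([1, 16]) 5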
LLMs Get Lost in Multi-turn Conversation. The cat is out of the bag. Pay attention, devs. This is one of the most common issues when building with LLMs today. Glad there is now a paper sharing insights. Here are my notes: