Prateek Joshi

@PrateekVJoshi

Followers: 7K
Following: 8K
Media: 545
Statuses: 10K

infra investor at @MoxxieVentures | author of 13 AI books | nvidia alum | recovering founder

San Francisco Bay Area
Joined September 2009
@PrateekVJoshi
Prateek Joshi
4 hours
search APIs are infra, not features. freshness, vertical recall, and provenance need SLAs as opposed to footnotes. generic search loses to domain-tuned pipelines with traceable sources. own crawl cadence, schema, and disambiguation. expose “why this result” like a receipt.
0
0
0
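The "expose 'why this result' like a receipt" idea can be made concrete. A minimal sketch, assuming a hypothetical `Provenance` record attached to each search hit; the field names (`source_url`, `matched_fields`, `score_breakdown`) are illustrative, not any real API:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Provenance:
    """Receipt for one search hit: where it came from and why it matched."""
    source_url: str
    crawled_at: datetime                      # freshness as a recorded fact, not a footnote
    matched_fields: list = field(default_factory=list)
    score_breakdown: dict = field(default_factory=dict)

    def receipt(self) -> str:
        """Human-readable 'why this result' string."""
        age_days = (datetime.now(timezone.utc) - self.crawled_at).days
        reasons = ", ".join(
            f"{f}={self.score_breakdown.get(f, 0):.2f}" for f in self.matched_fields
        )
        return f"{self.source_url} (crawled {age_days}d ago; matched on {reasons})"
```

With freshness and per-field scores stored per hit, an SLA ("no result older than N days") becomes a simple filter rather than a promise.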
@PrateekVJoshi
Prateek Joshi
9 hours
dropping out of yc to do yc. it’s called metayc
0
0
0
@TuttleCapital
Matthew Tuttle
18 hours
7
6
113
@PrateekVJoshi
Prateek Joshi
1 day
except worker 17 of course
@uwukko
wukko
1 day
cloudflare workers are really fast as of late god damn
0
0
0
@PrateekVJoshi
Prateek Joshi
1 day
the wait is over
@nikitabier
Nikita Bier
1 day
After 10 years of asking, we are finally rolling out synced drafts: Drafts written on the X app will now be there when you login on the web. Happy poasting.
0
0
1
@PrateekVJoshi
Prateek Joshi
1 day
incredible! thank you for doing the lord’s work
@karpathy
Andrej Karpathy
2 days
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,
0
0
1
@PrateekVJoshi
Prateek Joshi
1 day
this is how you improve the margins on your ai app
@ai_for_success
AshutoshShrivastava
3 days
Search OPENAI_API_KEY on GitHub and thank Vibe Coders.
0
0
0
@PrateekVJoshi
Prateek Joshi
1 day
this captures it well
@karpathy
Andrej Karpathy
2 days
@zenitsu_aprntc Good question, it's basically entirely hand-written (with tab autocomplete). I tried to use claude/codex agents a few times but they just didn't work well enough at all and net unhelpful, possibly the repo is too far off the data distribution.
0
0
0
@PrateekVJoshi
Prateek Joshi
2 days
in case you were wondering how complicated it really is
0
0
0
@PrateekVJoshi
Prateek Joshi
2 days
nobody trusts your model. everyone trusts win-rate on their golden set.
0
0
1
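A golden-set win rate is easy to compute once you fix a judge. A minimal sketch, assuming a user-supplied `judge(candidate, baseline, reference)` predicate (here a toy exact-match judge; real setups often use a human or an LLM judge):

```python
def win_rate(candidate_outputs, baseline_outputs, golden, judge):
    """Fraction of golden-set examples where the candidate beats the baseline.

    `judge(candidate, baseline, reference)` returns True when the candidate
    wins on that example.
    """
    assert len(candidate_outputs) == len(baseline_outputs) == len(golden)
    wins = sum(
        judge(c, b, g)
        for c, b, g in zip(candidate_outputs, baseline_outputs, golden)
    )
    return wins / len(golden)

# toy judge: candidate matches the golden reference and the baseline does not
exact = lambda c, b, g: (c == g) and (b != g)
```

The model is an implementation detail; the number everyone argues about is this one.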
@PrateekVJoshi
Prateek Joshi
2 days
observability for long-running agents is a weird beast. single LLM calls are easy to debug. long chains are crime scenes. you need spans, traces, replays, and “what if we changed step 7?” store tool i/o, prompt diffs, and policy versions like source code. one-click
0
0
1
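The spans/traces/replay idea can be sketched in a few lines. A hypothetical `TraceRecorder`, not any real tracing library's API; the point is that each step's tool i/o and prompt version are stored like source code, so a "what if we changed step 7?" replay is just a slice:

```python
import time
import uuid

class TraceRecorder:
    """Append-only trace: one span per agent step, replayable later."""

    def __init__(self):
        self.spans = []

    def record(self, step, tool, inputs, outputs, prompt_version):
        self.spans.append({
            "span_id": uuid.uuid4().hex,
            "step": step,
            "tool": tool,
            "inputs": inputs,
            "outputs": outputs,
            "prompt_version": prompt_version,  # versioned like code, diffable later
            "ts": time.time(),
        })

    def replay_from(self, step):
        """Spans at or after `step`: the slice you re-run for a counterfactual."""
        return [s for s in self.spans if s["step"] >= step]
```

A single LLM call needs none of this; a fifty-step chain is unreconstructable without it.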
@PrateekVJoshi
Prateek Joshi
3 days
messy workflows → trainable environments. your logs, UIs, and APIs are an unlabeled RL gym waiting to happen. instrument them and you compress weeks of onboarding into hours. capture states, actions, feedback, and failure modes. simulate before writing to prod. don’t beg for
0
0
1
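The "logs as an unlabeled RL gym" claim can be made concrete with an offline replay environment. A minimal sketch under stated assumptions: `log` is a hypothetical list of `(state, action)` records captured from app instrumentation, and the reward is simply agreement with what the human operator actually did (behavioral cloning as a reward signal):

```python
class LoggedWorkflowEnv:
    """Replay logged (state, action) tuples as an offline RL environment."""

    def __init__(self, log):
        self.log = log  # e.g. [{"state": ..., "action": ...}, ...]
        self.t = 0

    def reset(self):
        self.t = 0
        return self.log[0]["state"]

    def step(self, action):
        entry = self.log[self.t]
        # reward: did the agent do what the human operator did at this step?
        reward = 1.0 if action == entry["action"] else 0.0
        self.t += 1
        done = self.t >= len(self.log)
        next_state = None if done else self.log[self.t]["state"]
        return next_state, reward, done
```

Because the environment only replays logs, the agent can fail safely thousands of times before it ever touches prod.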
@PrateekVJoshi
Prateek Joshi
3 days
“under similar conditions”
@docmilanfar
Peyman Milanfar
4 days
when you compare your results to the other paper's “under similar conditions”
0
0
0
@PrateekVJoshi
Prateek Joshi
3 days
everything is about creating leverage
@MeghanBobrowsky
Meghan Bobrowsky
4 days
Saturday scoop: Thinking Machines Lab co-founder Andrew Tulloch has joined Meta, the startup confirmed. W/ @keachhagey
0
0
0
@PrateekVJoshi
Prateek Joshi
3 days
RL for agents = ops. everyone romanticizes rewards in RL. but the hard part is budgets, replay buffers, and “don’t break prod”. treat RL like SREs treat incidents. define observable rewards tied to outcomes. keep offline datasets clean and counterfactuals handy.
0
0
2
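The "budgets and don't break prod" half of RL ops can be sketched as a replay buffer with a hard collection cap. A hypothetical `BudgetedReplayBuffer`, illustrative names only:

```python
import collections
import random

class BudgetedReplayBuffer:
    """Replay buffer with an episode budget: the unglamorous ops side of RL."""

    def __init__(self, capacity, episode_budget):
        self.buf = collections.deque(maxlen=capacity)
        self.episode_budget = episode_budget   # hard cap on prod data collection
        self.episodes_used = 0

    def can_collect(self):
        return self.episodes_used < self.episode_budget

    def add_episode(self, transitions):
        if not self.can_collect():
            raise RuntimeError("episode budget exhausted; refusing further collection")
        self.buf.extend(transitions)
        self.episodes_used += 1

    def sample(self, k):
        return random.sample(list(self.buf), min(k, len(self.buf)))
```

Treating the budget as an invariant that raises, rather than a guideline, is the SRE instinct the tweet points at.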
@PrateekVJoshi
Prateek Joshi
4 days
RLP
@sivareddyg
Siva Reddy
4 days
Lot of insights in @YejinChoinka's talk on RL training. Rip for next token prediction training (NTP) and welcome to Reinforcement Learning Pretraining (RLP). #COLM2025 No place to even stand in the room.
0
0
1
@PrateekVJoshi
Prateek Joshi
4 days
nice findings
@josancamon19
Joan Cabezas
5 days
🧵 As AI labs race to scale RL, one question matters: when should you stop pre-training and start RL? We trained 5 Qwen models (0.6B→14B) with RL on GSM8K and found something wild: Small models see EMERGENCE-LIKE jumps. Large models see diminishing returns. The scaling law?
0
0
2
@PrateekVJoshi
Prateek Joshi
4 days
Exa is onto something special. Amazing product!
@ExaAILabs
Exa
5 days
Introducing Exa 2.0 Breakthroughs in our AI research and engineering have enabled us to build both the fastest search API (<350ms) and the highest quality search on the market. Product and technical deep dive below:
0
0
2
@PrateekVJoshi
Prateek Joshi
4 days
agent orchestration is the control plane. models are table stakes. routing, memory, tools, and rollback are turning out to be the differentiators. it's like kubernetes but for decisions and side effects. give pm-friendly levers. who can call what tool, what gets cached, when
0
0
1
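The "PM-friendly levers" point suggests keeping routing and permissions as plain data rather than code. A minimal sketch, assuming a hypothetical policy table and `call_tool` router; roles, tool names, and the caching rule are all illustrative:

```python
# who can call what tool, and what gets cached -- levers as data, not code
POLICY = {
    "support_agent": {"tools": {"search_docs", "create_ticket"},
                      "cacheable": {"search_docs"}},
    "billing_agent": {"tools": {"lookup_invoice"},
                      "cacheable": set()},
}

_cache = {}

def call_tool(role, tool, arg, impl):
    """Route a tool call through the policy table before executing it."""
    policy = POLICY.get(role, {})
    if tool not in policy.get("tools", set()):
        raise PermissionError(f"{role} may not call {tool}")
    key = (tool, arg)
    cacheable = tool in policy.get("cacheable", set())
    if cacheable and key in _cache:
        return _cache[key]           # cached side-effect-free result
    result = impl(arg)
    if cacheable:
        _cache[key] = result
    return result
```

Because the table is data, a PM can change who calls what without a code review, which is exactly the control-plane framing.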
@PrateekVJoshi
Prateek Joshi
4 days
Stefano Ermon is the OG of diffusion LLMs. Here's my convo with him. Really insightful. And he's great at explaining things.
@_inception_ai
Inception
4 days
Our CEO @StefanoErmon joined the Infinite Curiosity Podcast and shared how our Mercury diffusion LLMs deliver faster, cheaper models and why diffusion is reshaping coding, reasoning, and multimodal AI. Thanks for having him on @PrateekVJoshi! https://t.co/9gTrf5IEMV
0
0
1
@PrateekVJoshi
Prateek Joshi
5 days
The RL in IRL stands for reinforcement learning. remember this when someone wants to meet you IRL. you're welcome!
0
0
0