
Viraj (@tunedgradient)
applied AI & intelligence. notes on how we think, and how our machines learn.
San Francisco · Joined August 2025
46 Followers · 750 Following · 41 Media · 386 Statuses
sft is like lego blocks, rl is like ikea furniture with missing screws. everyone keeps promising plug-n-play, but rl isn’t like sft. it’s not just 'load data, train model.' every env+algo pairing adds new quirks, new knobs to tune, new failure modes. the bright side: every
we’re approaching the end of 2025 and there’s still no plug-n-play RL lib. in the interim:
- i built a shitty version of this (llamagym)
- RL started working (o1)
- oss found out how it worked (r1)
- “RL env” became the new buzzword
- oss RL envs unified around `verifiers`
what i like about detailbench is that it flips the usual framing. it’s not 'can the model follow instructions' but 'can it notice when something’s just a bit off.' catching a wrong digit in the middle of a translation is a very different skill than writing fluent text. most llms
(Re-)launching DetailBench! After a lot of feedback in the comments that LLMs should *always* notify about mistakes, I changed the scoring. Well, let's just say it didn't really help 🙃
sums up llm progress: basically evolutionary search. countless runs branch out, most dead-end, a few checkpoints survive and get refined. evals act as the selection pressure, turning random exploration into structured, compounding progress.
great opportunity to make an impact! building evals right is basically compounding leverage on the whole field.
i'm hiring for a new team @openai: Applied Evals our goal is to build the world's best evals for the economically valuable tasks our customers care about most. we'll execute as a group of high‑taste engineers, combining hands-on, unscalable efforts with systems that others can
not surprised. for non-math folks: the hodge conjecture is one of the clay millennium problems, about which geometric shapes can be described algebraically. and in this case, the grandiosity of this paper itself gives it away, nobody quietly knocks down a clay prize in bullet
exploit prompt format bias, don’t mistake it for magic. llms just inherit a huge prior from oceans of html/xml, so they're good at <open>/<close> delimiters, tolerant of free-form text, and less brittle than json’s quotes/commas. so you may see better results. but if you need
Literally just append this to the prompt. The results are incredible: “Before answering, <think> inside XML tags for at least 15 paragraphs.”
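not from the tweet, just a toy illustration of the claim: a lax regex recovers an <open>/<close> span even when the text inside has quotes and commas, while the same free-form text embedded in json breaks strict parsing (the strings here are invented examples):

```python
import json
import re

# xml-style tags: a lax regex recovers the span despite quotes/commas inside
raw = 'preamble <think>step 1: it\'s "rough", free-form text</think> answer'
m = re.search(r"<think>(.*?)</think>", raw, re.DOTALL)
print(m.group(1))  # the span survives intact

# the same free-form text in json, without escaping the inner quotes, fails
bad = '{"note": "step 1: it\'s "rough", free-form text"}'
try:
    json.loads(bad)
except json.JSONDecodeError as e:
    print("json failed:", e)
```

the point is about tolerance: the tag-delimited format degrades gracefully around messy content, json does not.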
weird realization: ai folks are building h-nets (models that just invent their own tokens from raw bytes). physicists are chasing the muon g-2 with crazy precision. both sounded like they might change the game. instead, we mostly got… cleaner numbers. are we just maxing rigor
Between h-net disappointment and mixed muon experiments, it might not seem like the best of times to be working on model design. Yet as someone primarily on the data side I’m increasingly convinced something is off.
people talk about 'memorization vs understanding' like it’s a clean line. never was. humans memorize patterns & call it intuition. models compress patterns & we call it memorization. so maybe 'understanding' is just the name we give to useful compression that transfers.
A student who truly understands F=ma can solve more novel problems than a Transformer that has memorized every physics textbook ever written.
there's still so much confusion nowadays around parameter count vs perf
This is a myopic answer that doesn't consider hardware or problem type. A 2B parameter model is both EXTREMELY fast when using the proper hardware and EXTREMELY accurate when used on the correct use case. There are no silver bullets. Good engineering is about considering the
great thread on evals. feels like we’ve reached consensus: you can prototype fast without them, but once you’re aiming for reliability (esp in sensitive domains like health or legal), they’re the difference between pilot hell and prod.
Are evals required to build great AI applications? There’s been a ton of discussion on this recently, and I wanted to share my POV working on evals for startups and enterprises at @OpenAI 👇
what if the real bottleneck isn’t image data at all, but reasoning traces? if text-only 'thinking data' can boost generation, maybe the cheapest supervision ends up being the most powerful.
7/ and in a flourish: gpt-4o generates multimodal visuals conditioned on flow fields, e.g. “diver with bubbles along streamlines.” not just pretty pictures, but tied to simulation data. ( https://t.co/TPswyXI8HB)
6/ it also doubles as a tutor. don’t know what a cfl number is? gpt explains and suggests defaults inline, turning advanced cfd setup into an interactive lesson. ( https://t.co/TPswyXI8HB)
5/ post-sim, gpt can auto-generate scripts for “plot instantaneous cd over time” or “slice velocity at y=0.” it edits params, runs analysis, and visualizes results. ( https://t.co/TPswyXI8HB)
4/ three agents, each steered by gpt-4o (5 will be better!):
- preprocessing: builds 3d meshes from text/image (point-e pipeline)
- solver: asks for reynolds, cfl, timestep, writes config files
- postprocessing: scripts plots, drag/lift curves, streamlines, even photo-real
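a minimal sketch of how such a three-stage pipeline could be wired; the function names and the `ask_llm` stub are invented for illustration and are not the paper's actual interfaces:

```python
# hypothetical three-agent cfd pipeline; names and the ask_llm stub are
# invented for illustration, not taken from the cfdagent paper.
def ask_llm(prompt: str) -> str:
    # stand-in for a real gpt-4o call
    return f"[llm output for: {prompt}]"

def preprocess(description: str) -> str:
    # build a 3d mesh from a text/image description (point-e style)
    return ask_llm(f"generate mesh: {description}")

def solve(mesh: str, reynolds: float, cfl: float, dt: float) -> str:
    # write solver config (reynolds, cfl, timestep) and run the ib solver
    return ask_llm(f"run solver on {mesh} (Re={reynolds}, CFL={cfl}, dt={dt})")

def postprocess(result: str) -> str:
    # script plots: drag/lift curves, streamlines, slices
    return ask_llm(f"plot drag/lift curves for: {result}")

report = postprocess(solve(preprocess("cylinder in channel flow"),
                           reynolds=100.0, cfl=0.5, dt=1e-3))
print(report)
```

the design point is just the staging: each agent consumes the previous agent's artifact, so the llm acts as planner and glue between classical tools rather than replacing the solver.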
3/ in cfdagent gpt is the planner, coder, and tutor that makes a classical ib solver usable from natural language. ( https://t.co/TPswyXI8HB)
2/ what ai is changing is not the physics but everything around it. geometry generation, meshing, parameter setup, scripting, visualization. the glue work that used to be the bottleneck. ( https://t.co/TPswyXI8HB)
1/ cfd has always been the domain of specialists with cad licenses, meshing scripts, solver configs, and weeks of hpc time. but now multimodal llms are starting to reshape cfd workflows. ( https://t.co/TPswyXI8HB)