Viraj

@tunedgradient

Followers 46 · Following 750 · Media 41 · Statuses 386

applied AI & intelligence. notes on how we think, and how our machines learn.

San Francisco
Joined August 2025
@tunedgradient
Viraj
9 days
FFmpeg still running on 25-year-old CPUs and even the Sega Dreamcast is such a flex. Solid C code never really dies.
@FFmpeg
FFmpeg
10 days
We support CPUs that are over 25 years old, no problem!
@tunedgradient
Viraj
3 hours
great feature: it stays fully conversational ofc. you just say ‘can you assign this bug and post a slack update?’ and under the hood, MCP fires small, scoped verbs (e.g. assign_ticket). it’s verb computing: actions as infra with chat as the ui.
@gdb
Greg Brockman
17 hours
mcp support in chatgpt:
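a minimal sketch of what one of those scoped verbs could look like as an MCP tool, assuming the FastMCP decorator API from the Python MCP SDK; the assign_ticket verb and its fields here are hypothetical:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticket-tools")

@mcp.tool()
def assign_ticket(ticket_id: str, assignee: str) -> str:
    """Assign a bug to a teammate (hypothetical verb; a real server
    would call the issue tracker's API here)."""
    return f"ticket {ticket_id} assigned to {assignee}"

if __name__ == "__main__":
    mcp.run()  # the chat client discovers and invokes the verb over MCP
```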
@tunedgradient
Viraj
8 hours
sft is like lego blocks, rl is like ikea furniture with missing screws. everyone keeps promising plug-n-play, but rl isn’t like sft. it’s not just 'load data, train model.' every env+algo pairing adds new quirks, new knobs to tune, new failure modes. the bright side: every…
@khoomeik
Rohan Pandey
15 hours
we’re approaching the end of 2025 and there’s still no plug-n-play RL lib. in the interim:
- i built a shitty version of this (llamagym)
- RL started working (o1)
- oss found out how it worked (r1)
- “RL env” became the new buzzword
- oss RL envs unified around `verifiers`
@tunedgradient
Viraj
12 hours
what i like about detailbench is that it flips the usual framing. it’s not 'can the model follow instructions' but 'can it notice when something’s just a bit off.' catching a wrong digit in the middle of a translation is a very different skill than writing fluent text. most llms…
@xeophon_
Xeophon
19 hours
(Re-)launching DetailBench! After a lot of feedback in the comments that LLMs should *always* notify about mistakes, I changed the scoring. Well, let's just say it didn't really help 🙃
[image]
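the 'wrong digit' case is easy to demo with a crude programmatic check (my sketch, not detailbench's actual scoring):

```python
import re

def numbers(text: str) -> list[str]:
    return re.findall(r"\d+(?:\.\d+)?", text)

def digits_preserved(source: str, translation: str) -> bool:
    # a faithful translation should carry every number through unchanged
    return sorted(numbers(source)) == sorted(numbers(translation))

src = "Revenue rose 17.5 percent in 2024."
bad = "Le chiffre d'affaires a augmenté de 175 pour cent en 2024."
print(digits_preserved(src, bad))  # False: the 17.5 -> 175 slip gets flagged
```

the benchmark's point is that the model itself should raise the flag mid-task, not rely on a post-hoc checker like this.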
@tunedgradient
Viraj
15 hours
sums up llm progress: basically evolutionary search. countless runs branch out, most dead-end, a few checkpoints survive and get refined. evals act as the selection pressure, turning random exploration into structured, compounding progress.
@DefenderOfBasic
Defender
20 hours
does this make sense?
[image]
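the analogy maps onto a toy selection loop (illustration only):

```python
import random

def evaluate(x: float) -> float:   # stand-in eval suite = selection pressure
    return -(x - 3.0) ** 2         # fitness peaks at x = 3

population = [random.uniform(-10, 10) for _ in range(16)]
for _ in range(50):
    survivors = sorted(population, key=evaluate, reverse=True)[:4]  # few checkpoints survive
    population = [s + random.gauss(0, 0.3) for s in survivors for _ in range(4)]  # branch out
print(f"best after selection: {max(population, key=evaluate):.2f}")  # lands near 3.0
```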
@tunedgradient
Viraj
16 hours
great opportunity to make an impact! building evals right is basically compounding leverage on the whole field.
@shyamalanadkat
shyamal
17 hours
i'm hiring for a new team @openai: Applied Evals. our goal is to build the world's best evals for the economically valuable tasks our customers care about most. we'll execute as a group of high‑taste engineers, combining hands-on, unscalable efforts with systems that others can…
@tunedgradient
Viraj
17 hours
not surprised. for non-math folks: the hodge conjecture is one of the clay millennium problems, about which geometric shapes can be described algebraically. and in this case, the grandiosity of the paper gives it away: nobody quietly knocks down a clay prize in bullet…
@littmath
Daniel Litt
20 hours
yet another nonsense LLM-aided paper about the Hodge conjecture on arXiv this morning :(
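for reference, the standard statement (background, not from the thread):

```latex
\textbf{Hodge conjecture.} Let $X$ be a smooth projective variety over $\mathbb{C}$.
Then every Hodge class
\[
  \alpha \in H^{2k}(X,\mathbb{Q}) \cap H^{k,k}(X)
\]
is a $\mathbb{Q}$-linear combination of cohomology classes of algebraic
subvarieties of $X$.
```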
@tunedgradient
Viraj
1 day
exploit prompt format bias, don’t mistake it for magic. llms just inherit a huge prior from oceans of html/xml, so they're good at <open>/<close> delimiters, tolerant of free-form text, and less brittle than json’s quotes/commas. so you may see better results. but if you need…
@mattshumer_
Matt Shumer
1 day
Literally just append this to the prompt. The results are incredible: “Before answering, <think> inside XML tags for at least 15 paragraphs.”
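the brittleness gap is easy to demo (toy sketch, nothing model-specific):

```python
import json, re

llm_output = "sure!\n<answer>\n42\n</answer>\nhope that helps"

# open/close tags survive the surrounding chatter; a regex pulls the payload
match = re.search(r"<answer>(.*?)</answer>", llm_output, re.S)
print(match.group(1).strip())  # -> 42

# the json equivalent dies on a single trailing comma
try:
    json.loads('{"answer": 42,}')
except json.JSONDecodeError as err:
    print("json parse failed:", err)
```

which is the tradeoff: tags degrade gracefully, json gives you schema guarantees only when it parses at all.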
@tunedgradient
Viraj
1 day
weird realization: ai folks are building h-nets (models that just invent their own tokens from raw bytes). physicists are chasing the muon g-2 with crazy precision. both sounded like they might change the game. instead, we mostly got… cleaner numbers. are we just maxing rigor…
@Dorialexander
Alexander Doria
2 days
Between h-net disappointment and mixed muon experiments, might not seem the best of times working on model design. Yet as someone primarily on the data side I’m increasingly convinced something is off.
@tunedgradient
Viraj
1 day
people talk about 'memorization vs understanding' like it’s a clean line. never was. humans memorize patterns & call it intuition. models compress patterns & we call it memorization. so maybe 'understanding' is just the name we give to useful compression that transfers.
@fchollet
François Chollet
1 day
A student who truly understands F=ma can solve more novel problems than a Transformer that has memorized every physics textbook ever written.
@tunedgradient
Viraj
1 day
there's still so much confusion nowadays around parameter count vs perf
@svpino
Santiago
2 days
This is a myopic answer that doesn't consider hardware or problem type. A 2B parameter model is both EXTREMELY fast when using the proper hardware and EXTREMELY accurate when used on the correct use case. There are no silver bullets. Good engineering is about considering the…
@tunedgradient
Viraj
2 days
great thread on evals. feels like we’ve reached consensus: you can prototype fast without them, but once you’re aiming for reliability (esp in sensitive domains like health or legal), they’re the difference between pilot hell and prod.
@HenrySG
Henry Scott-Green
2 days
Are evals required to build great AI applications? There’s been a ton of discussion on this recently, and I wanted to share my POV working on evals for startups and enterprises at @OpenAI 👇
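'having evals' can start this small: a fixed case set and a pass rate you re-run on every change (toy harness, cases invented):

```python
def system_under_test(prompt: str) -> str:
    return prompt.strip().lower()    # stand-in for the real model/pipeline

cases = [                            # versioned, fixed cases (made up here)
    ("  Refund Policy? ", "refund policy?"),
    ("HOURS TODAY", "hours today"),
]
passed = sum(system_under_test(q) == want for q, want in cases)
print(f"pass rate: {passed}/{len(cases)}")
```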
@tunedgradient
Viraj
2 days
what if the real bottleneck isn’t image data at all, but reasoning traces? if text-only 'thinking data' can boost generation, maybe the cheapest supervision ends up being the most powerful.
@tunedgradient
Viraj
2 days
7/ and in a flourish: gpt-4o generates multimodal visuals conditioned on flow fields, e.g. “diver with bubbles along streamlines.” not just pretty pictures, but tied to simulation data. ( https://t.co/TPswyXI8HB)
[image]
@tunedgradient
Viraj
2 days
6/ it also doubles as a tutor. don’t know what a cfl number is? gpt explains and suggests defaults inline, turning advanced cfd setup into an interactive lesson. ( https://t.co/TPswyXI8HB)
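for anyone else who didn't know: the cfl (courant) number gpt would be explaining is the standard stability ratio (textbook definition, not from the thread):

```latex
% u: local flow speed, \Delta t: timestep, \Delta x: cell size
C = \frac{u\,\Delta t}{\Delta x} \le C_{\max}
% explicit schemes typically need C_{\max} \approx 1 to stay stable
```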
@tunedgradient
Viraj
2 days
5/ post-sim, gpt can auto-generate scripts for “plot instantaneous cd over time” or “slice velocity at y=0.” it edits params, runs analysis, and visualizes results. ( https://t.co/TPswyXI8HB)
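the kind of script that request might expand into (mock-up with synthetic data, not cfdagent's actual output):

```python
import numpy as np
import matplotlib.pyplot as plt

# synthetic stand-in for solver output; a real run would load a forces file
t = np.linspace(0, 10, 500)
cd = 1.2 + 0.3 * np.sin(4 * t) * np.exp(-0.1 * t)

plt.plot(t, cd)
plt.xlabel("time (s)")
plt.ylabel("instantaneous $C_d$")
plt.title("drag coefficient over time")
plt.savefig("cd_over_time.png")
```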
@tunedgradient
Viraj
2 days
4/ three agents, each steered by gpt-4o (5 will be better!):
- preprocessing: builds 3d meshes from text/image (point-e pipeline)
- solver: asks for reynolds, cfl, timestep, writes config files
- postprocessing: scripts plots, drag/lift curves, streamlines, even photo-real…
[image]
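roughly the shape of that split (the skeleton is my guess at the orchestration, not the paper's code):

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    role_prompt: str  # what gpt-4o is steered with for this stage

    def run(self, task: str) -> str:
        # placeholder for the actual llm call
        return f"[{self.name}] would handle: {task}"

pipeline = [
    Agent("preprocessing", "build a 3d mesh from the user's text/image"),
    Agent("solver", "ask for reynolds, cfl, timestep; write config files"),
    Agent("postprocessing", "script plots, drag/lift curves, streamlines"),
]
for agent in pipeline:
    print(agent.run("flow past a sphere at Re=200"))
```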
@tunedgradient
Viraj
2 days
3/ in cfdagent, gpt is the planner, coder, and tutor that makes a classical ib solver usable from natural language. ( https://t.co/TPswyXI8HB)
[image]
@tunedgradient
Viraj
2 days
2/ what ai is changing is not the physics but everything around it. geometry generation, meshing, parameter setup, scripting, visualization. the glue work that used to be the bottleneck. ( https://t.co/TPswyXI8HB)
[image]
@tunedgradient
Viraj
2 days
1/ cfd has always been the domain of specialists with cad licenses, meshing scripts, solver configs, and weeks of hpc time. but now multimodal llms are starting to reshape cfd workflows. ( https://t.co/TPswyXI8HB)
[image]
@tunedgradient
Viraj
2 days
aug-sept so far has been an epic run on the codex-cli PR side tbh: reasoning summary upgrades and a big wave of fixes. dense cluster of improvements all landing at once, right at the inflection.
@embirico
Alexander Embiricos
2 days
📈