
Tony Ginart
@tginart
Followers: 280 · Following: 3K · Media: 7 · Statuses: 788
AI Hacker. Scientist @SFResearch. Alum: @YCombinator @StanfordAILab.
Joined March 2020
Why does gpt-5 feel so much smarter than other models in cursor even though they’re so close on benchmarks? The difference is just really stark at this point
I’ll make the case that if OpenAI is committed to making sure AGI benefits all of humanity, they should IPO immediately. Investing in AI should be democratized, and returns shouldn’t go to a small number of private investors.
the idea of an ‘IPO’ as an exit is a quaint legacy. private frontier AI company shares are now trading with more frequency and liquidity than many publicly traded companies
For product, stick with frontier to build an MVP, but optimizing with open source is an option down the road. There’s also space for a startup to do the unglamorous work of making open models usable. This wouldn’t require that much compute, but it does require data and taste.
2️⃣ the polish and ergonomics around the open source models are significantly inferior — more so than the topline capability gap would suggest. Open source models are significantly more jagged and have more rough edges. So what…
This is a fascinating trend I’ve been tracking closely as someone sitting between AI training and AI product. 1️⃣ yes, in terms of core capabilities, open weights has stayed ~6 months behind frontier, and the gap seems to be slowly shrinking BUT…
A ton of attention over the years goes to plots comparing open to closed models. The real trend that matters for AI’s impact on society is the gap between closed frontier models and local consumer models. Local models passing major milestones will have major repercussions.
Einstein’s relativity paper (1905) and Satoshi’s Bitcoin paper (2008) are two of a kind. Short, axiomatic, self-contained. Accessible yet rigorous. Resolute — as if carved in stone. Brilliantly simple, carrying the inevitability of sunrise.
Tried hooking up LLMs to DF earlier this year as a weekend project, but it was tougher than expected… anyone got an open source DF clone that runs in a bash shell?
Tarn Adams’s masterpiece “Dwarf Fortress” is possibly the best procedural generation ever applied to video games, to the point where the programmatic engravings the dwarves carve into the stones detailing the world’s history can actually be emotionally moving
Yes, GRPO basically just means some kind of policy gradient method using some kind of group relative normalization around rewards.
when people say they're doing GRPO they don't mean they're doing *literal* GRPO as it was originally formulated. more of a vibe thing. it's like when people say they're doing SGD but they really mean they're doing AdamW
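As a minimal sketch of the “group relative normalization” idea above — the function name and the four-rollout rewards are illustrative, not from the tweet:

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards):
    """Normalize rewards within a group of rollouts sampled from the
    same prompt: subtract the group mean, divide by the group std.
    This is the 'group relative' part of GRPO-style methods."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # small epsilon guards against a zero-variance group
    return [(r - mu) / (sigma + 1e-8) for r in rewards]

# Four rollouts for one prompt, with hypothetical scalar rewards.
adv = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

These per-rollout advantages then weight an ordinary policy-gradient update; everything else (clipping, KL terms, optimizer) is where implementations diverge from the original formulation.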
Oh wow this brings back some ptsd
Yes the models will be working autonomously for hours or days in a year or two. Yes they will still get helplessly confused and require oversight. Yes it will be weird.
So it turns out that for function calling, audio pipelined systems work better than end-2-end omni models (for now) — and both incur a performance degradation relative to pure text. Had a lot of fun working on this with @HuanzhiMao
BFCL Audio: A Benchmark for Audio-Native Function Calling 🎙️ Function calling benchmarks focus exclusively on text, but voice interfaces are critical for enterprise call centers and customer support where precision matters. BFCL Audio evaluates models on conversational speech
Current RL paradigm won’t fix ai reliability because RL doesn’t meaningfully improve fluid intelligence. It just hill-climbs crystallized intelligence in narrow domains. Current RL paradigm will just make the models increasingly jagged. I still think we need something new.
GPT-4.1 is seriously underrated in Cursor. (1) so much faster than thinking models (2) follows instructions, doesn’t make rogue changes (3) basically on par with top models for small and medium sized tasks
there are theorems that are true but can’t be proved, and there are theorems that are arbitrarily hard to prove. So we’ll always have a frontier to push on, no matter how good AI gets! Mathematics will be ok I think. A lot more computer-aided proofs in the coming years tho
the openai IMO news hit me pretty heavy this weekend i'm still in the acute phase of the impact, i think i consider myself a professional mathematician (a characterization some actual professional mathematicians might take issue with, but my party my rules) and i don't think i
btw not a dunk on cursor. this is rare. i happily use it every day.
Degenerate repetitions in the wild on @cursor_ai! This is why we need the LZ Penalty: https://t.co/V8Bktb5je8
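A toy way to spot the kind of degenerate repetition shown above is a longest-repeated-suffix check — note this is only an illustration of detecting loops, not the actual LZ Penalty from the linked paper:

```python
def longest_repeated_suffix(tokens):
    """Length of the longest suffix of `tokens` that also occurs
    earlier in the sequence. A long match suggests the decoder has
    fallen into a degenerate repetition loop."""
    n = len(tokens)
    best = 0
    for length in range(1, n):
        suffix = tokens[n - length:]
        window = tokens[: n - 1]  # everything before the final token
        found = any(
            window[start : start + length] == suffix
            for start in range(len(window) - length + 1)
        )
        if not found:
            break  # a longer suffix can't match if this one didn't
        best = length
    return best
```

An LZ-style decoding penalty would go further and discourage sampling the token that extends such a match, rather than merely measuring it after the fact.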