Vivek Verma
@vcubingx
Followers
13K
Following
3K
Media
71
Statuses
310
math youtuber & researcher @openai
Joined January 2017
New video! The attention mechanism is well known for its use in Transformers. But where does it come from? Its origins lie in fixing a strange problem of RNNs. Watch the video to learn about it! https://t.co/W6nqLH859P
4
67
527
@nelvOfficial ignore the gpt-5 name, o1/o3 were undeniably gpt-5 level and it just took us time to have confidence to bump the name.
12
7
210
Benchmarking model intelligence, particularly models' ability to generalize robustly across diverse stateful and long-horizon tasks, was the focus of our new paper: Measuring General Intelligence with Generated Games.
2
1
6
Unfortunately this means that the videos will have to take a back seat in the meanwhile. Math content creation is still a huge passion of mine, so I’ll be trying my best to push out content as time permits 🫡
2
0
45
Life update: I’ve taken up a job as a researcher on the post-training team at @openai, working on reinforcement learning, function calling and other efforts! I’ve also graduated from @UCBerkeley where I’m grateful for a wonderful four years of learning and fun 😊
52
17
2K
Generation of verifiable gyms is here. New hill to climb.
[LG] Measuring General Intelligence with Generated Games V Verma, D Huang, W Chen, D Klein... [UC Berkeley] (2025) https://t.co/GN7LZWj2LD
4
8
119
The long-term goal of AI is to build models that can handle arbitrary tasks, not just ones they’ve been trained on. We hope our new *benchmark generator* can help measure progress toward this vision
🎮 Excited to announce gg-bench, a fully synthetic benchmark for LLMs consisting of games generated entirely by LLMs!! This benchmark centers around the fact that LLMs are capable of generating complex tasks that they themselves cannot even solve. 📄: https://t.co/kddoCgDkvd
4
30
182
This is joint work with @davidhuang33176, William Chen, Dan Klein and @NickATomlin, who have been fantastic collaborators in this project. Please do support and follow them!
0
0
4
We release the generated games, data generation process, and evaluation code in order to support future modeling work and expansion of our benchmark. 💻Check it out and give it a ⭐️at https://t.co/XB7lCsQrAD!
github.com
Measuring General Intelligence With Generated Games (Preprint) - vivek3141/gg-bench
1
0
7
gg-bench is challenging: state-of-the-art LLMs such as GPT-4o and Claude 3.7 Sonnet achieve winrates of 7-9% on gg-bench using in-context learning, while reasoning models such as o1, o3-mini and DeepSeek-R1 achieve average winrates of 31-36%.
1
0
4
gg-bench is created by (1) generating natural language descriptions of novel games, (2) generating implementations of each game in code as a Gym environment, and (3) training RL agents via self-play on the generated games. We measure the average winrate across all generated games.
1
0
5
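To make the three stages concrete, here's a minimal self-contained Python sketch of the evaluation shape. In gg-bench the game description and environment code are LLM-generated and the opponent is trained with RL via self-play; here a hand-written subtraction game and random policies stand in for both, and every name below is illustrative rather than the actual gg-bench API.

```python
import random

class SubtractionGame:
    """Gym-style two-player game: players alternately take 1-3 tokens
    from a pile of 21; whoever takes the last token wins."""

    def reset(self):
        self.tokens = 21
        self.current_player = 0
        return self.tokens

    def step(self, action):
        assert action in (1, 2, 3) and action <= self.tokens
        self.tokens -= action
        done = self.tokens == 0
        reward = 1.0 if done else 0.0  # +1 to the player who just moved
        mover = self.current_player
        self.current_player = 1 - self.current_player
        return self.tokens, reward, done, {"mover": mover}

def random_policy(obs):
    return random.choice([a for a in (1, 2, 3) if a <= obs])

def winrate(env, agent, opponent, episodes=1000):
    """Average winrate of `agent` (player 0) -- the per-game metric."""
    wins = 0
    for _ in range(episodes):
        obs = env.reset()
        done = False
        while not done:
            policy = agent if env.current_player == 0 else opponent
            obs, reward, done, info = env.step(policy(obs))
        wins += int(info["mover"] == 0)  # player 0 made the winning move
    return wins / episodes

print(winrate(SubtractionGame(), random_policy, random_policy))
```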
We believe the future of benchmarks is not static lists of questions but data-generating processes, such that individual task instances can be regenerated at will. As such, (1) contaminated data points can be regenerated and (2) tasks get more difficult as LLMs get better.
1
0
5
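As a toy illustration of that idea (mine, not from the paper): a benchmark defined as a seeded generator rather than a fixed question list, so any individual instance can be resampled on demand. The arithmetic task here is just a stand-in for generated games.

```python
import random

def generate_task(seed):
    """One benchmark instance, deterministically derived from a seed."""
    rng = random.Random(seed)
    a, b = rng.randint(100, 999), rng.randint(100, 999)
    return {"prompt": f"What is {a} * {b}?", "answer": a * b}

# The benchmark is the generator plus a seed list; replacing a seed
# regenerates that data point (e.g. if it leaked into training data)
# without touching the rest of the suite.
suite = [generate_task(seed) for seed in range(10)]
print(suite[0])
```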
🎮 Excited to announce gg-bench, a fully synthetic benchmark for LLMs consisting of games generated entirely by LLMs!! This benchmark centers around the fact that LLMs are capable of generating complex tasks that they themselves cannot even solve. 📄: https://t.co/kddoCgDkvd
3
25
147
I just put up a new video, which was a collaboration with Terence Tao about the cosmic distance ladder. You can find the full video on YouTube, and here's a bit of extra footage that didn't make it into the final cut.
91
610
6K
This is honestly so incredibly tragic. I didn't know Suchir personally - but he was someone I looked up to a lot and viewed as a role model. I'm extremely sad to hear this news, and I can't imagine the pain his family is going through. Rest in peace 🙏
OpenAI whistleblower Suchir Balaji, who accused the company of breaking copyright law, found dead in apparent suicide
5
3
40
Likewise, if I'm trying to model language, I want to structure my model in a way that satisfies its properties. That way, it's able to generalize beyond the training data better. Do check out the three-part language modeling series!
1
1
10
Sometimes, getting data is really hard. But, if I knew beforehand that I was dealing with a pendulum, then I'd probably choose from a set of periodic functions when modeling its position.
1
0
4
But, if I add more points, it's pretty apparent I'm trying to model some periodic function.
1
0
2
If I have a couple data points telling me the pendulum's x position over time, then there are plenty of functions that "look" like they fit the data. These functions perfectly model the data I have, but are way off for points in-between.
1
0
1
A small tidbit I cut out of my recent series on language modeling on why we need "simpler" models. Let's say I'm trying to model the behavior of a pendulum, which for small angles, looks like a sine wave.
1
1
34
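The whole pendulum thread fits in a few lines of numpy. This sketch (mine, not from the video) fits the same six samples of sin(t) two ways, a degree-5 polynomial and a sin/cos basis with the frequency assumed known, and measures how far each model drifts at the points in between.

```python
import numpy as np

t_train = np.linspace(0, 10, 6)    # a handful of measurements
x_train = np.sin(t_train)          # pendulum displacement (small-angle regime)
t_test = np.linspace(0, 10, 200)   # the points "in between"

# Flexible model: a degree-5 polynomial passes through all 6 samples exactly...
poly = np.polyfit(t_train, x_train, deg=5)
poly_err = np.max(np.abs(np.polyval(poly, t_test) - np.sin(t_test)))

# ...while the structured model is least squares over {sin t, cos t}:
# just two parameters, chosen because we know we're modeling a pendulum.
basis = np.column_stack([np.sin(t_train), np.cos(t_train)])
coef, *_ = np.linalg.lstsq(basis, x_train, rcond=None)
sine_pred = coef[0] * np.sin(t_test) + coef[1] * np.cos(t_test)
sine_err = np.max(np.abs(sine_pred - np.sin(t_test)))

# Both fit the training points; only the sine model generalizes between them.
print(f"max error between samples: poly {poly_err:.2f}, sine {sine_err:.2e}")
```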
Well, waddya know -- looks like a NEW VIDEO! https://t.co/sAgMVYwTTj
A few years ago, I learned a theorem called "Riemann's Existence Theorem". It literally took my breath away. It was so shocking and unexpected -- drawing a bridge between two distant continents of math. I knew in that moment that I had to make a video about it. But as I
3
41
331