Alex Wettig

@_awettig

Followers
2K
Following
2K
Media
23
Statuses
200

phd @Princeton / training agents @cursor_ai

Joined July 2022
@shaoruu
ian
3 days
composer 1 adding a pumpkin hat to my website with @cursor_ai in less than 2 minutes!
12
9
190
@SuproteemSarkar
Suproteem Sarkar
5 days
Who uses AI agents? How do agents impact output? How might agents change work patterns? New working paper studies usage + impacts of coding agents (1/n)
5
42
189
@niloofar_mire
Niloofar
4 days
I'm really excited about our new paper!! 📣 'Reinforcement Learning Improves Traversal of Hierarchical Knowledge in LLMs' Contrary to the belief that RL fine-tuning degrades memorized knowledge, RL-enhanced models consistently outperform base/SFT models on knowledge recall by 24pp! RL teaches
13
48
400
@jyangballin
John Yang
10 days
How CodeClash works: Two LMs enter a tournament. Each maintains its own codebase. Every round: 1. Edit phase: LMs modify their codebases however they like. 2. Competition phase: codebases battle in an arena. 3. Repeat. The LM that wins the majority of rounds is declared the winner.
1
1
36
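The tournament loop described in the tweet can be sketched in a few lines of Python. This is a toy sketch only: the agent, arena, and codebase representations below are invented stand-ins, not the actual CodeClash harness.

```python
def run_tournament(agent_a, agent_b, arena, n_rounds=5):
    """Sketch of the CodeClash loop: edit, compete, repeat."""
    wins = {"A": 0, "B": 0}
    codebase_a, codebase_b = {}, {}
    for _ in range(n_rounds):
        # Edit phase: each LM modifies its own codebase however it likes.
        codebase_a = agent_a(codebase_a)
        codebase_b = agent_b(codebase_b)
        # Competition phase: the codebases battle in an arena.
        wins[arena(codebase_a, codebase_b)] += 1
    # The LM that wins the majority of rounds is declared the winner.
    return "A" if wins["A"] > wins["B"] else "B"

# Toy stand-ins: each "edit" bumps a revision counter; the arena
# favors the higher revision (ties go to A).
agent = lambda cb: {**cb, "rev": cb.get("rev", 0) + 1}
arena = lambda a, b: "A" if a.get("rev", 0) >= b.get("rev", 0) else "B"
print(run_tournament(agent, agent, arena))  # → A (identical agents; ties favor A)
```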
@LDanilek
Lee Danilek
15 days
Made a joke app for when people ask questions that @cursor_ai can answer:
1
4
22
@_awettig
Alex Wettig
17 days
@cursor_ai
Cursor
17 days
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
2
4
118
@srush_nlp
Sasha Rush
17 days
Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. https://t.co/DX9bbalx0B Excited for the potential of building specialized models to help in critical domains.
56
76
794
@ericzakariasson
eric zakariasson
17 days
composer is back, and it's our first coding model trained in-house. try it out in cursor 2.0 with best-of-n, worktrees, and browser. so excited to get this out; the team has been working incredibly hard to make it happen. as always, curious to hear what you think!
@cursor_ai
Cursor
17 days
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
119
49
1K
@sea_snell
Charlie Snell
1 month
SSI swag should just be Ilya t-shirts
0
1
21
@oh_that_hat
Hattie Zhou
1 month
>be me >be Claude >have read the internet but one day human asks me to draw >no training, no practice, just converting mental image to mouse movements like a toddler holding a crayon >pencil tool not working? np, I'll draw with the eraser >task failed successfully
5
7
236
@cursor_ai
Cursor
2 months
Cursor can now control your browser. Agent can take screenshots, improve UI, and debug client issues. Try our early preview with Sonnet 4.5.
246
521
6K
@KLieret
Kilian Lieret
2 months
We evaluated Anthropic's Sonnet 4.5 with our minimal agent. New record on SWE-bench Verified: 70.6%! Same price per token as Sonnet 4, but it takes more steps, ending up more expensive. Cost analysis details & link to full trajectories in 🧵
4
14
85
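The cost observation follows from step count rather than token price: at the same per-token rate, more agent steps means more tokens and a higher total bill. A toy comparison (all numbers below are made up for illustration, not taken from the evaluation):

```python
def run_cost(price_per_mtok, tokens_per_step, steps):
    """Total cost in dollars for an agent run."""
    return price_per_mtok * tokens_per_step * steps / 1e6

# Same hypothetical $3 per million tokens for both models:
old = run_cost(price_per_mtok=3.0, tokens_per_step=20_000, steps=30)
new = run_cost(price_per_mtok=3.0, tokens_per_step=20_000, steps=45)
print(old, new)  # 1.5x the steps -> 1.5x the cost at the same rate
```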
@sea_snell
Charlie Snell
2 months
yolo run summer is over scaling laws fall has arrived
@Preet_Sojitra03
Preet Sojitra
9 months
@sea_snell Just one more
1
1
63
@stuart_sul
Stuart Sul
3 months
MoE layers can be really slow. When training our coding models @cursor_ai, they ate up 27–53% of training time. So we completely rebuilt them at the kernel level and transitioned to MXFP8. The result: a 3.5x faster MoE layer and a 1.5x end-to-end training speedup. We believe our
29
105
882
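The 1.5x end-to-end figure is consistent with Amdahl's law: if a fraction f of training time is spent in MoE layers and that part becomes s times faster, the overall speedup is 1 / ((1 - f) + f / s). A quick sanity check with the numbers from the tweet:

```python
def end_to_end_speedup(f, s):
    """Amdahl's law: overall speedup when a fraction f of the work gets s times faster."""
    return 1.0 / ((1.0 - f) + f / s)

# MoE layers at ~50% of training time, made 3.5x faster at the kernel level:
print(round(end_to_end_speedup(0.50, 3.5), 2))  # → 1.56, close to the reported 1.5x
# At the low end of the quoted 27-53% range the gain is smaller:
print(round(end_to_end_speedup(0.27, 3.5), 2))
```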
@_awettig
Alex Wettig
4 months
Presenting two posters at ICML over the next two days:
- Both at 11am–1:30pm
- Both about how to improve pre-training with domains
- Both at stall #E-2600 in East Exhibition Hall A-B (!)
Tomorrow: WebOrganizer w/ @soldni & @kylelostat
Thursday: MeCo by @gaotianyu1350
1
11
51
@_albertgu
Albert Gu
4 months
Tokenization is just a special case of "chunking" - building low-level data into high-level abstractions - which is in turn fundamental to intelligence. Our new architecture, which enables hierarchical *dynamic chunking*, is not only tokenizer-free, but simply scales better.
@sukjun_hwang
Sukjun (June) Hwang
4 months
Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
61
197
1K
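As a rough intuition for dynamic chunking, the sketch below groups low-level units into chunks wherever a boundary score fires. This is a toy illustration only: H-Net learns its boundaries end-to-end inside the model, whereas here a hand-written score stands in for the learned boundary predictor.

```python
def dynamic_chunk(units, boundary_score, threshold=0.5):
    """Group low-level units into chunks wherever the boundary score fires."""
    chunks, current = [], []
    for u in units:
        current.append(u)
        if boundary_score(u) > threshold:  # learned in H-Net; hand-written here
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)  # trailing units form the final chunk
    return chunks

# Toy score: treat whitespace as a chunk boundary.
score = lambda ch: 1.0 if ch == " " else 0.0
print(dynamic_chunk(list("end to end"), score))
```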
@AnthropicAI
Anthropic
5 months
Anthropic staff realized they could ask Claude to buy things that weren't just food & drink. After someone randomly decided to ask it to order a tungsten cube, Claude ended up with an inventory full of (as it put it) "specialty metal items" that it ended up selling at a loss.
64
210
4K
@_awettig
Alex Wettig
5 months
New paper cutting through the thicket of KV cache eviction methods!
@AdithyaNLP
Adithya Bhaskar
5 months
There are many KV cache-reduction methods, but a fair comparison is challenging. We propose a new unified metric called "critical KV footprint". We compare existing methods and propose a new one, PruLong, which "prunes" certain attention heads to only look at local tokens. 1/7
0
1
17
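The "only look at local tokens" idea can be pictured as a sliding-window attention mask. A minimal sketch; PruLong's actual criterion for choosing which heads to prune is in the paper and not reproduced here.

```python
def local_attention_mask(seq_len, window):
    """mask[i][j] is True iff query i may attend to key j: causal, last `window` keys only."""
    return [[(j <= i) and (j > i - window) for j in range(seq_len)]
            for i in range(seq_len)]

# A pruned head with window=3 only ever needs the 3 most recent KV entries,
# so its contribution to the KV footprint stays constant in sequence length.
mask = local_attention_mask(6, 3)
print(sum(mask[5]))  # → 3 (position 5 sees keys 3, 4, 5)
```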
@a1zhang
Alex L Zhang
6 months
Can GPT, Claude, and Gemini play video games like Zelda, Civ, and Doom II? VideoGameBench evaluates VLMs on Game Boy & MS-DOS games given only raw screen input, just like how a human would play. The best model (Gemini) completes just 0.48% of the benchmark! 🧵👇
23
78
560
@amanrsanger
Aman Sanger
6 months
Claude Sonnet 4 is much better at codebase understanding. Paired with recent improvements in Cursor, it's SOTA on large codebases
32
43
862