Sasha Rush

@srush_nlp

Followers
73K
Following
4K
Media
1K
Statuses
8K

Researcher at Cursor https://t.co/cZl0wTfqGz

New York, NY
Joined December 2015
@tanayj
Tanay Jaipuria
2 days
The Cursor launch post on Hacker News only had 14 votes and "I still can't figure out if this is just some sarcastic joke software" as the top comment. Incredible
56
56
2K
@cursor_ai
Cursor
3 days
Semantic search improves our agent's accuracy across all frontier models, especially in large codebases where grep alone falls short. Learn more about our results and how we trained an embedding model for retrieving code.
65
100
1K
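The tweet above contrasts grep-style lexical matching with embedding-based retrieval. A minimal sketch of the idea, using toy bag-of-words count vectors in place of a trained embedding model (all names and snippets here are illustrative, not Cursor's actual system):

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": a bag-of-words count vector. A real system
    # would use a trained neural encoder instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query, snippets):
    # Rank code snippets by vector similarity to the query.
    q = embed(query)
    return sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)

snippets = [
    "def parse_config(path): open the yaml file and return settings",
    "def retry_request(url): resend the http call on failure",
]
# grep needs an exact keyword; similarity search can surface the
# config snippet for a paraphrased query with only partial overlap.
top = semantic_search("load settings from a file", snippets)[0]
```

The point of the tweet is that in a large codebase the paraphrase gap grows: the words in a natural-language query rarely match identifiers literally, which is where a learned encoder beats grep.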
@jyangballin
John Yang
3 days
New eval! Code duels for LMs ⚔️ Current evals test LMs on *tasks*: "fix this bug," "write a test." But we code to achieve *goals*: maximize revenue, cut costs, win users. Meet CodeClash: LMs compete via their codebases across multi-round tournaments to achieve high-level goals
24
87
353
@Dan_Jeffries1
Daniel Jeffries
3 days
Finding composer fantastic at small targeted tasks but ALSO really, really strong as a researcher, because it is so fast and so well trained at reading code and docs intelligently! I've often found Codex struggling with an issue and I have it output the problem and then tell
@srush_nlp
Sasha Rush
10 days
Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. https://t.co/DX9bbalx0B Excited for the potential of building specialized models to help in critical domains.
1
3
23
@littmath
Daniel Litt
4 days
But getting dressed in the morning, to which our civilization is now devoting so many of its marginal resources, is possessed of neither beauty nor symmetry. Indeed, putting one’s socks on before one’s shoes is quite different from putting on one’s shoes before one’s socks.
41
72
863
@srush_nlp
Sasha Rush
4 days
I think about this talk a lot. There was a time when people were bullish on "feed all the modalities to the LLM," but it didn't really pan out as I would have expected. The discrete / continuous divide remains an interesting challenge in deep learning.
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynotes: Luke Zettlemoyer Mixed-modal Language Modeling https://t.co/8FdhhrfOnG
11
18
223
@jxmnop
dr. jack morris
4 days
defending today 🥲
250
113
3K
@yoavgo
(((ل()(ل() 'yoav))))👾
4 days
Apparently there was already a discussion of this talk here after it was given live, but I missed it. So now I will: I do care about potential AI harms, but disagree with pretty much everything he says here! Will elaborate in the responses.
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynote: Nicholas Carlini Are LLMs worth it? https://t.co/p4PkmP8AZ8
3
4
24
@srush_nlp
Sasha Rush
4 days
There was a lot of twitter conversation about this talk when it was given. Here is the full version. https://t.co/pOvn1YYHHR
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynote: Nicholas Carlini Are LLMs worth it? https://t.co/p4PkmP8AZ8
3
17
166
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynote: Shirley Ho Building a Polymathic Foundation Model for Science https://t.co/APJTkvYJdZ
1
5
18
@mehulmpt
Mehul Mohan
8 days
I found @cursor_ai's composer model extremely useful for many specific tasks. A very short feedback loop means I can prompt it 4-5 times, and it can instantly get to the code I wanted. Speed is really a 10x factor in AI
5
5
154
@syntacrobat
7 days
literally how do ML people survive without lifting tensor dimensions into the type system? isn't that like the number one thing you'd immediately want
73
37
857
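For readers outside the typed-languages world: the idea is to make dimension mismatches a compile-time (or at least call-time) error instead of a runtime `matmul` crash. Even in plain Python the spirit can be approximated with symbolic shape annotations checked at call time. A hypothetical sketch (real options include libraries like jaxtyping; the `Tensor` stand-in and decorator below are illustrative only):

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    # Minimal stand-in for a real array; only the shape matters here.
    shape: tuple

def check_shapes(**declared):
    # Decorator enforcing symbolic shape annotations, with single
    # letters ("b", "d", "k") acting as dimension variables that must
    # bind consistently across all arguments.
    def wrap(fn):
        def inner(**kwargs):
            dims = {}
            for name, spec in declared.items():
                for sym, actual in zip(spec, kwargs[name].shape):
                    if dims.setdefault(sym, actual) != actual:
                        raise TypeError(
                            f"{name}: dim {sym}={actual}, expected {dims[sym]}")
            return fn(**kwargs)
        return inner
    return wrap

@check_shapes(x=("b", "d"), w=("d", "k"))
def linear(x, w):
    # (b, d) @ (d, k) -> (b, k); the decorator rejects a d mismatch.
    return Tensor((x.shape[0], w.shape[1]))

out = linear(x=Tensor((4, 8)), w=Tensor((8, 3)))  # shapes agree
```

Languages with dependent or refinement types can make the same check static; in Python it stays a runtime assertion, which is still far better than a cryptic broadcasting error three layers deep.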
@srush_nlp
Sasha Rush
7 days
Correct take.
@shaneguML
Shane Gu
8 days
Hot take: DAgger (Ross 2011) should be the first paper you read to get into RL, instead of Sutton's book. Maybe also read scheduled sampling (Bengio 2015). And before RL, study supervised learning thoroughly.
5
16
345
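For context on the recommendation above: DAgger's core loop is short enough to sketch in full, which is part of why it makes a good first RL-adjacent paper. A toy tabular version (the chain environment, expert, and memorizing learner here are illustrative, not from Ross et al.):

```python
import random

random.seed(0)
GOAL = 5                                  # toy chain MDP: states 0..9

def expert(s):
    # Expert policy: always step toward GOAL.
    return 1 if s < GOAL else -1

def train(dataset):
    # "Training" here is just memorizing the expert label per state.
    return dict(dataset)

def rollout(policy, steps=10):
    # Run the *learner's* policy and record the states it visits --
    # the key difference from plain behavior cloning, which only
    # ever trains on states the expert visits.
    s, visited = 0, []
    for _ in range(steps):
        visited.append(s)
        a = policy.get(s, random.choice([-1, 1]))  # unseen state: guess
        s = max(0, min(9, s + a))
    return visited

dataset, policy = [], {}
for _ in range(5):                        # DAgger iterations
    states = rollout(policy)
    dataset += [(s, expert(s)) for s in states]  # expert labels learner's states
    policy = train(dataset)               # retrain on the aggregate
```

The aggregate dataset covers exactly the distribution of states the learner actually reaches, which is the compounding-error problem scheduled sampling (Bengio 2015) attacks from the sequence-modeling side.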
@willccbb
will brown
7 days
ok composer-1 is pretty nuts, and the code it writes is quite nice. probably my new daily driver for many things. not quite as galaxy-brain as codex, but it's SO fast that you can use it sync instead of async, and very quickly iterate on fixes. follows instructions very well
27
20
541
@NoahEpstein_
Nozz
9 days
cursor just made every $200/month copilot subscription look like a scam. dropped today with their own coding model. what took 8 hours of manual coding now takes 30 seconds, and it runs 8 versions of itself in parallel to pick the best solution while github's charging $20/month
656
105
2K
@kentonmurray
Kenton Murray
8 days
David is an amazing PhD advisor and everyone should apply.
@davidweichiang
David Chiang
9 days
I am recruiting a PhD student to work with me, Peter Cholak, Anand Pillay, and Andy Yang @pentagonalize on transformers and logic/model theory (or related topics). If you are interested, please email me with "FLaNN" in the subject line!
0
4
22
@jdchawla29
jellybean ❄️
8 days
23
57
3K
@robertnishihara
Robert Nishihara
10 days
Cursor just released a frontier coding model with 4x faster generation. They will be speaking at Ray Summit about their journey building a frontier coding model.
- Training on 1000s of GPUs
- Scaling 100,000s of sandboxed coding environments
- Custom training infrastructure with
8
20
164
@ssahoo_
Subham Sahoo
8 days
Overwhelmed by the number of Diffusion LLM papers? 🌊 Same here 😭 So I’m starting a Discrete Diffusion Reading Group (@diffusion_llms) with my favorite disciples @jdeschena and @zhihanyang_ ✨ We’ll cover everything—from theory to empirics, from language to molecules. Join
20
40
318