Sasha Rush

@srush_nlp

Followers
73K
Following
4K
Media
1K
Statuses
8K

Researcher at Cursor https://t.co/cZl0wTfqGz

New York, NY
Joined December 2015
@tanayj
Tanay Jaipuria
2 days
The Cursor launch post on Hacker News only had 14 votes and "I still can't figure out if this is just some sarcastic joke software" as the top comment. Incredible
56
56
2K
@cursor_ai
Cursor
3 days
Semantic search improves our agent's accuracy across all frontier models, especially in large codebases where grep alone falls short. Learn more about our results and how we trained an embedding model for retrieving code.
65
100
1K
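The tweet above contrasts grep-style lexical matching with embedding-based retrieval. A minimal sketch of the idea, using toy bag-of-words count vectors in place of a trained embedding model (all names and snippets here are illustrative, not Cursor's actual system):

```python
from collections import Counter
import math

def embed(text):
    # Toy "embedding": a bag-of-words count vector. A real system
    # would use a trained neural encoder instead.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_search(query, snippets):
    # Rank code snippets by vector similarity to the query.
    q = embed(query)
    return sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)

snippets = [
    "def parse_config(path): open the yaml file and return settings",
    "def retry_request(url): resend the http call on failure",
]
# grep needs an exact keyword; similarity search can surface the
# config snippet for a paraphrased query with only partial overlap.
top = semantic_search("load settings from a file", snippets)[0]
```

The point of the tweet is that in a large codebase the paraphrase gap grows: the words in a natural-language query rarely match identifiers literally, which is where a learned encoder beats grep.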
@jyangballin
John Yang
3 days
New eval! Code duels for LMs ⚔️ Current evals test LMs on *tasks*: "fix this bug," "write a test." But we code to achieve *goals*: maximize revenue, cut costs, win users. Meet CodeClash: LMs compete via their codebases across multi-round tournaments to achieve high-level goals
24
87
353
@Dan_Jeffries1
Daniel Jeffries
3 days
Finding composer fantastic at small targeted tasks but ALSO really, really strong as a researcher, because it is so fast and so well trained at reading code and docs intelligently! I've often found Codex struggling with an issue and I have it output the problem and then tell
@srush_nlp
Sasha Rush
10 days
Composer is a new model we built at Cursor. We used RL to train a big MoE model to be really good at real-world coding, and also very fast. https://t.co/DX9bbalx0B Excited for the potential of building specialized models to help in critical domains.
1
3
23
@littmath
Daniel Litt
4 days
But getting dressed in the morning, to which our civilization is now devoting so many of its marginal resources, is possessed of neither beauty nor symmetry. Indeed, putting one’s socks on before one’s shoes is quite different from putting on one’s shoes before one’s socks.
41
72
863
@srush_nlp
Sasha Rush
4 days
I think about this talk a lot. There was a time when people were bullish on "feed all the modalities to the LLM," but it didn't really pan out as I would have expected. The discrete / continuous divide remains an interesting challenge in deep learning.
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynotes: Luke Zettlemoyer Mixed-modal Language Modeling https://t.co/8FdhhrfOnG
11
18
223
@jxmnop
dr. jack morris
4 days
defending today 🥲
250
113
3K
@yoavgo
(((ل()(ل() 'yoav))))👾
4 days
Apparently there was already a discussion of this talk here after it was given live, but I missed it. So now I will: I do care about potential AI harms, but disagree with pretty much everything he says here! Will elaborate in the responses.
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynote: Nicholas Carlini Are LLMs worth it? https://t.co/p4PkmP8AZ8
3
4
24
@srush_nlp
Sasha Rush
4 days
There was a lot of twitter conversation about this talk when it was given. Here is the full version. https://t.co/pOvn1YYHHR
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynote: Nicholas Carlini Are LLMs worth it? https://t.co/p4PkmP8AZ8
3
17
166
@COLM_conf
Conference on Language Modeling
4 days
COLM Keynote: Shirley Ho Building a Polymathic Foundation Model for Science https://t.co/APJTkvYJdZ
1
5
18
@mehulmpt
Mehul Mohan
8 days
I found @cursor_ai's composer model extremely useful for many specific tasks. A very short feedback loop means I can prompt it 4-5 times, and it can instantly get to the code I wanted. Speed is really a 10x factor in AI
5
5
154
@syntacrobat
7 days
literally how do ML people survive without lifting tensor dimensions into the type system? isn't that like the number one thing you'd immediately want
73
37
857
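For readers outside the typed-languages world: the idea is to make dimension mismatches a compile-time (or at least call-time) error instead of a runtime `matmul` crash. Even in plain Python the spirit can be approximated with symbolic shape annotations checked at call time. A hypothetical sketch (real options include libraries like jaxtyping; the `Tensor` stand-in and decorator below are illustrative only):

```python
from dataclasses import dataclass

@dataclass
class Tensor:
    # Minimal stand-in for a real array; only the shape matters here.
    shape: tuple

def check_shapes(**declared):
    # Decorator enforcing symbolic shape annotations, with single
    # letters ("b", "d", "k") acting as dimension variables that must
    # bind consistently across all arguments.
    def wrap(fn):
        def inner(**kwargs):
            dims = {}
            for name, spec in declared.items():
                for sym, actual in zip(spec, kwargs[name].shape):
                    if dims.setdefault(sym, actual) != actual:
                        raise TypeError(
                            f"{name}: dim {sym}={actual}, expected {dims[sym]}")
            return fn(**kwargs)
        return inner
    return wrap

@check_shapes(x=("b", "d"), w=("d", "k"))
def linear(x, w):
    # (b, d) @ (d, k) -> (b, k); the decorator rejects a d mismatch.
    return Tensor((x.shape[0], w.shape[1]))

out = linear(x=Tensor((4, 8)), w=Tensor((8, 3)))  # shapes agree
```

Languages with dependent or refinement types can make the same check static; in Python it stays a runtime assertion, which is still far better than a cryptic broadcasting error three layers deep.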
@srush_nlp
Sasha Rush
7 days
Correct take.
@shaneguML
Shane Gu
8 days
Hot take: DAgger (Ross 2011) should be the first paper you read to get into RL, instead of Sutton's book. Maybe also read scheduled sampling (Bengio 2015). And before RL, study supervised learning thoroughly.
5
16
345
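For context on the recommendation above: DAgger's core loop is short enough to sketch in full, which is part of why it makes a good first RL-adjacent paper. A toy tabular version (the chain environment, expert, and memorizing learner here are illustrative, not from Ross et al.):

```python
import random

random.seed(0)
GOAL = 5                                  # toy chain MDP: states 0..9

def expert(s):
    # Expert policy: always step toward GOAL.
    return 1 if s < GOAL else -1

def train(dataset):
    # "Training" here is just memorizing the expert label per state.
    return dict(dataset)

def rollout(policy, steps=10):
    # Run the *learner's* policy and record the states it visits --
    # the key difference from plain behavior cloning, which only
    # ever trains on states the expert visits.
    s, visited = 0, []
    for _ in range(steps):
        visited.append(s)
        a = policy.get(s, random.choice([-1, 1]))  # unseen state: guess
        s = max(0, min(9, s + a))
    return visited

dataset, policy = [], {}
for _ in range(5):                        # DAgger iterations
    states = rollout(policy)
    dataset += [(s, expert(s)) for s in states]  # expert labels learner's states
    policy = train(dataset)               # retrain on the aggregate
```

The aggregate dataset covers exactly the distribution of states the learner actually reaches, which is the compounding-error problem scheduled sampling (Bengio 2015) attacks from the sequence-modeling side.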
@willccbb
will brown
7 days
ok composer-1 is pretty nuts, and the code it writes is quite nice. probably my new daily driver for many things. not quite as galaxy-brain as codex, but it's SO fast that you can use it sync instead of async, and very quickly iterate on fixes. follows instructions very well
27
20
541
@NoahEpstein_
Nozz
9 days
cursor just made every $200/month copilot subscription look like a scam. dropped today with their own coding model. what took 8 hours of manual coding now takes 30 seconds, and it runs 8 versions of itself in parallel to pick the best solution while github's charging $20/month
656
105
2K
@kentonmurray
Kenton Murray
8 days
David is an amazing PhD advisor and everyone should apply.
@davidweichiang
David Chiang
9 days
I am recruiting a PhD student to work with me, Peter Cholak, Anand Pillay, and Andy Yang @pentagonalize on transformers and logic/model theory (or related topics). If you are interested, please email me with "FLaNN" in the subject line!
0
4
22
@jdchawla29
jellybean ❄️
8 days
23
57
3K
@robertnishihara
Robert Nishihara
10 days
Cursor just released a frontier coding model with 4x faster generation. They will be speaking at Ray Summit about their journey building a frontier coding model.
- Training on 1000s of GPUs
- Scaling 100,000s of sandboxed coding environments
- Custom training infrastructure with
8
20
164
@ssahoo_
Subham Sahoo
8 days
Overwhelmed by the number of Diffusion LLM papers? 🌊 Same here 😭 So I’m starting a Discrete Diffusion Reading Group (@diffusion_llms) with my favorite disciples @jdeschena and @zhihanyang_ ✨ We’ll cover everything—from theory to empirics, from language to molecules. Join
20
40
318