Luyu Gao

@luyu_gao

Followers
2K
Following
275
Media
13
Statuses
159

Research Scientist @MistralAI. Work on Code Agents (Devstrals). PhD candidate (on leave) @CarnegieMellon @LTIatCMU.

Joined April 2020
@luyu_gao
Luyu Gao
3 years
[1/4] Introducing HyDE, a method for building dense retrievers without supervision. HyDE zero-shot instructs GPT to generate a fictional document and re-encodes it with Contriever to search in its embedding space. Put simply, it casts the retrieval-like behavior in GPT into real retrieval.
9
74
397
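A minimal sketch of the HyDE flow described in the [1/4] post above, under loud assumptions: `embed` is a toy bag-of-words stand-in for a dense encoder such as Contriever, and `generate_hypothetical_document` is a hypothetical stub that returns a canned passage in place of an instructed GPT call. Neither is the authors' implementation; the point is only that the search is driven by the embedding of the generated document rather than by the query itself.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a dense encoder: an L2-normalized bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()}


def cosine(a, b):
    """Dot product of two already-normalized sparse vectors."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


def generate_hypothetical_document(query):
    """Hypothetical stub for the LLM call: returns a canned, answer-like passage."""
    canned = {
        "how do bees communicate": (
            "bees communicate through the waggle dance which signals "
            "the direction and distance of flowers to the hive"
        ),
    }
    # Fall back to the raw query when no canned passage exists.
    return canned.get(query, query)


def hyde_search(query, corpus, k=1):
    """Embed the generated document, not the query, and rank the corpus by similarity."""
    hypothetical = generate_hypothetical_document(query)
    query_vector = embed(hypothetical)
    ranked = sorted(corpus, key=lambda doc: cosine(query_vector, embed(doc)), reverse=True)
    return ranked[:k]


corpus = [
    "the waggle dance tells the hive the direction and distance of flowers",
    "ants leave pheromone trails to guide the colony to food",
    "the stock market closed higher on friday",
]
query = "how do bees communicate"
top = hyde_search(query, corpus)
print(top)
```

In this toy setup the raw query shares no words with the relevant passage, so its similarity is zero, while the generated passage overlaps heavily with it. A real dense encoder would not fail this sharply on the raw query, so the contrast is exaggerated for illustration.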
@luyu_gao
Luyu Gao
13 days
RT @b_roziere: Excited to release Devstral Medium and a new version of Devstral Small! Devstral medium reaches 61.8% on SWE-bench verified…
0
2
0
@luyu_gao
Luyu Gao
23 days
previously i resisted copilot for months thinking it was too stochastic and never upgraded to cursor. but now i started using claude code right after it came out. lesson from the past: AI is going to power the world. using it is the only way to get an accurate mental world model.
0
0
13
@luyu_gao
Luyu Gao
1 month
Among the papers I wrote, GradCache is one of my favorites. Very glad to see it is still useful 🚀
@zach_nussbaum
Zach Nussbaum
2 months
We trained all of the Nomic Embed models on limited compute. One trick that helped us train SoTA embeddings on 16 H100s? GradCache, a gradient checkpointing-like technique tailored for contrastive learning. I kept forgetting how it works, so I dug into the math and wrote about it
1
1
26
@luyu_gao
Luyu Gao
2 months
We released Devstral, a powerful code agent foundation model. With an Apache 2.0 license, it is now the best open source model on SWE-bench. One of the fun projects I worked on this year.
@MistralAI
Mistral AI
2 months
Meet Devstral, our SOTA open model designed specifically for coding agents and developed with @allhands_ai.
2
1
25
@luyu_gao
Luyu Gao
9 months
RT @dchaplot: We just released two new models: Ministral 3B and Ministral 8B. It’s been only a year since the release of Mistral 7B, and yet…
0
26
0
@luyu_gao
Luyu Gao
1 year
RT @MinyangTian1: SciCode is our new benchmark that challenges LMs to code solutions for scientific problems from advanced papers. The chal…
0
62
0
@luyu_gao
Luyu Gao
1 year
RT @anubha_haha: Recent studies show program-aided prompting in LLMs improves reasoning tasks, but do they "know what they know"? 🤔 Our #NA…
0
5
0
@luyu_gao
Luyu Gao
1 year
RT @wellecks: Version II of the tutorial on neural theorem proving. Some new additions: - Train a model that gets 2…
0
26
0
@luyu_gao
Luyu Gao
1 year
[4/4] Really want to thank all JAX authors for building such a fun framework to play with! @jekbradbury @froystig @SingularMattrix @cdleary @jakevdp @DougalMaclaurin @apaszke @zhangqiaorjc.
0
0
6
@luyu_gao
Luyu Gao
1 year
[3/4] Personally, I've decided to switch to JAX due to its modern approach to parallelism, which can be automatic or semi-automatic. The JAX compiler takes care of many demanding tasks, such as managing the communication of activations and gradients.
1
0
6
@luyu_gao
Luyu Gao
1 year
[2/4] I wrote the tool in a minimalist style, focusing on introducing core JAX functionality and the convenient JAX ecosystem. I hope it can help people learn the fundamentals of model parallelism.
1
0
4
@luyu_gao
Luyu Gao
1 year
[1/4] So, I decided to seriously use JAX, and it didn't take long for me to realize its power. With just a couple hundred lines of code, you can do data and tensor parallelism on @huggingface transformers. I've created a toolkit to make this more accessible.
github.com
Supercharge huggingface transformers with model parallelism. - luyug/magix
5
17
134
@luyu_gao
Luyu Gao
2 years
Attending my first #NeurIPS conference. Excited to chat with people about retrieval, RAG, or any other LLM phenomena. #NeurIPS2023
0
1
34
@luyu_gao
Luyu Gao
2 years
RT @sivil_taram: 🇸🇬We will present the "Active Retrieval Augmented Generation" paper in the Poster Session 2, December 8th, 16:00 SGT. Feel…
0
1
0
@luyu_gao
Luyu Gao
2 years
RT @tengyuma: 📢 Introducing Voyage AI @Voyage_AI_! Founded by a talented team of leading AI researchers and me 🚀🚀 We build state-of-the-…
0
95
0
@luyu_gao
Luyu Gao
2 years
RT @swyx: Prompt2Model: Generating Deployable Models from Natural Language Instructions. Not just "write prompt ge…
0
72
0
@luyu_gao
Luyu Gao
2 years
RT @_akhaliq: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification. paper page: https://t.…
0
39
0
@luyu_gao
Luyu Gao
2 years
RT @omarsar0: Using GPT-4 Code Interpreter to Boost Mathematical Reasoning. This paper proposes a zero-shot prompting technique for GPT-4 C…
0
69
0
@luyu_gao
Luyu Gao
2 years
RT @arankomatsuzaki: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification. With GPT-4 Code…
0
58
0