
Luyu Gao
@luyu_gao
Followers: 2K
Following: 274
Media: 13
Statuses: 159
Research Scientist @MistralAI. Work on Code Agents (Devstrals). PhD candidate (on leave) @CarnegieMellon @LTIatCMU.
Joined April 2020
RT @b_roziere: Excited to release Devstral Medium and a new version of Devstral Small! Devstral Medium reaches 61.8% on SWE-bench Verified….
0
2
0
Among the papers I wrote, GradCache is one of my favorites. Very glad to see it still being useful 🚀.
We trained all of the Nomic Embed models on limited compute. One trick that helped us train SoTA embeddings on 16 H100s? GradCache, a gradient checkpointing-like technique tailored for contrastive learning. I kept forgetting how it works, so I dug into the math and wrote about it
1
1
26
We released Devstral, a powerful code agent foundation model. With an Apache 2.0 license, it is now the best open source model on SWE-bench. One of the fun projects I worked on this year.
Meet Devstral, our SOTA open model designed specifically for coding agents and developed with @allhands_ai .
2
1
25
RT @MinyangTian1: SciCode is our new benchmark that challenges LMs to code solutions for scientific problems from advanced papers. The chal….
0
62
0
RT @anubha_haha: Recent studies show program-aided prompting in LLMs improves reasoning tasks, but do they "know what they know"? 🤔 Our #NA….
0
5
0
[4/4] Really want to thank all JAX authors for building such a fun framework to play with! @jekbradbury @froystig @SingularMattrix @cdleary @jakevdp @DougalMaclaurin @apaszke @zhangqiaorjc.
0
0
6
[1/4] So, I decided to seriously use JAX, and it didn't take long for me to realize its power. With just a couple hundred lines of code, you can do data&tensor parallelism on @huggingface transformers. I've created a toolkit to make this more accessible.
5
17
134
Attending my first #NeurIPS conference. Excited to chat with people about retrieval, RAG, or just any other LLM phenomena. #NeurIPS2023.
0
1
34
RT @sivil_taram: 🇸🇬We will present the "Active Retrieval Augmented Generation" paper in the Poster Session 2, December 8th, 16:00 SGT. Feel….
0
1
0
RT @tengyuma: 📢 Introducing Voyage AI @Voyage_AI_! Founded by a talented team of leading AI researchers and me 🚀🚀. We build state-of-the-….
0
95
0
RT @_akhaliq: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification. paper page: https://t.….
0
39
0
RT @arankomatsuzaki: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification. With GPT-4 Code….
0
58
0