Luyu Gao @luyu_gao X Profile

Luyu Gao

@luyu_gao

Followers

2K

Following

275

Media

13

Statuses

159

Research Scientist @MistralAI Work on Code Agents (Devstrals) PhD candidate (on leave) @CarnegieMellon @LTIatCMU

Joined April 2020

Luyu Gao

@luyu_gao

3 years

[1/4] Introducing HyDE, a method for building dense retrievers without supervision. HyDE zero-shot instructs GPT to generate a fictional document and re-encodes it with Contriever to search in its embedding space. Put simply, it casts the retrieval-like behavior in GPT into real retrieval.

9

74

397
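The pipeline described in the post above (generate a fictional document, encode it, search the corpus embedding space) can be sketched roughly as follows. This is only a toy illustration under loud assumptions: `generate_hypothetical_document` is a hypothetical stub standing in for the GPT call, `encode` is a normalized bag-of-words stand-in for Contriever, and the corpus, query, and canned answer are invented for the example. It is not the authors' implementation.

```python
import math
import re
from collections import Counter
from typing import Dict, List

QUERY = "how long does it take to remove a wisdom tooth"

# Toy corpus, invented for illustration.
CORPUS = [
    "Wisdom teeth are usually removed in a short outpatient surgery under local anesthesia.",
    "Contrastive learning trains encoders by pulling positive pairs together.",
    "Paris is the capital of France.",
]


def generate_hypothetical_document(query: str) -> str:
    """Hypothetical stub for the instruction-following LLM.

    A real system would prompt a model to write a passage answering the
    query; here a canned (possibly inaccurate) passage is returned.
    """
    canned = {
        QUERY: (
            "Removing a wisdom tooth usually takes between a few minutes and "
            "20 minutes in an outpatient surgery under local anesthesia."
        )
    }
    return canned.get(query, query)


def encode(text: str) -> Dict[str, float]:
    """Stand-in encoder (assumption): L2-normalized bag-of-words vector."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Inner product of two sparse vectors."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def hyde_search(query: str, corpus: List[str], top_k: int = 1) -> List[str]:
    """Search with the encoding of a generated document instead of the query."""
    hypothetical = generate_hypothetical_document(query)
    query_vector = encode(hypothetical)
    ranked = sorted(
        corpus,
        key=lambda doc: similarity(query_vector, encode(doc)),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(hyde_search(QUERY, CORPUS))
```

The point of the sketch is only the data flow: the query never gets encoded directly, and the generated passage does not need to be factually correct, since it is used purely as a search key in the document embedding space.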

Luyu Gao

@luyu_gao

13 days

RT @b_roziere: Excited to release Devstral Medium and a new version of Devstral Small! Devstral Medium reaches 61.8% on SWE-bench Verified…

0

2

0

Luyu Gao

@luyu_gao

23 days

Previously I resisted Copilot for months, thinking it was too stochastic, and never upgraded to Cursor. But now I started using Claude Code right after it came out. Lesson from the past: AI is going to power the world. Using these tools is the only way to get an accurate mental world model.

0

13

Luyu Gao

@luyu_gao

1 month

Among the papers I wrote, GradCache is one of my favorites. Very glad to see it still being useful 🚀.

Zach Nussbaum

@zach_nussbaum

2 months

We trained all of the Nomic Embed models on limited compute. One trick that helped us train SoTA embeddings on 16 H100s? GradCache, a gradient checkpointing-like technique tailored for contrastive learning. I kept forgetting how it works, so I dug into the math and wrote about it

1

26

Luyu Gao

@luyu_gao

2 months

We released Devstral, a powerful code agent foundation model. With an Apache 2.0 license, it is now the best open source model on SWE-bench. One of the fun projects I worked on this year.

Mistral AI

@MistralAI

2 months

Meet Devstral, our SOTA open model designed specifically for coding agents and developed with @allhands_ai.

2

1

25

Luyu Gao

@luyu_gao

9 months

RT @dchaplot: We just released two new models: Ministral 3B and Ministral 8B. It’s been only a year since the release of Mistral 7B, and yet…

0

26

0

Luyu Gao

@luyu_gao

1 year

RT @MinyangTian1: SciCode is our new benchmark that challenges LMs to code solutions for scientific problems from advanced papers. The chal…

0

62

0

Luyu Gao

@luyu_gao

1 year

RT @anubha_haha: Recent studies show program-aided prompting in LLMs improves reasoning tasks, but do they "know what they know"? 🤔 Our #NA…

0

5

0

Luyu Gao

@luyu_gao

1 year

RT @wellecks: Version II of the tutorial on neural theorem proving. Some new additions: Train a model that gets 2…

0

26

0

Luyu Gao

@luyu_gao

1 year

[4/4] Really want to thank all JAX authors for building such a fun framework to play with! @jekbradbury @froystig @SingularMattrix @cdleary @jakevdp @DougalMaclaurin @apaszke @zhangqiaorjc

0

6

Luyu Gao

@luyu_gao

1 year

[3/4] Personally, I've decided to switch to JAX due to its modern approach to parallelism, which can be automatic or semi-automatic. The JAX compiler takes care of many demanding tasks, such as managing the communication of activations and gradients.

1

0

6

Luyu Gao

@luyu_gao

1 year

[2/4] I wrote the tool in a minimalist style, focusing on introducing core JAX functionalities and the convenient JAX ecosystem. I hope it can help people learn the fundamentals of model parallelism.

1

0

4

Luyu Gao

@luyu_gao

1 year

[1/4] So, I decided to seriously use JAX, and it didn't take long for me to realize its power. With just a couple hundred lines of code, you can do data and tensor parallelism on @huggingface transformers. I've created a toolkit to make this more accessible.

github.com

Supercharge huggingface transformers with model parallelism. - luyug/magix

5

17

134

Luyu Gao

@luyu_gao

2 years

Attending my first #NeurIPS conference. Excited to chat with people about retrieval, RAG, or any other LLM phenomena. #NeurIPS2023

0

1

34

Luyu Gao

@luyu_gao

2 years

RT @sivil_taram: 🇸🇬 We will present the "Active Retrieval Augmented Generation" paper in Poster Session 2, December 8th, 16:00 SGT. Feel…

0

1

0

Luyu Gao

@luyu_gao

2 years

RT @gneubig: I have a post-doc position open at @LTIatCMU, starting Summer or Fall 2024. If you are interested in working with me at CMU on…

docs.google.com

Graham Neubig's lab (https://www.cs.cmu.edu/~neulab/) has a post-doc position open starting Summer 2024. If you are interested, please apply through the following form. Please also feel free to get...

0

64

0

Luyu Gao

@luyu_gao

2 years

RT @tengyuma: 📢 Introducing Voyage AI @Voyage_AI_! Founded by a talented team of leading AI researchers and me 🚀🚀. We build state-of-the-…

0

95

0

Luyu Gao

@luyu_gao

2 years

RT @swyx: Prompt2Model: Generating Deployable Models from Natural Language Instructions. Not just "write prompt ge…

0

72

0

Luyu Gao

@luyu_gao

2 years

RT @_akhaliq: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification. paper page: https://t.…

0

39

0

Luyu Gao

@luyu_gao

2 years

RT @omarsar0: Using GPT-4 Code Interpreter to Boost Mathematical Reasoning. This paper proposes a zero-shot prompting technique for GPT-4 C…

0

69

0

Luyu Gao

@luyu_gao

2 years

RT @arankomatsuzaki: Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification. With GPT-4 Code…

0

58

0