Jean de Nyandwi (@Jeande_d)
46K Followers · 48K Following · 1K Media · 6K Statuses
Researcher @LTIatCMU • Multimodal NLP, Post-training, Data, Evals • CMU MS 24. Blog: https://t.co/1BEFLZAqe7 ML: https://t.co/7PkTyDvuri
Planet Earth · Joined March 2017
A NEW ARTICLE 🔥🔥 The first article of the Deep Learning Revision Research Blog (introduced recently) is out: "The Transformer Blueprint: A Holistic Guide to the Transformer Neural Network Architecture". In the article, we discuss the core mechanics of transformer neural networks.
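To make the core mechanic concrete, here is a minimal sketch of scaled dot-product attention, the operation at the heart of the architecture the article covers (names and shapes are illustrative, not taken from the article):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)   # (seq_q, seq_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # (seq_q, d_v)

# Toy self-attention: 4 tokens, model dim 8
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(scaled_dot_product_attention(x, x, x).shape)   # (4, 8)
```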
Great lectures on deep generative modelling, covering a broad set of topics, model families, and learning algorithms in the generative-models space, along with applications in vision, natural language processing, and reinforcement learning.
Deep Generative Models Lectures: videos, slides, notes, papers https://t.co/Ssvh86lIze
kuleshov-group.github.io
The site for the Open Deep Generative Models course.
Your VLMs/VLAs have hidden potential. Sparse Attention Vectors (SAVs) unlock few-shot, mechanistically interpretable test-time adaptation, beating LoRA with ~20 heads. 🚀 See you today #ICCV2025 🌺 📍 Poster #254 (10/21, Session 1) @berkeley_ai @CMU_Robotics @MITIBMLab Links: …
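For readers skimming the feed, a toy sketch of the general SAV recipe as described above (pick the few attention heads whose features best separate the few-shot classes, then classify a query by nearest class mean); all tensor names and shapes here are hypothetical stand-ins, not the paper's code:

```python
import torch
import torch.nn.functional as F

def sav_classify(feats, labels, query, k_heads=20):
    """feats: (n_shots, n_heads, d) per-head features from a frozen VLM;
    labels: (n_shots,) class ids; query: (n_heads, d) features for one query.
    Selects the k most class-discriminative heads, then does
    nearest-class-mean classification on that sparse head subset."""
    classes = labels.unique()
    # Per-class, per-head feature means: (n_classes, n_heads, d)
    means = torch.stack([feats[labels == c].mean(0) for c in classes])
    spread = means.var(dim=0).mean(dim=-1)   # per-head between-class spread
    top = spread.topk(k_heads).indices       # the ~20 "sparse" heads
    q = F.normalize(query[top], dim=-1)      # (k, d)
    m = F.normalize(means[:, top], dim=-1)   # (n_classes, k, d)
    sims = (m * q).sum(-1).mean(-1)          # mean cosine similarity per class
    return classes[sims.argmax()]
```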
Claude Skills shows performance benefits from leveraging LLM skill catalogs at inference time. Our previous work (linked under thread 5/5) showed the same 6 months ago! 🌟 Our new work, STAT, shows that leveraging skills during training can greatly help too‼️ e.g., Qwen can …
Applied Machine Learning - Cornell CS5785. "Starting from the very basics, covering all of the most important ML algorithms and how to apply them in practice. Executable Jupyter notebooks (and as slides)." 80 videos, all publicly available.
[New course] Transformers & Large Language Models, CME 295, Stanford. Yet another excellent course on transformers and large language models, with a consolidated curriculum starting right away from transformers. Topics/lectures: - Intro to transformers - Transformer-based models & …
Exactly the two courses I thought of on seeing the post. There are more across other schools; look at: - Advanced NLP at CMU (https://t.co/jvyGAeBvAb) - an inference class at CMU, the first of its kind in academia (https://t.co/kQOfe4s7jw) - LLM Systems at CMU (https://t.co/dfrfS2ywdB) - large …
When Karpathy drops a nano repo, the X timeline is great again! Everyone is basically happy.
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, …
Statistics 110: Probability - Harvard. Inarguably a classic and one of the world's best probability courses on the web!!
An excellent technical guide on LLM post-training, covering SFT (supervised fine-tuning) and RL reward setups such as RLHF (human preferences), RLAIF (constitutional AI), RLVR (verifiable outcomes), and process-supervised and rubric rewards. It also covers common RL training algorithms, from PPO and GRPO to …
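As a small, concrete taste of the RL algorithms the guide walks through, here is a sketch of the group-relative advantage that GRPO uses in place of a learned value baseline (a generic illustration, not the guide's code):

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """GRPO-style advantages: normalize each completion's reward against
    the mean/std of its group (all samples drawn for the same prompt)."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Toy usage: 4 completions for one prompt, scored by a verifiable reward
rewards = torch.tensor([1.0, 0.0, 1.0, 0.5])
print(grpo_advantages(rewards))
```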
The deep learning course by the legendary Andrew Ng is now public. New 2025 lectures are out!!
Thanks for writing this piece, @johnschulman2. Very insightful, with lots of practical findings. https://t.co/fwutv1Tfbj
thinkingmachines.ai
How LoRA matches full training performance more broadly than expected.
LoRA Without Regret - a recent blog from Thinking Machines. TL;DR: LoRA actually matches full supervised fine-tuning (SFT) when you get the details right: nearly the same sample efficiency, the same (or better) loss, and the same final performance. Some main points: - Apply LoRA to ALL layers, …
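A minimal sketch of what "apply LoRA to ALL layers" means in practice: every linear layer keeps its frozen pretrained weight and adds a trainable low-rank update scaled by alpha/r, with B zero-initialized so training starts from the base model (a generic illustration, not Thinking Machines' code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                  # freeze pretrained W, bias
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

# Toy usage: wrap one layer; in a real model, wrap every linear layer
layer = LoRALinear(nn.Linear(512, 512))
y = layer(torch.randn(2, 512))
```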
Thanks for using the visual, @ColdFusion_TV. I am a big fan of your tech videos. The amount of work you put into research shows in the videos.