
Arthur Douillard
@Ar_Douillard
Followers
7K
Following
17K
Media
567
Statuses
4K
distributed learning @ deepmind | DiLoCo, DiPaCo | world-wide compute arbitrage
Joined January 2016
from @JeffDean on the @dwarkesh_sp podcast: "asynchronous training where each copy of the model does local computation [...] it makes people uncomfortable [...] but it actually works". yep, i can confirm, it does work for real
16
53
796
I am excited to share that, after my PhD, I will join @DeepMind this summer as a Research Scientist in the Continual Learning team led by Marc'Aurelio Ranzato!
28
6
459
Distributed learning? Recently, @PrimeIntellect announced their 10B distributed training run (, but what is it exactly? Going back to the origin, Federated Learning (FL) aims to train a model across a fleet of phones (see the sketch below). To handle the
Announcing INTELLECT-1: the first-ever decentralized training of a 10B model. Scaling decentralized training 10x beyond prior efforts. Anyone can join us to build open-source AGI
17
68
450
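A minimal sketch of the federated-averaging idea referenced above (plain PyTorch; function and argument names are illustrative, not taken from any of the cited codebases): each client copies the global weights, takes a few local steps on its own data, and the server averages the results.

```python
import copy
import itertools
import torch
from torch import nn


def local_update(global_model, data_loader, steps=10, lr=0.01):
    """One client: copy the global weights and take a few local SGD steps."""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    batches = itertools.cycle(data_loader)
    for _ in range(steps):
        x, y = next(batches)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return model.state_dict()


def federated_round(global_model, client_loaders):
    """Server: average the clients' locally trained weights (FedAvg-style)."""
    states = [local_update(global_model, dl) for dl in client_loaders]
    avg = {k: torch.stack([s[k].float() for s in states]).mean(0) for k in states[0]}
    global_model.load_state_dict(avg)
    return global_model
```

Full FedAvg would weight each client's contribution by its dataset size; the uniform average above keeps the sketch minimal.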
We released our work on data parallelism for language models *distributed* across the entire world! Thread below.
DiLoCo: Distributed Low-Communication Training of Language Models. paper page: Large language models (LLM) have become a critical component in many applications of machine learning. However, standard approaches to training LLM require a large number of
17
66
373
I'm super excited to release DiPaCo, a new kind of mixture of experts that can scale, engineering-wise, to data centers across the entire world! A few words about it in this thread.
Google presents DiPaCo. Distributed Path Composition. Progress in machine learning (ML) has been fueled by scaling neural network models. This scaling has been enabled by ever more heroic feats of engineering, necessary for accommodating ML approaches that require high
12
49
296
My team at @DeepMind is looking for a Research Engineer in Efficient Large-Scale Learning! Unprecedented scale + efficient adaptation to new tasks. Distributed large-scale learning and continual learning!
5
37
246
Don't waste flops, upcycle. 1. Upcycling ( @arankomatsuzaki et al.) proposes to transform a pretrained dense network into an MoE by duplicating its MLP layers (see the sketch below). You still have to train the (relatively small) routers from scratch, but this saves lots of
5
35
187
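A rough sketch of the upcycling recipe described above, assuming simple top-1 routing (module and argument names are hypothetical): every expert starts as a copy of the pretrained dense MLP, and only the router is initialized from scratch.

```python
import copy
import torch
from torch import nn


class UpcycledMoE(nn.Module):
    """Turn one pretrained dense MLP into a top-1 mixture of experts."""

    def __init__(self, dense_mlp: nn.Module, d_model: int, num_experts: int = 8):
        super().__init__()
        # Each expert begins as an exact copy of the pretrained dense MLP.
        self.experts = nn.ModuleList(
            [copy.deepcopy(dense_mlp) for _ in range(num_experts)]
        )
        # Only the router is trained from scratch.
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x):  # x: [batch, tokens, d_model]
        scores = self.router(x).softmax(dim=-1)   # [batch, tokens, num_experts]
        top_p, top_idx = scores.max(dim=-1)       # top-1 routing
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                   # tokens routed to expert e
            if mask.any():
                out[mask] = expert(x[mask]) * top_p[mask].unsqueeze(-1)
        return out
```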
one more step towards decentralized learning: Eager Updates. can we overlap communication with computation over hundreds of steps (sketch below)? -- yes we can. in this work led by @SatyenKale, we improve DiLoCo and use 1177x less bandwidth than data-parallel
3
21
173
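A heavily simplified sketch of the overlap idea, not the exact Eager Updates algorithm: the all-reduce of the outer delta is launched asynchronously and local inner steps keep running while the bytes are in flight. It assumes an initialized torch.distributed process group and treats the weights as flat tensors; names and the outer learning rate are illustrative.

```python
import torch
import torch.distributed as dist


def outer_step_overlapped(params, synced_params, run_inner_steps, outer_lr=0.7):
    # Outer delta: how far this replica drifted since the last synchronization.
    outer_grad = synced_params - params

    # Non-blocking all-reduce: the communication proceeds in the background.
    handle = dist.all_reduce(outer_grad, op=dist.ReduceOp.SUM, async_op=True)

    # Local inner optimizer steps overlap the communication above.
    run_inner_steps(params)

    # Block only when the averaged outer delta is actually needed.
    handle.wait()
    outer_grad /= dist.get_world_size()
    synced_params -= outer_lr * outer_grad
    params.copy_(synced_params)
```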
Very cool to see @JeffDean highlighting our DiPaCo ( in the MoE history at Google! (see last line)
oh wow, @JeffDean dropping incredible amounts of lore on Gemini 1.5 Pro and 2.0 Flash this morning. the references alone can fill up the @latentspacepod paper club for a year. (this is only part of the "everything we do better than you at google" talk, he is talking about ai
3
13
154
"Some sort of Federated Learning, async distributed computing will have to work." - Jensen. on it.
BG2. Ep 17. Double $NVDA! System Level Comp Moat, "Insane Demand", Inference Explosion 1 B x, Memphis Supercluster, OpenAI, & more. @altcap @_clarktang @bgurley. (00:00) Intro. (1:50) The Evolution of AGI and Personal Assistants. (06:03) NVIDIA's
8
25
140
Very interesting. DeMo syncs only the fast components of the signal, while the slower components are accumulated in the momentum (crude sketch below). Two things that come to my mind are this intuition of keeping the slow gradients (grokfast: , and reducing
So about that Nous DisTrO project, of which many very cracked people are skeptical. Now there's a paper on arXiv and a PyTorch implementation on GitHub. Also it's called DeMo (Decoupled Momentum Optimization). These are results. This is the idea. I'm interested in how this ends.
5
16
113
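A crude caricature of that fast/slow split, not DeMo's actual algorithm (which extracts the fast components with a DCT): keep a local momentum buffer, ship only its largest-magnitude entries, and leave the slow residual on the worker. The top-k heuristic, hyperparameters, and names here are purely illustrative.

```python
import torch
import torch.distributed as dist


def demo_like_step(param, grad, momentum, lr=1e-3, beta=0.9, k_frac=0.01):
    with torch.no_grad():
        # Accumulate everything into a local momentum buffer.
        momentum.mul_(beta).add_(grad)

        # "Fast" signal: the k largest-magnitude momentum entries.
        flat = momentum.view(-1)
        k = max(1, int(k_frac * flat.numel()))
        _, idx = flat.abs().topk(k)
        fast = torch.zeros_like(flat)
        fast[idx] = flat[idx]

        # Only the sparse fast component crosses the network (a real system
        # would transmit just the (index, value) pairs, not a dense tensor).
        dist.all_reduce(fast, op=dist.ReduceOp.SUM)
        fast /= dist.get_world_size()

        # The slow residual never leaves the worker: it stays in the momentum.
        flat[idx] = 0.0

        param.view(-1).add_(fast, alpha=-lr)
```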
In 2023, we showed that with our particular federated/distributed configuration DiLoCo (Adam + Nesterov), you could train a relatively large (0.5B) LLM distributed across the world while being as good as centralized training (sketch below). @PrimeIntellect released
3
15
105
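A single-replica sketch of that DiLoCo configuration: AdamW as the inner optimizer, and SGD with Nesterov momentum as the outer optimizer applied to the delta between the last synced weights and the locally trained weights. Hyperparameters are illustrative; in the multi-replica setting the delta is averaged across workers before the outer step.

```python
import itertools
import torch


def diloco_train(model, data_loader, rounds=10, inner_steps=500,
                 inner_lr=4e-4, outer_lr=0.7):
    # Inner optimizer: plain AdamW on the local replica.
    inner_opt = torch.optim.AdamW(model.parameters(), lr=inner_lr)
    # Outer optimizer: Nesterov SGD acting on a copy of the last synced weights.
    global_params = [p.detach().clone() for p in model.parameters()]
    outer_opt = torch.optim.SGD(global_params, lr=outer_lr,
                                momentum=0.9, nesterov=True)
    loss_fn = torch.nn.CrossEntropyLoss()
    batches = itertools.cycle(data_loader)

    for _ in range(rounds):
        # Local computation: H inner steps without any communication.
        for _ in range(inner_steps):
            x, y = next(batches)
            inner_opt.zero_grad()
            loss_fn(model(x), y).backward()
            inner_opt.step()
        # Outer "gradient" = last synced weights minus locally trained weights.
        # (With several replicas this delta would be all-reduced here.)
        for gp, p in zip(global_params, model.parameters()):
            gp.grad = gp - p.detach()
        outer_opt.step()
        # The replica starts the next round from the updated synced weights.
        with torch.no_grad():
            for gp, p in zip(global_params, model.parameters()):
                p.copy_(gp)
    return model
```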
30+ accepted papers. 6 oral papers. 6 guest speakers. join us at @iclr_conf on the 27th in Hall 4 #3 for a full-day workshop on Modularity for Collaborative, Decentralized, and Continual Learning. @derylucio, Fengyuan Liu, and I will be organizing
Workshop alert. We'll host a workshop on modularity at ICLR 2025, encompassing collaborative + decentralized + continual learning. Those topics are on the critical path to building better AIs. Interested? Submit a paper and join us in Singapore!
3
28
102
Two of my favorite AI-literature-review blogs are @lilianweng's and @cwolferesearch's. What other blogs should I read? Broad scope probably preferred over a single deep topic.
9
7
93
this infra framework ( + using SWARM ( on the inference nodes to fit ultra-large models is going to be the future. one step closer to the GitTheta ( dream
Releasing INTELLECT-2: We're open-sourcing the first 32B parameter model trained via globally distributed reinforcement learning: • Detailed Technical Report • INTELLECT-2 model checkpoint.
6
15
94
ok @iclr_conf, im logging off - thanks for the week!
1
2
84
Very very cool, you can now train your LLM across the world with a dynamically sized compute pool! Awesome work by the PrimeIntellect team to make DiLoCo ( available to everyone, with lots of engineering tricks to be so efficient.
Releasing INTELLECT-1: We're open-sourcing the first decentralized trained 10B model: - INTELLECT-1 base model & intermediate checkpoints - Pre-training dataset - Post-trained instruct models by @arcee_ai - PRIME training framework - Technical paper with all details
3
22
80
really like this point from @jackclarkSF's Import AI. having a perfect super smart assistant decreases the importance of hard skills but increases the importance of agency and curiosity
3
14
83
World-wide decentralized training with an open-source DiLoCo is done. Note that date for the history books.
We did it - the first decentralized training of a 10B model is complete! Trained across the US, Europe, and Asia. Post-training with @arcee_ai is underway, and a full open-source release is coming in ~1 week, including: base model, checkpoints, post-trained model and data.
3
9
79
I submitted my first paper ever, at CVPR 2020, and it got rejected; it was hard. But I'm happy to announce that my third paper, PLOP, has been accepted to #CVPR2021! Code will be released soon!
New work from Y. Chen, A. Dapogny, @quobbe, and myself. We tackle Continual Semantic Segmentation by introducing a novel distillation loss exploiting local & global details, and an uncertainty-based pseudo-labeling handling background shift. (We are PLOP). A generic sketch of these two ingredients is below.
4
7
73
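A generic sketch of those two ingredients, distillation plus uncertainty-based pseudo-labeling, not PLOP's exact losses (which distill intermediate features at local and global scales); class 0 is assumed to be the background class, and all names are illustrative.

```python
import torch
import torch.nn.functional as F


def continual_seg_losses(new_logits, old_logits, labels, conf_thresh=0.9):
    # new_logits: [B, C_new, H, W], old_logits: [B, C_old, H, W], labels: [B, H, W]
    # Distillation: keep the new model close to the frozen old model on old classes.
    distill = F.kl_div(
        F.log_softmax(new_logits[:, : old_logits.shape[1]], dim=1),
        F.softmax(old_logits, dim=1),
        reduction="batchmean",
    )
    # Pseudo-labeling: background pixels (class 0 assumed) on which the old
    # model is confident are relabeled with the old model's predicted class.
    conf, old_pred = F.softmax(old_logits, dim=1).max(dim=1)
    pseudo = labels.clone()
    mask = (labels == 0) & (conf > conf_thresh)
    pseudo[mask] = old_pred[mask]
    ce = F.cross_entropy(new_logits, pseudo)
    return ce + distill
```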
four challenges could prevent scaling according to @EpochAIResearch. 3 of them may be alleviated by DiLoCo: 1. Localized power constraints: distributed training could use power plants anywhere on earth. 2. Chip production capacity: async communication allows using heterogeneous
3
5
73
The first transformer designed for Continual Learning in Computer Vision has been accepted to #CVPR2022! Using a dynamic approach, it forgets less than previous ensembling methods while using fewer parameters.
4
16
70
Run DiLoCo distributed training on Apple Silicon! They also provide a codebase to simulate DiLoCo on your local MacBook: that's great for quick experimentation, like with NanoGPT.
Distributed training on M4 Mac Mini cluster. We implemented @GoogleDeepMind DiLoCo on Apple Silicon to train large models with 100-1000x less bandwidth compared to DDP baseline. AI is entering a new era where a distributed network of consumer devices can train large models.
0
6
71
what kind of method could enable @huggingface to train a model across the world?
4
6
70
one more implementation of DiLoCo to do distributed training! @PyTorch's TorchFT fault-tolerance package has an implementation of DiLoCo. hopefully soon a Streaming DiLoCo too?
2
7
70
@JeffDean @dwarkesh_sp some people actually came to me in SF and told me "but DiLoCo is actually working!", being very surprised that it wasn't just another paper misleading with outlandish claims. learn more:
We release today the next step for distributed training --> Streaming DiLoCo with Overlapping Communication (sketch of the streaming idea below). TL;DR: train data-parallel across the world with low bandwidth for the same performance: 400x fewer bits exchanged & huge latency tolerance
2
1
68
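A sketch of the streaming idea only, not the released algorithm (which also overlaps the communication with compute and quantizes the outer gradients): the parameters are split into fragments and each outer round synchronizes a single fragment, cutting peak bandwidth roughly by the number of fragments. Assumes an initialized torch.distributed process group; names and the outer learning rate are illustrative.

```python
import torch
import torch.distributed as dist


def stream_sync(fragments, round_idx, outer_lr=0.7):
    """fragments: list of fragments, each a list of (synced_copy, live_param)
    tensor pairs, e.g. one fragment per group of transformer blocks."""
    frag = fragments[round_idx % len(fragments)]   # only this fragment is synced now
    with torch.no_grad():
        for synced, live in frag:
            outer_grad = synced - live
            dist.all_reduce(outer_grad, op=dist.ReduceOp.SUM)
            outer_grad /= dist.get_world_size()
            synced -= outer_lr * outer_grad        # outer step on the synced copy
            live.copy_(synced)                     # local replica rejoins the synced point
```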
My last paper as a PhD student.
Pending #CVPR2023 in June, we are pleased to share our 4 accepted papers. (3/4) "CoMFormer: Continual Learning in Semantic and Panoptic Segmentation" by @fcdl94, @quobbe, @Ar_Douillard. Preprint: Collab w/ Politecnico di Torino and @heuritechlab
0
2
64
Great! I'm finishing my PhD in June, and CVPR 2022 will be my only opportunity to attend an in-person conference in my whooole PhD.
Message from our #CVPR2022 Program Chairs: Unless the epidemiological situation changes drastically, CVPR 2022 will be in person, with an online option for those who cannot travel. Information on visa letters will be sent to authors in the next few days.
3
1
65
Something I didn't fully realize during my PhD but now see: the extended bitter lesson is that hyperparameters are sometimes more important than a new architecture. So many research papers proposing new archi/losses/optimizers would get crushed by a well-tuned baseline.
The DeepSeek-V2 paper was full of pretty amazing nuggets of wisdom. I spent the afternoon copying lots of their training setup into our model. Orange is previous and Blue is new with DeepSeek hyperparameters. Things that mattered most: 1. Warm-up LR ratio. 2. Batch ramp
3
5
63
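For concreteness, here is what those two knobs typically look like; the numbers are illustrative defaults, not DeepSeek's actual settings: a linear learning-rate warmup followed by cosine decay to a fraction of the peak, and a linear ramp of the global batch size.

```python
import math


def lr_with_warmup(step, max_lr=3e-4, warmup_steps=2000,
                   total_steps=100_000, min_ratio=0.1):
    # Linear warmup to max_lr, then cosine decay down to min_ratio * max_lr.
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1 + math.cos(math.pi * progress))
    return max_lr * (min_ratio + (1 - min_ratio) * cosine)


def batch_size_ramp(step, start=256, end=2048, ramp_steps=20_000):
    # Linearly grow the global batch size from `start` to `end`.
    if step >= ramp_steps:
        return end
    return int(start + (end - start) * step / ramp_steps)
```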
Tired of implementing the many data settings of Continual Learning? @TLesort & I present Continuum! A PyTorch library that gives you a continual dataset in a few lines: MNIST, PermutedMNIST, CIFAR10/CIFAR100, ImageNet, CORe50, and many more! (usage sketch below)
1
26
65
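A usage sketch of the kind of API the library exposes; the class names, import paths, and the (image, label, task id) triplet convention are recalled from Continuum's documentation and may differ between versions.

```python
from torch.utils.data import DataLoader
from continuum import ClassIncremental          # assumed import path
from continuum.datasets import MNIST            # assumed import path

# Split MNIST into 5 tasks of 2 classes each.
scenario = ClassIncremental(MNIST("data", download=True, train=True), increment=2)

for task_id, task_set in enumerate(scenario):
    loader = DataLoader(task_set, batch_size=64, shuffle=True)
    for x, y, t in loader:   # assumed: Continuum yields (image, label, task id)
        pass                 # train on the current task here
```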
Wow! DiLoCo ( and OpenDiLoCo ( were recognized by @nathanbenaich's @stateofaireport. Lots of research existed before, but i believe that in 2024, and even more in 2025, we'll switch from an exploration mode to an exploitation mode. Scaling has
5
5
63
I think people don't realize the progress in AI that happened in the last 5 years. Being close to the level of a junior dev isn't impressive anymore?
@itsandrewgao Sounds a bit disappointing honestly. The requests were a bit hard, but a good AI should be able to solve these, they aren't exactly rocket science. I think most jr devs would be able to solve them. One request it couldn't even complete and the other just deployed a buggy solution.
7
1
59
It's crazy that model merging works as well as it does (minimal sketch below). Strongly recommend following @Mitchnw @ramealexandre @prateeky2806 who have done a lot in that field.
Model merging is a popular research topic with applications to LLM alignment and specialization. But, did you know this technique has been studied since the 90s? Here's a brief timeline… (Stage 0) Original work on model merging dates back to the 90s [1], where authors showed
6
6
58
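The simplest instance of model merging is plain weight averaging across models that share an architecture and initialization; the quoted thread covers far more elaborate schemes. A minimal sketch (names are illustrative):

```python
import torch


def merge_state_dicts(state_dicts, weights=None):
    # Uniform weights by default; all models must share the same keys and shapes.
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    return {
        key: sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
        for key in state_dicts[0]
    }
```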