Arthur Douillard

@Ar_Douillard

Followers: 8K · Following: 19K · Media: 627 · Statuses: 5K

Distributed Learning @ deepmind | DiLoCo, DiPaCo. Continual Learning PhD @ Sorbonne

London
Joined January 2016
@Ar_Douillard
Arthur Douillard
9 months
We release today the next step for distributed training: Streaming DiLoCo with Overlapping Communication. TL;DR: train data-parallel across the world with low bandwidth for the same performance: 400x fewer bits exchanged & huge latency tolerance
19
109
579
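For intuition, here is a minimal sketch of the low-bandwidth idea behind Streaming DiLoCo: workers run many local optimizer steps and only occasionally synchronize, and instead of averaging the whole model at once, parameter fragments take turns being communicated, so the traffic is spread out and can be overlapped with compute. Everything below (worker count, fragment layout, step counts, the fake inner update) is an illustrative assumption, not the released implementation.

```python
# Hypothetical sketch of staggered-fragment synchronization: each worker runs
# local inner steps, and every H steps only one parameter fragment is averaged
# across workers, so communication per sync event stays small.
import numpy as np

rng = np.random.default_rng(0)
NUM_WORKERS, NUM_FRAGMENTS, FRAG_SIZE = 4, 3, 8
H = 10          # inner steps between two synchronization events (assumed)
OUTER_LR = 0.7  # outer (averaging) step size (assumed)

# Each worker holds its own replica of the parameters, split into fragments.
params = [rng.normal(size=(NUM_FRAGMENTS, FRAG_SIZE)) for _ in range(NUM_WORKERS)]

def inner_step(p):
    """Stand-in for local optimizer steps on worker-local data."""
    return p - 0.01 * rng.normal(size=p.shape)  # fake gradient update

for step in range(1, 91):
    params = [inner_step(p) for p in params]
    if step % H == 0:
        # Round-robin: only one fragment is communicated per sync event.
        f = (step // H) % NUM_FRAGMENTS
        avg = np.mean([p[f] for p in params], axis=0)
        for p in params:
            # Move each replica's fragment toward the cross-worker average.
            p[f] += OUTER_LR * (avg - p[f])
```

The headline bandwidth reduction also relies on compressing what is exchanged, which this sketch skips for brevity.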
@patrickc
Patrick Collison
2 days
What’s the best thing written about why the remarkably vigorous and inventive France of the 70s and 80s (TGV, Minitel, Ariane, Rafale, Concorde, the world’s preeminent nuclear grid…) has not been nearly as visible in the 21st century? What went wrong?
390
239
3K
@Ar_Douillard
Arthur Douillard
4 days
I've used nano-banana to create that meme. The future is bright for AI slop amateurs like me.
1
2
67
@Ar_Douillard
Arthur Douillard
4 days
It's today!
@Ar_Douillard
Arthur Douillard
1 month
One of my teammates on DiLoCo will be speaking at the dAGI summit in SF on 24 Oct! Keith always has great hot takes on what distributed-first should be, from OG privacy-oriented FL to more recent LLM training with DiLoCo. Attend his talk!
0
1
6
@Ar_Douillard
Arthur Douillard
4 days
I found a cool tech report combining DiLoCo with @m_ryabinin's SWARM pipelining and fault tolerance, and checked what the author is doing now. I should have guessed: he's at @PrimeIntellect now.
2
4
67
@_arohan_
rohan anil
4 days
Reminds me of Adam variants.
@agarwl_
Rishabh Agarwal
5 days
The state of RL research for LLMs in 2025 and I am still missing probably several *PO. ngmi AAPO, BAPO, CAPO, CISPO, DAPO, EPO, FAPO, GAPO, GRPO, HAPO, KAPO, LAPO, MAPO, NAPO, ....., VAPO, ZAPO
1
2
43
@Ar_Douillard
Arthur Douillard
4 days
@nearcyan time to launch a new fund https://t.co/QkskC4l4kJ
@nearcyan
near
2 years
Excited to announce I'm launching a fund! GCCN is a new fund that exclusively invests in Nvidia, with a mandatory 10-year lock-up period. We also offer this fund as the benchmark against which all AI venture capital funds should be compared. Learn more at https://t.co/jGLpLSTTL6!
0
0
8
@Ar_Douillard
Arthur Douillard
4 days
@AnthropicAI
Anthropic
5 days
Today, we announced that we plan to expand our use of Google TPUs, securing approximately one million TPUs and more than a gigawatt of capacity in 2026.
12
51
2K
@tydsh
Yuandong Tian
6 days
Several of my team members and I are impacted by this layoff today. Feel free to connect :)
474
287
7K
@simonbatzner
Simon Batzner
5 days
Our team at DeepMind is growing (again). 🚀 We're tackling grand challenges in semiconductors, magnets, energy materials, superconductors, and beyond. Join us! Two positions below.
13
41
762
@Ar_Douillard
Arthur Douillard
5 days
I’ve been promoted to Staff RS. Vain title etc. but feels good to see appreciation for distributed learning in DeepMind ☺️
39
10
450
@Ar_Douillard
Arthur Douillard
5 days
As a sci-fi nerd, Starcloud is super exciting: https://t.co/Y7dPYz9ls2 but this application sounds like bullshit to me? Latency isn't going to take hours, and wildfire detection can wait 20s
1
3
16
@Ar_Douillard
Arthur Douillard
6 days
@metaai Oh, and one point raised by the authors: this super lookahead is only the *outer-optimizer*, and can be perfectly combined with any *inner-optimizer*, such as AdamW or Muon :)
1
0
12
@Ar_Douillard
Arthur Douillard
6 days
Non-distributed DiLoCo as a super lookahead: Kalluski et al. from @metaai released a study of using Nesterov on outer gradients: https://t.co/cYupiYyQru The algo they nicknamed SNOO is basically DiLoCo with M=1, meaning that every K steps, a delta is computed between the weights before and after those K inner steps and used as an outer gradient.
5
12
78
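Putting the two tweets above together, here is a hedged sketch of that lookahead view: a single (M=1) replica runs K inner optimizer steps, the displacement over those steps is treated as an outer gradient, and a Nesterov-momentum outer update is applied to it. The toy quadratic loss, plain-SGD inner loop, and hyperparameters are assumptions for illustration; per the reply above, the inner optimizer could just as well be AdamW or Muon.

```python
# Sketch of a DiLoCo-style outer loop with a single replica (M=1):
# K inner steps, delta as outer gradient, Nesterov-momentum outer update.
import numpy as np

K = 20            # inner steps per outer update (assumed)
INNER_LR = 0.05
OUTER_LR = 0.7
MOMENTUM = 0.9    # Nesterov momentum on the outer gradient (assumed)

def grad(x):
    """Gradient of a toy quadratic loss 0.5 * ||x||^2."""
    return x

x = np.array([5.0, -3.0])     # outer parameters
velocity = np.zeros_like(x)

for outer_step in range(10):
    inner = x.copy()
    for _ in range(K):            # inner loop: any first-order optimizer
        inner -= INNER_LR * grad(inner)
    outer_grad = x - inner        # delta over K steps, used as outer gradient
    velocity = MOMENTUM * velocity + outer_grad
    # Nesterov-style outer update on the single replica.
    x -= OUTER_LR * (MOMENTUM * velocity + outer_grad)
```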
@nathanbarrydev
Nathan Barry
12 days
Research Log Day 0: DiLoCo Days. I decided to do a thesis around distributed low-communication training. Essentially, how can we train large models efficiently across distributed nodes without being utterly destroyed by network latency and bandwidth? (1/n)
1
1
6
@Ar_Douillard
Arthur Douillard
11 days
True story when scaling to many many nodes
@tenderizzation
tender
12 days
two DiLoCo nodes exchanging gradients that cancel out
0
0
4
@Google
Google
12 days
Today, @GoogleResearch announced DeepSomatic, a new machine learning model developed with our partners, including @ucscgenomics and @ChildrensMercy, that accurately identifies genetic variants in cancer cells — a critical step for delivering more precise treatments for patients.
blog.google
An overview of DeepSomatic, a new AI tool that helps identify complex genetic variants in cancer cells.
98
280
2K
@Ar_Douillard
Arthur Douillard
12 days
Learned today that a startup is using Streaming DiLoCo to train a distributed AlphaFold-like model. Happy :)
2
0
24
@pfau
David Pfau
12 days
Very excited to be able to talk about something I've been working on for a while now - we're working with Commonwealth Fusion Systems, IMO the leading fusion startup in the world, to take our work on AI and tokamaks and make it work at the frontier of fusion energy.
@GoogleDeepMind
Google DeepMind
12 days
We’re announcing a research collaboration with @CFS_energy, one of the world’s leading nuclear fusion companies. Together, we’re helping speed up the development of clean, safe, limitless fusion power with AI. ⚛️
32
62
1K
@deredleritt3r
prinz
13 days
Google and Yale scientists have trained an LLM that has generated a novel hypothesis about cancer cellular behavior. This prediction was confirmed multiple times in vitro. - "What made this prediction so exciting was that it was a novel idea. Although CK2 has been implicated in
@sundarpichai
Sundar Pichai
13 days
An exciting milestone for AI in science: Our C2S-Scale 27B foundation model, built with @Yale and based on Gemma, generated a novel hypothesis about cancer cellular behavior, which scientists experimentally validated in living cells.  With more preclinical and clinical tests,
39
154
2K