
bagel.com
@bageldotcom
Followers: 12K
Following: 2K
Media: 210
Statuses: 1K
open source superintelligence
San Francisco, CA
Joined June 2023
Introducing Paris - world's first decentralized trained open-weight diffusion model. We named it Paris after the city that has always been a refuge for those creating without permission. Paris is open for research and commercial use.
128
151
896
BREAKING 🚨 Bagel Labs launch Paris - world's first decentralized trained open-weight diffusion model open for research and commercial use. - comparable quality to SOTA using 14× less data and 16× less compute. - combination of smaller expert diffusion models pre-trained from
2
6
37
oh boy my ai generated girlfriends are going to look even more realistic now
2
4
39
Community-sourced, decentralized compute is very necessary for developing and propagating shared information in a non-data-leaky manner! Something like this coupled with DiLoCo should be next!
0
3
9
A nice surprise! Fully open source reproduction of Decentralized Diffusion Models. Congrats to the team!
3
4
16
You should be supporting every company that is fighting the fight for open source AGI. Very excited for this launch.
3
11
119
Excited to share what I’ve been working on for the past two months - decentralized diffusion models pre-trained entirely in isolation. They outperform monolithic training under the same conditions and reach comparable FID to the DDM paper using 14x less data and 16x less compute!
1
4
8
Revolutions need first-principles thinking. That's what we did with Paris. Instead of incremental improvements, we built an entirely new distributed learning stack from scratch that removes the communication bottleneck entirely. This is the SpaceX moment for decentralized AI.
8
6
30
Paris looks like the first fully decentralized trained diffusion model with open weights!! Eight experts learned in isolation with zero sync, then a tiny DiT router picks the best pair at inference. Big win for open source and elastic scale!
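A minimal sketch of the "router picks the best pair" idea from the post above. It assumes the router emits one score per expert and that the two selected experts' predictions are blended with renormalized softmax weights. The post does not say how Paris combines the pair, so `route_top2`, the score values, and the toy experts are all hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top2(router_scores, experts, x):
    """Pick the two highest-scoring experts and blend their predictions.

    Assumption: blending uses softmax weights renormalized over the pair.
    """
    probs = softmax(router_scores)
    # Indices of the two most probable experts (ties keep original order).
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    out = [0.0] * len(x)
    for i in top2:
        weight = probs[i] / norm
        pred = experts[i](x)
        out = [o + weight * p for o, p in zip(out, pred)]
    return top2, out

# Toy stand-ins for 8 independently trained experts: expert k scales by k + 1.
experts = [lambda x, k=k: [(k + 1) * v for v in x] for k in range(8)]
scores = [0.1, 2.0, -1.0, 0.5, 2.0, 0.0, -0.5, 0.3]  # hypothetical router output
chosen, blended = route_top2(scores, experts, [1.0, -1.0])
```

Only the router needs to see all experts at inference time; the experts themselves never exchange anything, which matches the "zero sync" description.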
3
6
30
Paris by @bageldotcom is the world’s first decentralized trained diffusion model (8 expert diffusion models; 458M-675M parameters each). 14× less data, 16× less compute, fully open-source and commercially usable. A proof that the open-source community can build together what
4
12
126
This is HUGE!! Congrats to Bidhan and the whole team at bagel 😎🥯🔥❤️
7
4
48
This model has all of the vaporwave saturation that I miss from the 23-24 era of image generation model aesthetics. Bagel team cooked.
1
4
6
The results. These images came from 8 experts that never spoke to each other during training. We believe if we can scale this approach, this is the first real step towards open source superintelligence. But that requires solving some more really really hard problems. If you're
3
1
61
The numbers. Paris achieved comparable results to SOTA decentralized approaches while using:
- 14× less training data (11M vs 158M images)
- 16× less compute (120 A40 GPU-days vs ~1176 A100-days)
Paris also wins against monolithic training baselines. Our Top-2 routing on DiT-B/2
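A quick check of the ratios quoted above, using only the figures in the post. The raw GPU-day ratio comes out near 9.8, so the 16× figure presumably weights an A100-day more heavily than an A40-day. That conversion factor is not stated in the post, and the value computed below is only a back-calculation, not a published number.

```python
# Figures quoted in the post.
paris_images = 11e6       # Paris training images
baseline_images = 158e6   # images used by the compared approach
paris_gpu_days = 120      # A40 GPU-days
baseline_gpu_days = 1176  # approximate A100 GPU-days

data_ratio = baseline_images / paris_images           # about 14.4
raw_compute_ratio = baseline_gpu_days / paris_gpu_days  # 9.8, ignoring GPU type

# Assumption: the claimed 16x includes an A100-vs-A40 normalization.
claimed_compute_ratio = 16
implied_a100_per_a40 = claimed_compute_ratio / raw_compute_ratio  # about 1.63
```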
1
1
38
Here's what we did differently. Distributed training typically uses parallelism techniques like data parallelism, pipeline parallelism, model parallelism etc. All require synchronization between compute nodes. We removed this requirement entirely with Paris through decentralized
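To make the contrast above concrete, here is a toy sketch, not Paris's training code. It fits a single scalar weight by gradient descent in two ways: data-parallel workers that average gradients every step, and isolated experts that each train on their own shard with no communication. All function names and the data are hypothetical.

```python
def grad(w, shard):
    """Gradient of the mean of 0.5 * (w - y)^2 over one data shard."""
    return sum(w - y for y in shard) / len(shard)

def train_data_parallel(shards, steps=100, lr=0.1):
    """One shared weight; workers must synchronize gradients every step."""
    w = 0.0
    for _ in range(steps):
        local_grads = [grad(w, s) for s in shards]  # computed per worker
        g = sum(local_grads) / len(local_grads)     # synchronization point
        w -= lr * g
    return w

def train_isolated(shards, steps=100, lr=0.1):
    """One weight per shard; no value ever crosses between experts."""
    experts = []
    for s in shards:
        w = 0.0
        for _ in range(steps):
            w -= lr * grad(w, s)
        experts.append(w)
    return experts

shards = [[1.0, 1.0], [3.0, 3.0]]
shared = train_data_parallel(shards)  # approaches the global mean, 2.0
isolated = train_isolated(shards)     # approaches each shard's mean, 1.0 and 3.0
```

The isolated experts each specialize on their own shard, which is why some routing step is needed at inference to decide which experts to consult.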
1
1
53
Paris does something that shouldn't work. It's a combination of smaller expert diffusion models pre-trained from scratch, across different continents in complete isolation. Absolutely zero synchronization among each other during training. This zero communication protocol
2
6
88