Pluralis Research
@PluralisHQ
Protocol Learning
Joined July 2024
Followers: 8K · Following: 612 · Media: 11 · Statuses: 97

@PluralisHQ
Pluralis Research
1 month
We've reached a major milestone in fully decentralized training: for the first time, we've demonstrated that a large language model can be split and trained across consumer devices connected over the internet - with no loss in speed or performance.
@PluralisHQ
Pluralis Research
3 days
RT @gensynai: Connect with fellow open-source developers, engage with leading minds in decentralized AI (DeAI) and machine learning, and en….
@PluralisHQ
Pluralis Research
10 days
RT @_AlexanderLong: Using beautiful Grafana dashboards for everything internally, so much nicer than Tensorboard. Wandb still good but does….
@PluralisHQ
Pluralis Research
1 month
@SameeraRamasin1 Now that this is feasible, we will begin a live run of the first community Protocol Model - Pluralis’s Genesis Run - in the coming days. Registration here:
@PluralisHQ
Pluralis Research
1 month
@SameeraRamasin1 We also expect this to have a major impact on centralized training: reducing the importance of high-speed networking, making spot-instance-based training feasible, and increasing inference speeds.
@PluralisHQ
Pluralis Research
1 month
@SameeraRamasin1 And it is the most significant milestone in our Protocol Learning Research Program.
@PluralisHQ
Pluralis Research
1 month
@SameeraRamasin1 This unlocks: 1. Truly community-created and owned base models. 2. A new path to scaling base models beyond anything we have seen to date. It is a result that we set out to achieve almost exactly one year ago today.
@PluralisHQ
Pluralis Research
1 month
@SameeraRamasin1 What this means is, for the first time, individuals can pool globally distributed compute to train models far larger than they could alone. As the model is split across nodes, the only constraint is how much compute the protocol can gather.
@PluralisHQ
Pluralis Research
1 month
This is a very high-level summary. All details are in the pre-print (led by @SameeraRamasin1).
@PluralisHQ
Pluralis Research
1 month
By decomposing the high-rank embedding information, we can enforce a shared subspace across all blocks. We then only need to transmit a small set of coefficients between nodes.
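A minimal sketch of that idea, with everything below assumed for illustration (the basis U, the sizes, and the function names are not from the paper): if block-boundary activations lie close to a shared low-rank subspace, a node only needs to send the activation's coefficients in that subspace, and the receiving node expands them back.

```python
import numpy as np

d_model, rank = 4096, 64                      # illustrative sizes (64x smaller payload)
rng = np.random.default_rng(0)

# Hypothetical shared orthonormal basis, agreed on by every node in advance.
U, _ = np.linalg.qr(rng.standard_normal((d_model, rank)))

def compress(x):
    """Sender: project the block-boundary activation onto the shared subspace."""
    return U.T @ x                            # only `rank` coefficients are sent

def decompress(coeffs):
    """Receiver: expand the coefficients back to an approximate activation."""
    return U @ coeffs

x = rng.standard_normal(d_model)
x_hat = decompress(compress(x))
print(f"floats sent per activation: {rank} instead of {d_model}")
```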
@PluralisHQ
Pluralis Research
1 month
We found this property in EVERY transformer we analysed, across all parameter sizes and architectures. It’s obscured by the recursive addition of positional embeddings via residuals - but once found, it can be used to dramatically decrease communication overhead.
@PluralisHQ
Pluralis Research
1 month
How is this possible? All transformer models - regardless of size or architecture - have a hidden property: the output projection weights of each block have low stable rank.
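For reference, a small sketch of the quantity this refers to: the stable rank of a matrix W, defined as ||W||_F^2 / sigma_max(W)^2. The matrix below is synthetic and purely illustrative; the observation in the paper concerns the actual output-projection weights of trained transformers.

```python
import numpy as np

def stable_rank(W):
    # stable rank = ||W||_F^2 / sigma_max(W)^2, always at most the true rank
    sigma = np.linalg.svd(W, compute_uv=False)
    return float((sigma ** 2).sum() / sigma[0] ** 2)

rng = np.random.default_rng(0)
d = 1024
# Synthetic stand-in: a dominant rank-16 component plus small noise.
# Its stable rank comes out far below the full dimension of 1024.
W = rng.standard_normal((d, 16)) @ rng.standard_normal((16, d)) / np.sqrt(16)
W = W + 0.01 * rng.standard_normal((d, d))
print(f"stable rank: {stable_rank(W):.1f} (dimension {d})")
```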
@PluralisHQ
Pluralis Research
1 month
Protocol Models fix this: we place one transformer block on each device and compress the communication between blocks by over 100x without altering training dynamics. This enables multi-participant training of very large models over the internet at datacenter speeds.
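A toy sketch of that arrangement; the blocks, the compressor, and the sizes are all stand-ins rather than the Protocol Models implementation. One block sits on each "device", and only a small coefficient vector crosses the link between consecutive blocks.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank, n_devices = 512, 16, 4        # illustrative: 32x smaller payload

# Shared low-rank basis standing in for the compression scheme.
U, _ = np.linalg.qr(rng.standard_normal((d_model, rank)))

class ToyBlock:
    """Stand-in for one transformer block hosted on one device."""
    def __init__(self):
        self.W = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)

    def forward(self, x):
        return np.maximum(self.W @ x, 0.0)   # toy compute: linear + ReLU

def run_pipeline(blocks, x):
    for block in blocks:
        x = block.forward(x)
        coeffs = U.T @ x                     # only `rank` floats cross the slow
        x = U @ coeffs                       # link to the next device
    return x

blocks = [ToyBlock() for _ in range(n_devices)]
out = run_pipeline(blocks, rng.standard_normal(d_model))
print(out.shape, f"- {rank} floats per hop instead of {d_model}")
```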
@PluralisHQ
Pluralis Research
1 month
Why? Because the internet is 100–300× slower than datacenter connections. Compressing to compensate breaks the model's internal communication. Errors build up, training collapses.
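A back-of-envelope illustration of that gap; the bandwidth figures and tensor size below are assumptions chosen for illustration, not measurements from the paper.

```python
# Transfer time for one block-boundary activation tensor, assuming a
# 2048-token micro-batch, d_model = 4096, fp32, and nominal link speeds of
# ~200 Gbps inside a datacenter vs ~1 Gbps over the internet.
payload_bytes = 2048 * 4096 * 4

def transfer_ms(n_bytes, gbps):
    return n_bytes / (gbps / 8 * 1e9) * 1e3

t_dc, t_net = transfer_ms(payload_bytes, 200), transfer_ms(payload_bytes, 1)
print(f"datacenter: {t_dc:.1f} ms, internet: {t_net:.0f} ms "
      f"({t_net / t_dc:.0f}x slower)")
```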
@PluralisHQ
Pluralis Research
1 month
What has been achieved? Today, training happens in datacenters, where models are spread over many GPUs with fast interconnects. Trying to replicate this over the internet causes huge slowdowns.
@PluralisHQ
Pluralis Research
1 month
This work proves a third path is viable between closed models and today’s open-weight releases, which remain centralized and unsustainable. Community-owned models are the only true open-source AI and open a new path to scale.
@PluralisHQ
Pluralis Research
1 month
Almost exactly 1 year ago today Pluralis set out to solve a very difficult problem. We are pleased to announce an update on that problem. We can now combine small consumer devices, connected via the internet, to train giant models. Full paper release in 72 hours.
@PluralisHQ
Pluralis Research
2 months
RT @_AlexanderLong: Probably biggest week in Decentralized Training to date off back of ICLR and more about to come out. Summary of situati….
@PluralisHQ
Pluralis Research
2 months
Full paper
@PluralisHQ
Pluralis Research
2 months
The last of our three ICLR workshop papers: Compression in pipeline parallel training has struggled to go beyond 10% compression without hurting model performance. We get 90%.
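To make the two figures concrete (the per-boundary payload size below is an assumed example): 90% compression means sending one tenth of the bytes at each pipeline boundary, while 10% compression still sends nine tenths.

```python
payload_mb = 32.0                            # assumed activation size per boundary
for compression in (0.10, 0.90):
    print(f"{compression:.0%} compression -> "
          f"{payload_mb * (1 - compression):.1f} MB sent per boundary")
```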