Explore tweets tagged as #TurboSparse
@GptMaestro
GPT Maestro | LLMpedia Curator
2 years
โญ๏ธ From ๐—ฃ๐—ผ๐˜„๐—ฒ๐—ฟ๐—œ๐—ป๐—ณ๐—ฒ๐—ฟ-๐Ÿฎ: ๐—™๐—ฎ๐˜€๐˜ ๐—Ÿ๐—ฎ๐—ฟ๐—ด๐—ฒ ๐—Ÿ๐—ฎ๐—ป๐—ด๐˜‚๐—ฎ๐—ด๐—ฒ ๐— ๐—ผ๐—ฑ๐—ฒ๐—น ๐—œ๐—ป๐—ณ๐—ฒ๐—ฟ๐—ฒ๐—ป๐—ฐ๐—ฒ ๐—ผ๐—ป ๐—ฎ ๐—ฆ๐—บ๐—ฎ๐—ฟ๐˜๐—ฝ๐—ต๐—ผ๐—ป๐—ฒ (June 10, 2024): PowerInfer-2 runs the TurboSparse-Mixtral-47B model on a smartphone at 11.68 tokens/sec, achieving up to 29.2ร— speedup over existing
1
0
0
@hodlenx
Holden
2 years
🔑 Model sparsity is the key to PowerInfer-2, and TurboSparse makes it possible. We've pushed the FFN sparsity of Mistral and Mixtral to 90% and 97%, with even higher performance. Dive into the details at https://t.co/ChsDCZyxgI and get the models today: https://t.co/us1sgubiip.
2
0
13
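The 90% and 97% figures above refer to the fraction of FFN intermediate activations that are exactly zero for a given token. A minimal sketch of how that could be measured on a ReLU-gated MLP in PyTorch; the function name, weight layout, and toy shapes below are illustrative assumptions, not the TurboSparse code:

import torch

def ffn_activation_sparsity(x: torch.Tensor,
                            w_gate: torch.Tensor,
                            w_up: torch.Tensor) -> float:
    """Fraction of FFN intermediate neurons that are exactly zero.

    Assumes a dReLU-style MLP in which both the gate and up branches pass
    through ReLU, so a neuron contributes nothing whenever either branch
    is non-positive. Shapes: x [tokens, d_model], w_gate/w_up [d_model, d_ffn].
    """
    gate = torch.relu(x @ w_gate)
    up = torch.relu(x @ w_up)
    intermediate = gate * up          # [tokens, d_ffn]
    return (intermediate == 0).float().mean().item()

# Toy check: with random weights each branch zeroes about half its neurons,
# giving roughly 75% sparsity; the reported 90%/97% comes from models
# trained for sparsity, not from random initialization.
x = torch.randn(8, 64)
w_gate, w_up = torch.randn(64, 256), torch.randn(64, 256)
print(f"FFN sparsity: {ffn_activation_sparsity(x, w_gate, w_up):.1%}")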
@rohanpaul_ai
Rohan Paul
2 years
Decoding speeds of PowerInfer-2, llama.cpp, and MLC-LLM on TurboSparse-Mistral-7B with different offloading setups. "50% offload" means 50% of the FFN block weights are offloaded to flash storage. "No offload" means all model parameters are resident in memory. A red label of
0
2
9
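As a rough illustration of what those offloading setups mean for memory, here is a back-of-the-envelope sketch; the int4 byte size and the assumption that roughly 70% of a 7B model's parameters sit in FFN blocks are my own estimates, not figures from the paper:

def resident_ffn_gb(ffn_params: float, offload_frac: float,
                    bytes_per_param: float = 0.5) -> float:
    """GB of FFN weights kept in DRAM; 0.5 bytes/param approximates int4."""
    return ffn_params * (1.0 - offload_frac) * bytes_per_param / 1e9

# Assumed: ~70% of a 7B-parameter model's weights live in FFN blocks.
ffn_params = 0.7 * 7e9
for frac in (0.0, 0.5):
    print(f"{frac:.0%} offload -> ~{resident_ffn_gb(ffn_params, frac):.1f} GB resident")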
@rohanpaul_ai
Rohan Paul
1 year
Model sparsity is the key to PowerInfer-2, and TurboSparse is what makes it possible to run such a huge model on a mobile phone. According to the PowerInfer-2 paper, they have pushed the FFN sparsity of Mistral and Mixtral to 90% and 97%, with even higher performance.
0
0
11
@rohanpaul_ai
Rohan Paul
1 year
In their paper, the researchers introduced 2 models: TurboSparse-Mistral-7B and TurboSparse-Mixtral-47B. 👨‍🔧 These models are sparsified versions of Mistral and Mixtral, respectively, ensuring not only enhanced model performance but also higher predictable sparsity. Notably,
0
0
6
@managetech_inc
Managetech inc.
2 years
This AI paper from China proposes a new dReLU-based sparsification method that raises model sparsity to 90% while maintaining performance, delivering 2-5x faster inference - MarkTechPost #LLMs #ConditionalComputation #SparsityEfficiency #TurboSparse https://t.co/U9VasCDObB
0
0
0
@hodlenx
Holden
2 years
@IlyasHairline @wey_gu If the foundation models use these activation functions and are pretrained from scratch, it would be ideal. We have demonstrated their negligible loss/perplexity compared to SwiGLU, but the trained model exhibited very sparse FFNs in the TurboSparse paper and
0
0
2
@agentprompt
Initial J
2 years
🚀 Introducing PowerInfer-2 from SJTU-IPADS Labs! Revolutionary LLM inference engine for mobile devices delivers a 47B model with a 29x speedup on smartphones! 🔍 Discover the innovations: heterogeneous computing, I/O-Compute pipelining, and TurboSparse with up to 97% sparsity!
@hodlenx
Holden
2 years
🚀 Excited to introduce PowerInfer-2: A game-changing LLM inference engine for mobile devices by the #PowerInfer team. It smoothly runs a 47B model with a staggering 29x speedup on smartphones! Watch our demo to see it in action! 🎥 Technical details at: https://t.co/7bx5EnzWCs
4
0
0
@hodlenx
Holden
2 years
@wey_gu The method we proposed in TurboSparse enables us to sparsify a mainstream foundation model within 150B tokens (5% of pretraining). The continued training of Mistral and Mixtral cost us less than $0.1M. We hope other researchers find it a nice trade 😎
0
1
2
@hodlenx
Holden
2 years
@wey_gu Those ground-breaking speedups are all based on intrinsic sparsity and depend on ReLU more or less. We have confirmed some alternatives, like ReLU^2 and dReLU (proposed in TurboSparse). They are very promising but not yet adopted by mainstream LLMs. Retraining is still essential
2
2
3
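For reference, a small sketch of the activation variants mentioned above, as I read them: the dReLU formulation follows the TurboSparse paper's description of applying ReLU to both FFN branches, ReLU^2 is an alternative from prior work, and the weight names are placeholders:

import torch
import torch.nn.functional as F

def swiglu(x, w_gate, w_up):
    # Standard gated FFN used by Mistral/Mixtral: SiLU(x W_gate) * (x W_up).
    return F.silu(x @ w_gate) * (x @ w_up)

def relu_squared(x, w):
    # ReLU^2: output and gradient are exactly zero for non-positive
    # pre-activations, which encourages sparse activations.
    return F.relu(x @ w) ** 2

def drelu(x, w_gate, w_up):
    # dReLU (per TurboSparse): ReLU on both the gate and the up branch,
    # so the product is zero whenever either side is <= 0, which is what
    # pushes FFN activation sparsity toward 90%+ after retraining.
    return F.relu(x @ w_gate) * F.relu(x @ w_up)

# Quick comparison on random inputs: dReLU is markedly sparser than SwiGLU.
x = torch.randn(2, 16)
w_g, w_u = torch.randn(16, 64), torch.randn(16, 64)
print((drelu(x, w_g, w_u) == 0).float().mean(), (swiglu(x, w_g, w_u) == 0).float().mean())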
@teortaxesTex
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
2 years
@KhonaMikail @suchenzang @shxf0072 Yeah, TurboSparse?
1
0
0
@teortaxesTex
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
2 years
@hodlenx This is a Turbosparse-Mixtral 47B-int4 demoed on that phone, right? Could you kindly provide the reference quantization? I don't see it, only half-precision weights
0
0
4