Explore tweets tagged as #TurboSparse
⭐️ From PowerInfer-2: Fast Large Language Model Inference on a Smartphone (June 10, 2024): PowerInfer-2 runs the TurboSparse-Mixtral-47B model on a smartphone at 11.68 tokens/sec, achieving up to 29.2× speedup over existing frameworks
Model sparsity is the key to PowerInfer-2, and TurboSparse makes it possible. We've pushed the FFN sparsity of Mistral and Mixtral to 90% and 97%, with even higher performance. Dive into the details at https://t.co/ChsDCZyxgI and get the models today: https://t.co/us1sgubiip.
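A minimal sketch of what the FFN sparsity figure above measures, assuming a dReLU-style gated FFN as described in the TurboSparse paper; the weight shapes are illustrative (Mistral-7B-like), and this is not code from the PowerInfer-2 or TurboSparse repositories:

import torch

def ffn_sparsity(hidden: torch.Tensor, w_gate: torch.Tensor,
                 w_up: torch.Tensor, eps: float = 1e-6) -> float:
    # Fraction of FFN intermediate neurons whose activation is ~0.
    # Exactly-zero activations let an inference engine skip the matching
    # rows of w_gate/w_up and columns of w_down at decode time.
    gate = torch.relu(hidden @ w_gate.T)   # dReLU applies ReLU to the gate branch
    up = torch.relu(hidden @ w_up.T)       # ...and to the up branch as well
    inter = gate * up                      # elementwise gating
    return (inter.abs() <= eps).float().mean().item()

# Toy usage: random weights give only ~75% zeros; the ~90%/97% figures
# quoted above come from trained TurboSparse models.
h = torch.randn(4, 4096)
print(ffn_sparsity(h, torch.randn(14336, 4096), torch.randn(14336, 4096)))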
Decoding speeds of PowerInfer-2, llama.cpp, and MLC-LLM on TurboSparse-Mistral-7B with different offloading setups. "50% offload" means 50% of the FFN blocks' weights are offloaded to flash storage. "No offload" means all model parameters are resident in memory. A red label of
Model sparsity is the key to PowerInfer-2, and TurboSparse makes it possible to run such a huge model on a mobile phone. According to the PowerInfer-2 paper, they have pushed the FFN sparsity of Mistral and Mixtral to 90% and 97%, with even higher performance.
In their paper, the researchers introduced two models: TurboSparse-Mistral-7B and TurboSparse-Mixtral-47B. These models are sparsified versions of Mistral and Mixtral, respectively, ensuring not only enhanced model performance but also higher predictable sparsity. Notably,
This AI paper from China proposes a new dReLU-based sparsification method that raises model sparsity to 90% while maintaining performance, achieving a 2-5x inference speedup - MarkTechPost #LLMs #ConditionalComputation #SparsityEfficiency #TurboSparse
https://t.co/U9VasCDObB
@IlyasHairline @wey_gu If the foundation models used these activation functions and were pretrained from scratch, it would be ideal. We have demonstrated their negligible loss/perplexity gap compared to SwiGLU, and the trained models exhibited very sparse FFNs in the TurboSparse paper and
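Since the reply above contrasts dReLU with SwiGLU, a side-by-side sketch may help. It follows the dReLU formulation from the TurboSparse paper (ReLU applied to both the gate and up projections) next to a standard SwiGLU block; this is an illustrative reconstruction, not the authors' code:

import torch
import torch.nn.functional as F

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU: SiLU(x W_gate) * (x W_up), then a down-projection.
    # Intermediate values are rarely exactly zero, so there is little
    # activation sparsity to exploit.
    return (F.silu(x @ w_gate.T) * (x @ w_up.T)) @ w_down.T

def drelu_ffn(x, w_gate, w_up, w_down):
    # dReLU: ReLU on both branches. A neuron zeroed in either branch
    # contributes nothing, which predictors and weight offloading exploit.
    return (F.relu(x @ w_gate.T) * F.relu(x @ w_up.T)) @ w_down.T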
Introducing PowerInfer-2 from SJTU-IPADS Labs! Revolutionary LLM inference engine for mobile devices delivers a 47B model with a 29x speedup on smartphones! Discover the innovations: heterogeneous computing, I/O-Compute pipelining, and TurboSparse with up to 97% sparsity!
Excited to introduce PowerInfer-2: A game-changing LLM inference engine for mobile devices by the #PowerInfer team. It smoothly runs a 47B model with a staggering 29x speedup on smartphones! Watch our demo to see it in action! Technical details at: https://t.co/7bx5EnzWCs
@wey_gu Those ground-breaking speedups are all based on intrinsic sparsity and depend on ReLU more or less. We have confirmed some alternatives, like ReLU^2 and the dReLU proposed in TurboSparse. They are very promising but not yet adopted by mainstream LLMs. Retraining is still essential
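ReLU^2 (squared ReLU), mentioned in the reply above as another sparsity-friendly alternative, is simple enough to state inline; a hedged one-liner in the same PyTorch style as the earlier sketches:

import torch

def relu_squared(x: torch.Tensor) -> torch.Tensor:
    # ReLU^2 keeps the exact zeros of ReLU (hence the sparsity) while
    # squaring the positive part; it has been explored as a drop-in
    # activation in some LLM work.
    return torch.relu(x) ** 2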
@hodlenx This is TurboSparse-Mixtral-47B in int4 demoed on that phone, right? Could you kindly provide a reference for the quantization? I don't see it, only half-precision weights