@Quantum_Stat
Followers: 2K · Following: 542 · Media: 334 · Statuses: 4K
Repos: https://t.co/3x0DctKuld
Joined March 2018
🚀🚀 Super excited to share the latest benchmark results for our quantized BGE models. A few weeks ago, these models were introduced with the aim of enhancing performance and efficiency for generating embeddings. And we've now conducted thorough comparisons between running
I love the #ChatGPT Cheat Sheet by Ricky Costa (@Quantum_Stat) which includes 🔹NLP Tasks 🔹Code 🔹Structured Output Styles 🔹Unstructured Output Styles 🔹Media Types 🔹Meta ChatGPT 🔹Expert Prompting Get your hands on this amazing resource at: https://t.co/Bg1roxcMFO
Check the image below for an example of what I'm discussing 👇 We are soon releasing a notebook with an end-to-end example for anyone to replicate the compressed bge models which achieve great accuracy results on the MTEB Leaderboard.
The .npz file is a dictionary, with keys mapping to input names in the ONNX spec and values as NumPy arrays filled with the data.
- Keep all data samples in a directory, usually named "data".
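The layout described in this thread can be sketched with plain NumPy. This is a minimal illustration, not Sparsify's own tooling: the input names and vocab size below are assumptions borrowed from a typical BERT-style encoder, and must be replaced with the actual input names of your ONNX model.

```python
import os
import numpy as np

# Hypothetical input names -- replace with the real input names
# declared by your exported ONNX model (e.g. a BGE encoder).
SEQ_LEN = 128

os.makedirs("data", exist_ok=True)  # conventional directory name

# One .npz file per calibration sample: a dict of
# input name -> NumPy array, with NO batch dimension.
rng = np.random.default_rng(0)
for i in range(3):  # three toy samples
    sample = {
        "input_ids": rng.integers(0, 30522, size=(SEQ_LEN,), dtype=np.int64),
        "attention_mask": np.ones((SEQ_LEN,), dtype=np.int64),
        "token_type_ids": np.zeros((SEQ_LEN,), dtype=np.int64),
    }
    np.savez(os.path.join("data", f"inp_{i:04d}.npz"), **sample)

# Reading one back shows the dictionary structure.
loaded = np.load("data/inp_0000.npz")
print(sorted(loaded.files))       # the ONNX input names
print(loaded["input_ids"].shape)  # (128,) -- no batch dimension
```

Each file holds exactly one sample, so the calibration loader can stream the "data" directory file by file.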
📦 The NPZ files will be used by Sparsify to calibrate the ONNX model using samples from a calibration dataset. 🔍 **Specifications**:
- Each .npz file houses a single data sample, with no batch dimension; that sample is fed straight through the ONNX model.
⚡Getting to Know the NPZ file format to Compress BGE Embedding Models ⚡ For One-Shot Quantization (INT8), Sparsify relies on the .npz format for data storage, a file format rooted in the mighty NumPy library.
source library Sparsify! Not only is it ONNX and INT8 quantized (faster and lighter), but it can also run on CPUs using DeepSparse! 💥 cc @neuralmagic Model: huggingface.co
⚡IT HAPPENED!⚡ There's a new state-of-the-art sentence embeddings model for the semantic textual similarity task on Hugging Face's MTEB leaderboard 🤗! Bge-large-en-v1.5-quant was the model I quantized in less than an hour using a single CLI command using Neural Magic's open
Exciting News! 🚀 DeepSparse is now integrated with @LangChainAI , opening up a world of possibilities in Generative AI on CPUs. Langchain, known for its innovative design paradigms for large language model (LLM) applications, was often constrained by expensive APIs or cumbersome
🌟First, want to thank everyone for pushing this model past 1,000 downloads in only a few days!! Additionally, I added bge-base models to MTEB. Most importantly, code snippets were added for running inference in the model cards for everyone to try out! https://t.co/NZO7DPGubb
🚀🚀 Explore Sparsify's One-Shot Experiment Guide! Discover how to quickly optimize your models with post-training algorithms for a 3-5x speedup. Perfect for when you're short on time but still need faster inference. 🔥 **FYI, this is what I used to
🚀🚀 Hey, check out our blog on @huggingface 🤗regarding running LLMs on CPUs! The blog discusses how researchers at IST Austria & Neural Magic have cracked the code for fine-tuning large language models. The method, combining sparse fine-tuning and distillation-type losses,
🚀✨ Run CodeGen on CPUs with this detailed Colab notebook! 📝 Explore how to sparsify and perform Large Language Model (LLM) inference using Neural Magic's stack, featuring Salesforce/codegen-350M-mono as an example. Dive into these key steps: 1️⃣ **Installation**: Quickly set