
Roberto López Castro
@RobertoL_Castro
Followers: 104 · Following: 140 · Media: 1 · Statuses: 48
Costa da Morte - A Coruña - Wien | Postdoc Researcher @ISTAustria | Residency @RedHat_AI
Joined February 2013
RT @DAlistarh: Announcing our early work on FP4 inference for LLMs!
- QuTLASS: low-precision kernel support for Blackwell GPUs
- FP-Quant: …
RT @DAlistarh: We are introducing Quartet, a fully FP4-native training method for Large Language Models, achieving optimal accuracy-efficie…
RT @DAlistarh: Introducing MoE-Quant, a fast version of GPTQ for MoEs, with:
- Optimized Triton kernels and expert & data parallelism
- Quant…
RT @DAlistarh: Our QuEST paper was selected for Oral Presentation at the ICLR @sparseLLMs workshop! QuEST is the first algorithm with Pareto-o…
github.com/IST-DASLab/QuEST
RT @spcl_eth: Yesterday, Jiale and Roberto presented the paper MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language….
RT @rush_tabesh: Happy to introduce #HALO, lower-precision fine-tuning for LLMs. With proper Hadamard transforms, #HALO enables accurate INT…
RT @AshkboosSaleh: Happy to release #HALO, a Hadamard-Assisted Lower-Precision scheme that enables INT8/FP6 full #finetuning (FFT) of LLMs. …
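The two #HALO retweets above refer to applying Hadamard transforms before low-precision arithmetic to spread outlier values evenly across coordinates. A minimal sketch of the normalized fast Walsh-Hadamard transform itself follows; the function name `fwht` is my own, and this shows only the transform, not HALO's actual fine-tuning scheme.

```python
import math

def fwht(x):
    """Normalized fast Walsh-Hadamard transform; len(x) must be a power of two.

    With orthonormal scaling, applying the transform twice recovers the input.
    """
    x = list(x)
    n = len(x)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly pass: combine pairs of entries h apart.
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in x]

v = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # a single outlier spike
spread = fwht(v)  # the spike's energy is spread evenly over all 8 coordinates
```

The point for quantization: a vector with one large outlier is hard to quantize with a single scale, but after the transform every coordinate has similar magnitude, and the transform can be inverted after the low-precision step.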
RT @RedHat: Today, Red Hat completed the acquisition of @NeuralMagic, a pioneer in software and algorithms that accelerate #GenAI inference….
RT @_EldarKurtic: 4) The 2:4 models are compatible with quantization. We apply GPTQ to quantize weights to 4-bit integers, but modify it su….
RT @_EldarKurtic: 2:4 Sparsity + @AIatMeta Llama-3.1: At @neuralmagic, we've developed a recipe to produce very competitive sparse LLMs, a….
RT @neuralmagic: Sparse-Marlin is here and integrated into @vllm_project! This GPU-optimized kernel accelerates matrix multiplication with….
RT @DAlistarh: Happy to release the write-up on the MARLIN kernel for fast LLM inference, now supporting 2:4 sparsity! Led by @elias_fran…
github.com/IST-DASLab/Sparse-Marlin: Boosting 4-bit inference kernels with 2:4 Sparsity
RT @BiblioInf_UDC: 📰 Now available on #RUC and #Zenodo: the work by the #GAC group at @FIC_UDC, "STuning-DL: Model-Driven Autotuning of Sparse GP…
RT @spcl_eth: Don't miss Roberto's talk at #SC23 tomorrow at 11:30AM as he unveils VENOM - a new sparse matrix format enabling arbitrary N:….
Join us at #SC23 on Nov 16 at 11:30AM as we unveil VENOM - a new sparse matrix format enabling arbitrary N:M patterns on Sparse Tensor Cores, which are natively restricted to 2:4. Plus, explore Spatha 🗡️, our sparse library for VENOM, achieving up to 37x speedup over SOTA dense methods.
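The N:M pattern the VENOM announcement refers to constrains every group of M consecutive weights to have at most N nonzeros; 2:4 (the case Sparse Tensor Cores support natively) is the special case N=2, M=4. A minimal magnitude-pruning sketch, with a hypothetical function name of my own, illustrates the constraint; it says nothing about VENOM's storage format or Spatha's kernels.

```python
def prune_n_m(weights, n=2, m=4):
    """Zero out all but the n largest-magnitude values in each group of m."""
    assert len(weights) % m == 0, "row length must be a multiple of m"
    out = []
    for i in range(0, len(weights), m):
        group = weights[i:i + m]
        # Indices of the n entries with largest |value| in this group.
        keep = set(sorted(range(m), key=lambda j: abs(group[j]),
                          reverse=True)[:n])
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

row = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.03, 0.6]
sparse_2_4 = prune_n_m(row, n=2, m=4)  # exactly 2 nonzeros per group of 4
```

Because the nonzero count per group is fixed, the hardware can store only the surviving values plus small per-group index metadata, which is what makes the format amenable to Tensor Core acceleration.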
RT @AiBreakfast: 🤯 Full body tracking now possible using only WiFi signals. A deep neural network maps the phase and amplitude of WiFi sign….