
Roberto López Castro
@RobertoL_Castro
Followers: 104 · Following: 140 · Media: 1 · Statuses: 48
Costa da Morte - A Coruña - Wien | Postdoc Researcher @ISTAustria | Residency @RedHat_AI
Joined February 2013
RT @DAlistarh: Announcing our early work on FP4 inference for LLMs!
- QuTLASS: low-precision kernel support for Blackwell GPUs
- FP-Quant: …
RT @DAlistarh: We are introducing Quartet, a fully FP4-native training method for Large Language Models, achieving optimal accuracy-efficie…
RT @DAlistarh: Introducing MoE-Quant, a fast version of GPTQ for MoEs, with:
- Optimized Triton kernels and expert & data parallelism
- Quant…
RT @DAlistarh: Our QuEST paper was selected for Oral Presentation at the ICLR @sparseLLMs workshop! QuEST is the first algorithm with Pareto-o…
github.com/IST-DASLab/QuEST
RT @spcl_eth: Yesterday, Jiale and Roberto presented the paper MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language….
RT @rush_tabesh: Happy to introduce #HALO, lower-precision fine-tuning for LLMs. With proper Hadamard transforms, #HALO enables accurate INT…
RT @AshkboosSaleh: Happy to release #HALO, a Hadamard-Assisted Lower-Precision scheme that enables INT8/FP6 full #finetuning (FFT) of LLMs. …
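The two #HALO retweets above refer to applying Hadamard transforms before low-precision arithmetic to spread outlier values evenly across coordinates. A minimal sketch of the normalized fast Walsh-Hadamard transform itself follows; the function name `fwht` is my own, and this shows only the transform, not HALO's actual fine-tuning scheme.

```python
import math

def fwht(x):
    """Normalized fast Walsh-Hadamard transform; len(x) must be a power of two.

    With orthonormal scaling, applying the transform twice recovers the input.
    """
    x = list(x)
    n = len(x)
    assert n and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly pass: combine pairs of entries h apart.
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                a, b = x[j], x[j + h]
                x[j], x[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in x]

v = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # a single outlier spike
spread = fwht(v)  # the spike's energy is spread evenly over all 8 coordinates
```

The point for quantization: a vector with one large outlier is hard to quantize with a single scale, but after the transform every coordinate has similar magnitude, and the transform can be inverted after the low-precision step.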
RT @RedHat: Today, Red Hat completed the acquisition of @NeuralMagic, a pioneer in software and algorithms that accelerate #GenAI inference….
RT @_EldarKurtic: 4) The 2:4 models are compatible with quantization. We apply GPTQ to quantize weights to 4-bit integers, but modify it su….
RT @_EldarKurtic: 2:4 Sparsity + @AIatMeta Llama-3.1: At @neuralmagic, we've developed a recipe to produce very competitive sparse LLMs, a….
RT @neuralmagic: Sparse-Marlin is here and integrated into @vllm_project! This GPU-optimized kernel accelerates matrix multiplication with….
RT @DAlistarh: Happy to release the write-up on the MARLIN kernel for fast LLM inference, now supporting 2:4 sparsity! Led by @elias_fran…
github.com/IST-DASLab/Sparse-Marlin: Boosting 4-bit inference kernels with 2:4 Sparsity
RT @BiblioInf_UDC: 📰 Now available on #RUC and #Zenodo: the work by the #GAC group at @FIC_UDC, "STuning-DL: Model-Driven Autotuning of Sparse GP…
RT @spcl_eth: Don't miss Roberto's talk at #SC23 tomorrow at 11:30AM as he unveils VENOM - a new sparse matrix format enabling arbitrary N:….
Join us at #SC23 on Nov 16 at 11:30AM as we unveil VENOM - a new sparse matrix format enabling arbitrary N:M patterns on Sparse Tensor Cores, which are natively restricted to 2:4. Plus, explore Spatha 🗡️, our sparse library for VENOM, achieving up to 37x speedup over SOTA dense methods.
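The N:M pattern the VENOM announcement refers to constrains every group of M consecutive weights to have at most N nonzeros; 2:4 (the case Sparse Tensor Cores support natively) is the special case N=2, M=4. A minimal magnitude-pruning sketch, with a hypothetical function name of my own, illustrates the constraint; it says nothing about VENOM's storage format or Spatha's kernels.

```python
def prune_n_m(weights, n=2, m=4):
    """Zero out all but the n largest-magnitude values in each group of m."""
    assert len(weights) % m == 0, "row length must be a multiple of m"
    out = []
    for i in range(0, len(weights), m):
        group = weights[i:i + m]
        # Indices of the n entries with largest |value| in this group.
        keep = set(sorted(range(m), key=lambda j: abs(group[j]),
                          reverse=True)[:n])
        out.extend(v if j in keep else 0.0 for j, v in enumerate(group))
    return out

row = [0.9, -0.1, 0.4, 0.05, -0.7, 0.2, 0.03, 0.6]
sparse_2_4 = prune_n_m(row, n=2, m=4)  # exactly 2 nonzeros per group of 4
```

Because the nonzero count per group is fixed, the hardware can store only the surviving values plus small per-group index metadata, which is what makes the format amenable to Tensor Core acceleration.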
RT @AiBreakfast: 🤯 Full body tracking now possible using only WiFi signals. A deep neural network maps the phase and amplitude of WiFi sign….