mehdi cherti Profile
mehdi cherti

@mehdidc

Followers: 344 · Following: 1K · Media: 38 · Statuses: 732

PostDoc at Jülich Supercomputing Center (JSC), Germany / LAION.

Cologne, Germany
Joined January 2010
@LoubnaBenAllal1
Loubna Ben Allal
11 days
After ~4 years building SOTA models & datasets, we're sharing everything we learned in ⚡The Smol Training Playbook We cover the full LLM cycle: designing ablations, choosing an architecture, curating data, post-training, and building solid infrastructure. We'll help you
35
158
1K
@wightmanr
Ross Wightman
5 months
timm's got a new vision transformer (NaFlexVit), and it's flexible! I've been plugging away at this for a bit, integrating ideas from FlexiViT, NaViT, and NaFlex and finally ready to merge for initial exploration. The model supports: * variable aspect/size images of NaFlex (see
5
38
232
@JJitsev
Jenia Jitsev 🏳️‍🌈 🇺🇦 🇮🇱
5 months
With great help by @gpuccetti92, @tommiekerssies and @rom1504 . More to come.
0
1
2
@JJitsev
Jenia Jitsev 🏳️‍🌈 🇺🇦 🇮🇱
5 months
When all of a sudden puzzle pieces fall into the right places and predicting the unknown starts to work, those are rare beautiful moments I am grateful for in science. Made by rare minds of @marnezhurina @tomerporian @mehdidc https://t.co/DiuR6g8pil https://t.co/snYXA7AcRg
arxiv.org
In studies of transferable learning, scaling laws are obtained for various important foundation models to predict their properties and performance at larger scales. We show here how scaling law...
1
10
25
@lschmidt3
Ludwig Schmidt
5 months
Very excited to finally release our paper for OpenThoughts! After DataComp and DCLM, this is the third large open dataset my group has been building in collaboration with the DataComp community. This time, the focus is on post-training, specifically reasoning data.
22
209
1K
@chelseabfinn
Chelsea Finn
7 months
Introducing π-0.5! The model works out of the box in completely new environments. Here the robot cleans new kitchens & bedrooms. 🤖 Detailed paper + videos in more than 10 unseen rooms: https://t.co/64ze7ToQup A short thread 🧵
15
101
703
@vishaal_urao
Vishaal Udandarao
7 months
🚀New Paper! https://t.co/cZZGbeVrgR Everyone’s celebrating rapid progress in math reasoning with RL/SFT. But how real is this progress? We re-evaluated recently released popular reasoning models—and found reported gains often vanish under rigorous testing!! 👀 🧵👇
4
55
265
@Thom_Wolf
Thomas Wolf
9 months
After 6+ months in the making and burning over a year of GPU compute time, we're super excited to finally release the "Ultra-Scale Playbook" Check it out here: https://t.co/mnC0UzZYsJ A free, open-source, book to learn everything about 5D parallelism, ZeRO, fast CUDA kernels,
110
706
4K
@Thom_Wolf
Thomas Wolf
9 months
Finally took time to go over Dario's essay on DeepSeek and export control and to be honest it was quite painful to read. And I say this as a great admirer of Anthropic and a big user of Claude* The first half of the essay reads like a lengthy attempt to justify that closed-source
109
495
3K
@vishalmisra
Vishal Misra
10 months
My thoughts on DeepSeek sent to a reporter asking for comments
19
95
514
@retropolisart
Retropolis
10 months
Directed by David Lynch.
154
13K
58K
@Variety
Variety
10 months
Director-writer David Lynch, who radicalized American film with a dark, surrealistic artistic vision in films like “Blue Velvet” and “Mulholland Drive” and network television with “Twin Peaks,” has died. He was 78. https://t.co/T2GOao28ux
864
16K
54K
@FrankRHutter
Frank Hutter
10 months
The data science revolution is getting closer. TabPFN v2 is published in Nature: https://t.co/Ybb15pnZ5P On tabular classification with up to 10k data points & 500 features, in 2.8s TabPFN on average outperforms all other methods, even when tuning them for up to 4 hours🧵1/19
36
251
1K
@SGRodriques
Sam Rodriques
1 year
Introducing PaperQA2, the first AI agent that conducts entire scientific literature reviews on its own. PaperQA2 is also the first agent to beat PhD and Postdoc-level biology researchers on multiple literature research tasks, as measured both by accuracy on objective benchmarks
79
767
3K
@vishaal_urao
Vishaal Udandarao
1 year
Ever feel frustrated when you vaguely know what paper you want to cite but can't find it on Google? Can LM-based agents automatically find paper citations for you? Our new paper presents a tough new benchmark for this task along with an LM-based agent for finding citations.
@ori_press
Ori Press
1 year
Can AI help you cite papers? We built the CiteME benchmark to answer that. Given the text: "We evaluate our model on [CITATION], a dataset consisting of black and white handwritten digits" The answer is: MNIST CiteME has 130 questions; our best agent gets just 35.3% acc (1/5)🧵
1
3
27
@tomerporian
Tomer Porian
1 year
🧵1/8 We resolve the discrepancy between the compute optimal scaling laws of Kaplan et al. (exponent 0.88, Figure 14, left) and Hoffmann et al. (“Chinchilla”, exponent 0.5). Paper: https://t.co/QKFbNl4J9t Data + Code: https://t.co/N3p0Xg0THH
6
33
171
@ShyamgopalKart1
Shyamgopal Karthik
1 year
Do you want to improve the performance of your text-to-image model without any training? That too by just looking for a better initialization noise? Sounds too good to be true? 🧵👇 https://t.co/b8jbASFiXE
@LucaEyring
Luca Eyring
1 year
Can we enhance the performance of T2I models without any fine-tuning? We show that with our ReNO, Reward-based Noise Optimization, one-step models consistently surpass the performance of all current open-source Text-to-Image models within the computational budget of 20-50 sec!
2
9
72
@_tim_brooks
Tim Brooks
2 years
"fly through tour of a museum with many paintings and sculptures and beautiful works of art in all styles" Video generated by #Sora
298
553
3K
@Rainmaker1973
Massimo
2 years
Beluga whales love to play, scare, joke and generally interact with humans. This compilation is a good example. [📹 aquariumadvicesa] https://t.co/P2ifmVPKqm
342
9K
102K
@agramfort
Alexandre Gramfort
2 years
For those of you who were wondering what I’ve been doing since I joined @Meta Reality Labs late 2022. Here is the first detailed scientific communication about our work. You can read the paper at:
biorxiv.org
Since the advent of computing, humans have sought computer input technologies that are expressive, intuitive, and universal. While diverse modalities have been developed, including keyboards, mice,...
@SussilloDavid
David Sussillo
2 years
1/7 For the past decade, our team at Meta Reality Labs (previously CTRL-labs) has been dedicated to developing a neuromotor interface. Our goal is to address the Human Computer Interaction challenge of providing effortless, intuitive, and efficient input to computers.
5
24
168