Zach Mueller

@TheZachMueller

Followers
13K
Following
46K
Media
2K
Statuses
20K

Hardware nerd. Usually yelling at NCCL over things

Baltimore, MD
Joined April 2016
@allen_ai
Ai2
18 hours
Introducing OlmoEarth 🌍, state-of-the-art AI foundation models paired with ready-to-use open infrastructure to turn Earth data into clear, up-to-date insights within hours—not years.
10
58
263
@StasBekman
Stas Bekman
15 hours
Here is an excellent article that explains the differences between Context Parallelism (Ring Attention) and Ulysses Sequence Parallelism (head parallelism) and how the 2 can be combined together for a 2D CP+SP https://t.co/GJT6OuhEUJ
1
14
119
@TheZachMueller
Zach Mueller
21 hours
The vibe of last cohort’s compute credits
1
0
18
@_lewtun
Lewis Tunstall
1 day
In the Smol Training Playbook, I tried to survey the state of popular post-training frameworks. Let me know if I missed any and I'll add them to the list!
18
13
180
@TheZachMueller
Zach Mueller
1 day
Ah yes let’s add a 22TB HDD into… here… oh no
9
0
33
@m_sirovatka
Matej Sirovatka
2 days
It's that time of the year again and we're coming with another @GPU_MODE competition! This time in collaboration with @nvidia focused on NVFP4. Focused on NVFP4 and B200 GPUs (thanks to @sestercegroup ) we'll release 4 problems over the following 3 months: 1. NVFP4 Batched GEMV
7
11
173
@TheZachMueller
Zach Mueller
2 days
This might be the first KVM I like. Switches between all 3 OSes (macOS, Windows, Linux) on separate machines without bugs/issues and can do full display resolutions (144 Hz via DP etc.) smoothly. I’ve tried a few and am thoroughly impressed. I’ll have a link below (not affiliated)
2
1
13
@TheZachMueller
Zach Mueller
3 days
Related goal this week to… news 👀 Going to see if I can’t manage to get ~30Gbps purely off 2 USB-C -> Ethernet + 2 10GbE ports. Wish me luck
0
0
8
@TheZachMueller
Zach Mueller
3 days
General question: how many TB of storage before you consider yourself a data center? Asking for a friend
3
0
11
@TheZachMueller
Zach Mueller
4 days
Workhorse will hit its final form this week. More news soon 👀 (And become much less of an abomination™)
1
0
6
@TheZachMueller
Zach Mueller
4 days
It's the FINAL DAY to sign up for the last cohort of Scratch to Scale! Admission ends at midnight EST, join now while you can: https://t.co/zf7HL3Co8y
0
4
6
@Alibaba_Qwen
Qwen
4 days
🎉 Qwen3-VL is now available on llama.cpp! Run this powerful vision-language model directly on your personal devices—fully supported on CPU, CUDA, Metal, Vulkan, and other backends. We’ve also released GGUF weights for all variants—from 2B up to 235B. Download and enjoy! 🚀 🤗
huggingface.co
45
203
1K
@code_star
Cody Blakeney
5 days
@DimitrisPapail
Dimitris Papailiopoulos
5 days
@code_star CONVERT YOUR CODEBASES TO REAL NUMBERS, I REPEAT REAL NUMBERS
90
484
6K
@natolambert
Nathan Lambert
5 days
I'm convinced to try it asap, we should all try fp16, look at this plot man. FP16 is like perfect in error reduction. "This is precisely why switching to FP16 provides a fundamental solution. With its 10 mantissa bits, FP16 offers 8 times more precision (2^10 values vs. 2^7
25
43
653
@tenderizzation
tender
5 days
every decommissioned V100 coming out of retirement after hearing that the future of RL is fp16
11
27
441
@mervenoyann
merve
5 days
if you want to contribute to open-source but don't know where to begin and just want to use AI, please avoid writing AI comments on GH repository issues. not only are you taking maintainers' time, but it's also misleading for other devs. same goes for PRs, most
11
8
158
@xariusrke
xr-5 🐀
5 days
some personal news: i recently joined @NousResearch. excited to learn and do cool stuff with very smart people
22
6
149
@ClementDelangue
clem 🤗
6 days
Happy Halloween from Reachy Mini! You'll be able to 3D print these skins at home thanks to open-source
31
71
599
@TheZachMueller
Zach Mueller
6 days
Elie, our 🐐, wrote a banger with other HF folks. I beg, please read if nothing else. If faced with my course and this, please go read this.
@eliebakouch
elie
6 days
Training LLMs end to end is hard. Very excited to share our new blog (book?) that covers the full pipeline: pre-training, post-training and infra. 200+ pages of what worked, what didn’t, and how to make it run reliably https://t.co/iN2JtWhn23
2
2
66