Daniel Han

@danielhanchen

Followers: 25K · Following: 6K · Media: 320 · Statuses: 2K

Building @UnslothAI. Fine-tune & train LLMs faster. LLM bug hunter. OSS package https://t.co/aRyAAgKOR7. YC S24. Prev ML at NVIDIA. Hyperlearn used by NASA.

San Francisco
Joined April 2016
@danielhanchen
Daniel Han
5 months
We managed to fit Llama 3.1 8B in < 15GB with GRPO! Experience the R1 "aha moment" for free on Colab! Phi-4 14B also works with @UnslothAI, and vLLM is now integrated, allowing 20x faster inference! LoRA with GRPO also just works!
1. We removed double memory usage during vLLM serving.
@UnslothAI
Unsloth AI
5 months
You can now reproduce DeepSeek-R1's reasoning on your own local device! Experience the "Aha" moment with just 7GB VRAM. Unsloth reduces GRPO training memory use by 80%. 15GB VRAM can transform Llama-3.1 (8B) & Phi-4 (14B) into reasoning models. Blog:
58
286
2K
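As a rough illustration of why an 8B model can fit under 15GB for GRPO, here is a hypothetical back-of-envelope VRAM estimate. The byte counts and LoRA fraction are my own assumptions, not Unsloth's actual accounting, and activations plus KV cache are deliberately ignored:

```python
def vram_gb(params_b, weight_bits=4, lora_frac=0.01, optim_bytes=8):
    """Hypothetical VRAM estimate (GB) for QLoRA-style GRPO training.

    params_b: model size in billions of parameters.
    Assumes 4-bit quantized base weights shared with the serving engine
    (no double copy), fp16 LoRA adapters plus Adam-style optimizer state.
    """
    base_weights = params_b * weight_bits / 8          # 4-bit base model
    lora = params_b * lora_frac * (2 + optim_bytes)    # adapters + optimizer
    return base_weights + lora

print(round(vram_gb(8), 2))  # 4.8 GB before activations and KV cache
```

Even with generous headroom for activations and the vLLM KV cache, this plausibly lands under 15GB; keeping a single copy of the weights between training and serving (the "removed double memory usage" point above) is what stops the serving side from costing another ~4GB.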
@danielhanchen
Daniel Han
5 days
You can utilize our Gemma 3n multimodal fine-tuning Kaggle notebook for any submission to the $150,000 challenge! The $10,000 is specifically for the Unsloth track - but you can submit it for the main track as well! Kaggle notebook:
@UnslothAI
Unsloth AI
5 days
We’ve teamed up with @GoogleDeepMind for a challenge with a $10,000 Unsloth prize! 🦥 Show off your best fine-tuned Gemma 3n model using Unsloth, optimized for an impactful task. The entire hackathon has $150,000 in prizes to be won! Kaggle notebook:
Tweet media one
2
13
85
@danielhanchen
Daniel Han
5 days
RT @DynamicWebPaige: 🦥 Fine-tuning with @UnslothAI now supports Gemma 3n! ✨ Friendly reminder: the Gemma 3n models can understand not just…
0
15
0
@danielhanchen
Daniel Han
6 days
Gemma 3N quirks!
1. Vision NaNs on float16.
2. Conv2D weights are large; FP16 overflows to infinity.
3. Large activations fixed vs Gemma 3.
4. 6-7 training losses: normal for multimodal?
5. Large numbers in msfa_ffn_pw_proj.
6. NaNs fixed in @UnslothAI.
Details:
@UnslothAI
Unsloth AI
6 days
You can now fine-tune Gemma 3n for free with our notebook! Unsloth makes Google Gemma training 1.5x faster with 50% less VRAM and 5x longer context lengths - with no accuracy loss. Guide: GitHub: Colab:
9
33
297
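Quirk 2 above (large Conv2D weights overflowing to infinity in FP16) is easy to reproduce numerically: float16's largest finite value is 65504, so any weight beyond that becomes inf on cast. A minimal sketch - the helper name is mine, not Unsloth's:

```python
import numpy as np

FP16_MAX = 65504.0  # largest finite float16 value

def overflows_fp16(weights):
    """True if casting these fp32 values to fp16 produces any inf."""
    w16 = np.asarray(weights, dtype=np.float32).astype(np.float16)
    return bool(np.isinf(w16).any())

print(overflows_fp16([0.1, -3.0, 65504.0]))   # False: all representable
print(overflows_fp16([70000.0, 0.5]))         # True: 70000 > 65504
```

This is why such layers are typically kept in float32 or bfloat16: bfloat16 shares float32's exponent range, so the same magnitudes do not overflow.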
@danielhanchen
Daniel Han
10 days
Huge thanks to everyone who attended our @Google & @UnslothAI Gemma developer meetup yesterday! 🦥 Was amazing meeting you all & thank you to @blueviggen for hosting the event with us. Thank you to the Google speakers: @DynamicWebPaige, Doug Reid, @imayank42, @GrmCameron and of
3
6
84
@danielhanchen
Daniel Han
10 days
RT @DynamicWebPaige: 💎 Celebrating the official release of Gemma 3n with the inaugural Gemma Community meetup at @Google San Francisco, coh…
0
3
0
@danielhanchen
Daniel Han
12 days
Excited to see you all tomorrow for our Google Gemma & Unsloth developer meetup! 🦥 We'll be having @Grmcameron from @ArtificialAnlys and @DynamicWebPaige & more amazing talks! The location has been updated, so please check, and if you need help please DM me!
0
3
25
@danielhanchen
Daniel Han
13 days
r/LocalLlama is back!!
@danielhanchen
Daniel Han
13 days
We need r/LocalLlama back :( Hopefully a good neutral moderator takes the reins asap!
5
5
84
@danielhanchen
Daniel Han
13 days
We need r/LocalLlama back :( Hopefully a good neutral moderator takes the reins asap!
25
8
187
@danielhanchen
Daniel Han
16 days
Managed to mostly fix Mistral 3.2 tool calling for GGUF / transformers!
1. 3.2 tool calling is different from 3.1.
2. timedelta(days=1) (yesterday) replaced with an if-else - supports 2024 to 2028 dates - so the system prompt is now word-for-word the same!
3. Made an experimental FP8 quant as well!
@UnslothAI
Unsloth AI
16 days
Mistral releases Small 3.2 (24B), a new update to their 3.1 model. 🔥 The model performs much better on 5-shot MMLU (CoT), instruction following and function/tool calling! Run locally with FP8 or 16GB RAM using our Dynamic GGUFs with a fixed chat template:
6
5
76
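Point 2 of the fix above replaces a runtime timedelta(days=1) computation with explicit branching valid for 2024-2028. As a guess at the shape of that logic (purely illustrative; Mistral's actual chat-template code may differ), "yesterday" can be computed with plain if-else and a month-length table - the simple `year % 4` leap rule is safe because every year in 2024-2028 follows it:

```python
def yesterday(y, m, d):
    """Date of the previous day, valid for years 2024-2028 (illustrative)."""
    if d > 1:
        return (y, m, d - 1)
    if m == 1:                       # Jan 1 -> Dec 31 of the previous year
        return (y - 1, 12, 31)
    # Days in each month; Feb via the simple leap rule (exact within 2024-2028).
    days = [31, 29 if y % 4 == 0 else 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return (y, m - 1, days[m - 2])

print(yesterday(2025, 3, 1))   # (2025, 2, 28)
print(yesterday(2024, 3, 1))   # (2024, 2, 29) - 2024 is a leap year
```

The point of unrolling the logic this way is reproducibility: the rendered system prompt no longer depends on a datetime library being available in the template engine, so GGUF and transformers produce a word-for-word identical prompt.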
@danielhanchen
Daniel Han
17 days
RT @dvilasuero: New tutorial: how to build a synthetic dataset with recent information and use it to fine-tune with @UnslothAI. Check out…
0
6
0
@danielhanchen
Daniel Han
18 days
@Google @GoogleDeepMind RSVP at Lightning talk proposals should be 1 or no slides - they can be about Gemma, Unsloth, Gemini, RL or anything about open source AI!
1
0
8
@danielhanchen
Daniel Han
18 days
We're hosting an event on RL, GRPO, agents, LLM bugs & everything about Gemma on the 26th at @Google's SF office! There are 3 @GoogleDeepMind talks, special announcements & we're accepting 3-minute lightning talk proposals! Plus exclusive Unsloth merch! RSVP
@UnslothAI
Unsloth AI
18 days
We're teaming up with @Google for a Gemma developer meetup at Google's San Francisco office next Thursday, June 26! 🦥
• Join us & the Gemma team for live demos and talks
• Unsloth's new RL notebook & roadmap
• Q&A + merch from us all
RSVP required:
8
20
130
@danielhanchen
Daniel Han
19 days
All will use our Dynamic 2.5 methodology + our >1M-token high-quality calibration dataset. For FP8 above, there is "dynamic" (i.e. online) quantization and "static" (i.e. offline) quantization - we can also do both. We can also create all of the above, if people are interested.
2
0
8
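To make the dynamic/static distinction concrete, here is a toy sketch of absmax FP8 scaling (E4M3's largest finite value is 448). "Dynamic" derives the scale from each tensor at runtime; "static" reuses a scale precomputed offline from calibration data. Function names and the plain-list representation are mine, for illustration only:

```python
FP8_E4M3_MAX = 448.0  # largest finite value in FP8 E4M3

def dynamic_scale(xs):
    """'Dynamic' (online): derive the scale from the live tensor itself."""
    return max(abs(x) for x in xs) / FP8_E4M3_MAX

def quantize(xs, scale):
    """Scale and clamp into the representable FP8 range.

    With a 'static' (offline) scale from calibration, outliers larger
    than anything seen during calibration get clipped here.
    """
    lo, hi = -FP8_E4M3_MAX, FP8_E4M3_MAX
    return [min(max(x / scale, lo), hi) for x in xs]

xs = [1.0, -2.0, 4.0]
q = quantize(xs, dynamic_scale(xs))
print(max(abs(v) for v in q))   # close to 448.0: absmax maps to the FP8 limit
```

Dynamic scaling never clips but costs a reduction per tensor at inference time; static scaling is free at runtime but depends on the calibration set covering the activation range - which is where a large, high-quality calibration dataset matters.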
@danielhanchen
Daniel Han
19 days
We're working on Unsloth dynamic vLLM quants! Which formats are most interesting to you?
7
3
36
@danielhanchen
Daniel Han
24 days
RT @clattner_llvm: This is just me unapologetically nerd crushing on the @UnslothAI duo, legendary developers with a shared goal of democra…
0
23
0
@danielhanchen
Daniel Han
25 days
I'll be giving a talk on the 'Future of Reinforcement Learning and Training' at @AMD's 2025 Advancing AI event today! 👋 See you all at 2:25pm PT in Room 230A-C. Excited to chat and meet!
2
5
83
@danielhanchen
Daniel Han
26 days
Get 2x faster reward model serving and sequence classification inference through @UnslothAI! Nice benchmarks, Kyle!
@corbtt
Kyle Corbitt
26 days
RL twitter, did you know you can use @UnslothAI to serve your RM, and it has 2x the throughput of vLLM? I didn't either! Nice job @danielhanchen. cc @natolambert, whose GitHub issue comment prompted me to benchmark.
1
11
82
@danielhanchen
Daniel Han
26 days
Wow! @UnslothAI on the Nasdaq tower! 🦥 Thank you @Redpoint for naming Unsloth one of the top 100 most impactful and fastest-growing infra companies in their 2025 report. And it’s all thanks to you - the community! We truly appreciate it and couldn’t have done it without you all 🥰
16
9
244
@danielhanchen
Daniel Han
27 days
RT @reach_vb: Unsloth released optimised GGUFs for llama.cpp, LMStudio and Ollama as well 💥 Love the sheer execution speed of the communit…
0
36
0
@danielhanchen
Daniel Han
27 days
The Mistral team at it again with Magistral! GRPO with edits:
1. Removed KL divergence.
2. Normalize by total length (Dr. GRPO style).
3. Minibatch normalization for advantages.
4. Relaxed trust region.
Paper: Docs to run Magistral:
@MistralAI
Mistral AI
27 days
Announcing Magistral, our first reasoning model designed to excel in domain-specific, transparent, and multilingual reasoning.
9
99
681
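A toy numeric sketch of how I read edits 1-3 above (edit 4, the relaxed trust region, is a change to the PPO-style clipping range and isn't shown). This is my own illustrative restatement, not Mistral's implementation:

```python
def grpo_advantages(group_rewards):
    """Group-relative advantages with minibatch normalization.

    Edit 1: there is deliberately no KL-divergence penalty term anywhere.
    Edit 3: advantages are normalized over the minibatch (here, the group).
    """
    mu = sum(group_rewards) / len(group_rewards)
    adv = [r - mu for r in group_rewards]
    std = (sum(a * a for a in adv) / len(adv)) ** 0.5
    return [a / std if std > 0 else 0.0 for a in adv]

def grpo_loss(per_token_losses, total_tokens):
    """Edit 2 (Dr. GRPO style): divide by the total token count across the
    batch, rather than averaging each sequence by its own length."""
    return sum(per_token_losses) / total_tokens

print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```

Normalizing by total length removes the bias where long, low-per-token-loss sequences get the same weight as short ones, and dropping the KL term frees the policy to move further from the reference model.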