Siyan Zhao

@siyan_zhao

Followers: 3K · Following: 886 · Media: 45 · Statuses: 154

CS PhD student @UCLA | Bachelors @UofT EngSci | LLMs, generative models, decision-making

Los Angeles, CA
Joined January 2019
@siyan_zhao
Siyan Zhao
3 months
Introducing d1🚀 — the first framework that applies reinforcement learning to improve reasoning in masked diffusion LLMs (dLLMs). By combining masked SFT with a novel policy-gradient algorithm, d1 significantly boosts the performance of pretrained dLLMs like LLaDA.
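The d1 recipe pairs masked SFT with a group-relative policy-gradient update (diffu-GRPO, per the code-release tweet below). A minimal sketch of the GRPO-style reward normalization that such an update builds on — function names and the REINFORCE-style surrogate here are illustrative assumptions, not the released d1 API:

```python
# Hedged sketch: GRPO-style group-relative advantages and a simple
# policy-gradient surrogate. Illustrative only; d1's diffu-GRPO adapts
# this idea to masked diffusion LLMs in ways not shown here.
import math

def group_relative_advantages(rewards):
    """Normalize rewards within one group of sampled completions:
    A_i = (r_i - mean(r)) / (std(r) + eps)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    eps = 1e-8  # guards against zero std when all rewards are equal
    return [(r - mean) / (math.sqrt(var) + eps) for r in rewards]

def policy_gradient_loss(logprobs, advantages):
    """REINFORCE-style surrogate: minimize -mean(A_i * log p_i),
    which upweights completions with above-average reward."""
    n = len(logprobs)
    return -sum(a * lp for a, lp in zip(advantages, logprobs)) / n

# Example: 4 sampled answers to one prompt, binary correctness rewards.
rewards = [1.0, 0.0, 0.0, 1.0]
advs = group_relative_advantages(rewards)
# Correct answers get positive advantage, incorrect ones negative.
```

Normalizing within the group means no separate value network is needed: the other samples for the same prompt act as the baseline.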
@siyan_zhao
Siyan Zhao
14 days
RT @tungnd_13: 🚀 Introducing PhysiX: One of the first large-scale foundation models for physics simulations! PhysiX is a 4.5B parameter mo…
@siyan_zhao
Siyan Zhao
21 days
RT @sansa19739319: 🤖 Can diffusion models write code competitively? Excited to share our latest 7B coding diffusion LLM! 💻 With DiffuCoder, …
@siyan_zhao
Siyan Zhao
22 days
RT @li78658171: (1/6) Our work Reflect-DiT was accepted to #ICCV2025! Reflect-DiT allows the model to reflect on its past generations and t…
@siyan_zhao
Siyan Zhao
1 month
RT @Xinyu2ML: 🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware effici…
@siyan_zhao
Siyan Zhao
2 months
RT @hbXNov: 🧑‍🍳 Very excited to present LaViDa, one of the first diffusion language models for multimodal understanding! 🌟 Unlike autoregre…
@siyan_zhao
Siyan Zhao
2 months
RT @li78658171: 📢 (1/11) Diffusion LMs are fast and controllable at inference time! But why restrict such benefits to processing text data? …
@siyan_zhao
Siyan Zhao
3 months
RT @SonglinYang4: starting now
@siyan_zhao
Siyan Zhao
3 months
RT @ma_chang_nlp: We are kicking off a series of seminars at @hkunlp2020. @siyan_zhao will be giving a talk titled "d1: Scaling Reasoning…
@siyan_zhao
Siyan Zhao
3 months
RT @HuangZi71008374: 🔬 Check out our newest #ICML’25 spotlight paper GREAT, which revolutionizes GraphODEs with better generalization throug…
@siyan_zhao
Siyan Zhao
3 months
RT @qqyuzu: A nice and clean implementation based on huggingface TRL!
@siyan_zhao
Siyan Zhao
3 months
Our Diffu-GRPO and evaluation code is now released! Check it out at
@siyan_zhao
Siyan Zhao
3 months
RT @adityagrover_: Thank you @VentureBeat for covering our research on enhancing reasoning with diffusion LLMs using d1. Great collaborati…
@siyan_zhao
Siyan Zhao
3 months
RT @hbXNov: ✈️ I will be at @iclr_conf 🇸🇬 to present the following work on LLM reasoning, vision-language understanding, and LLM evaluatio…
@siyan_zhao
Siyan Zhao
3 months
RT @linkaixi: Attending #ICLR2025 from 4/23 to 4/28 & will present PrefEval ( discussing the performance of SoTA…
@siyan_zhao
Siyan Zhao
3 months
RT @tungnd_13: Large language models (LLMs) have been explored for optimization via prompting to evaluate or improve candidate solutions. H…
@siyan_zhao
Siyan Zhao
3 months
RT @ykilcher: 📅 Saturday Night Paper Discussion 📅 Join us tonight to talk about d1: Scaling Reasoning in Diffusion Large Language Models via…
@siyan_zhao
Siyan Zhao
3 months
RT @iScienceLuvr: d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning. "we propose d1, a framework to adapt…
@siyan_zhao
Siyan Zhao
3 months
8/n Find out more:
Paper link: (arxiv paper coming soon!)
Code repo: (SFT code open-sourced, RL code coming soon!)
Project Page:
Awesome collaboration with @DevaanshGupta1, @qqyuzu, @adityagrover_!
@siyan_zhao
Siyan Zhao
3 months
7/n d1-LLaDA achieves competitive performance relative to recent leading dLLMs and similar-sized autoregressive (AR) LLMs.