
Rabiul Awal
@_rabiulawal
Followers: 368 · Following: 17K · Media: 40 · Statuses: 1K
PhD student @Mila_Quebec. Generative models, world models.
Montréal, QC, CAN
Joined February 2015
🔥 New work on AR-only image editing with RL! AR models usually trail diffusion, but our AR+RL model (EARL) shows they can outperform the best diffusion baselines. We dive deep: SFT → RL → reasoning, showing a full pipeline for AR image editing. Details in the thread 👇
We built a new autoregressive + RL image editing model using a strong verifier, and it beats SOTA diffusion baselines using 5× less data. 🔥 EARL: a simple, scalable RL pipeline for high-quality, controllable edits. 🧵1/
RT @DimitrisPapail: Why is cross-entropy a good loss for language pretraining? Caveat: this is all known btw; interestingly, even though t…
RT @abeirami: The main ingredient that led to GRPO's performance leap is the calibration of the reward/value via multiple rollouts per prom…
RT @pcastr: Another great and thought-provoking keynote by @jpineau1 at @RL_Conference, questioning what we mean (and what we should mean)…
RT @_rockt: Harder, Better, Faster, Stronger, Real-time! We are excited to reveal Genie 3, our most capable real-time foundational world mo…
We didn’t have 3,000 authors like GDM, but we collaborated at full force! @Saba_A96 brought this tough project to life. @benno_krojer wrote my favorite intro ever. @aa_kazemnejad added the RL cherry on top, and it worked! @sikarwar_ank crushed it from ideation to execution.
RT @ZadaianchukML: 🚀 We’re excited to announce our #CoRL2025 workshop: Learning to Simulate Robot Worlds. Spanning high-fidelity simulators,…
RT @ClementDelangue: Every tech company can and should train their own deepseek R1, Llama or GPT5, just like every tech company writes thei…
RT @chelseabfinn: EXPO is a new method for RL post-training of diffusion policies. With good pre-training, it’s both really stable and ~2x…
RT @niloofar_mire: 🧵 Academic job market season is almost here! There's so much rarely discussed: nutrition, mental and physical health, unc…
RT @rmsnorm: 🤔 Why do we extract diffusion features from noisy images? Isn’t that destroying information? Yes, it is, but we found a way…
RT @lavoiems: DLCs enable exactly this. Images → sequences of discrete tokens via a Simplicial Embedding (SEM) encoder. We take the argmax…
RT @aliceoh: @ancadianadragan giving a really interesting keynote on developing robots that optimize for what humans want #ICML2025.
RT @_albertgu: I converted one of my favorite talks I've given over the past year into a blog post: "On the Tradeoffs of SSMs and Transfor…
RT @shaulneta: [1/n] New paper alert! 🚀 Excited to introduce Transition Matching (TM)! We're replacing short-timestep kernels from Flow Mat…
RT @benno_krojer: Started a new podcast with @tvergarabrowne! Behind the Research of AI: We look behind the scenes, beyond the polished…
RT @svlevine: If you have a policy that uses diffusion/flow (e.g. diffusion VLA), you can run RL where the actor chooses the noise, which i…