Saba

@Saba_A96

Followers: 133 · Following: 575 · Media: 8 · Statuses: 141

MSc @Mila_Quebec and @UMontrealDIRO

Joined November 2020
@kylelostat
Kyle Lo
4 days
why intern at Ai2? interns own major parts of our model development, sometimes even leading whole projects. we're committed to open science & actively help our interns publish their work. reach out if u wanna build open language models together. links below
14
45
697
@a_kazemnejad
Amirhossein Kazemnejad
6 days
After nearly 3 years since our NeurIPS paper, SOTA architectures are now adopting NoPE. Kimi Linear uses NoPE for all full-attention layers (not a RoPE hybrid).
@rohanpaul_ai
Rohan Paul
8 days
The brilliant Kimi Linear paper: a hybrid attention that beats full attention while cutting the key-value cache by up to 75% and keeping decoding up to 6x faster at 1M-token context.
7
34
371
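A rough back-of-envelope sketch of where a ~75% KV-cache cut can come from in a hybrid design (my own illustration, not the Kimi Linear paper's accounting; the layer count, head count, head dimension, and the 1-in-4 full-attention ratio are all assumptions): linear-attention layers carry a fixed-size state, so only the full-attention layers need a KV cache that grows with sequence length.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per=2):
    """Bytes of KV cache: keys + values (factor 2) for every cached layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per

# hypothetical model: 32 layers, 8 KV heads, head dim 128, 1M-token context
full = kv_cache_bytes(32, 8, 128, 1_000_000)

# hybrid: only 1 in 4 layers keeps a growing KV cache;
# the linear-attention layers hold a constant-size state instead
hybrid = kv_cache_bytes(32 // 4, 8, 128, 1_000_000)

print(1 - hybrid / full)  # 0.75
```

The reduction is exactly the fraction of layers that drop their growing cache, independent of the other dimensions.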
@rohanpaul_ai
Rohan Paul
8 days
Stanford just published a huge 470-page study, "The Principles of Diffusion Models". It explains how diffusion models turn noise into data and ties their main ideas together: it starts from a forward process that adds noise over time, then learns the exact reverse.
13
184
1K
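The forward process the tweet mentions has a closed form that a few lines make concrete (a standard DDPM-style sketch with a toy linear beta schedule, not the survey's exact setup): noise is mixed in so that by the final step the signal coefficient is nearly zero.

```python
import numpy as np

def forward_diffuse(x0, t, alpha_bar):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = np.random.randn(*x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps, eps

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # linear schedule (common toy choice)
alpha_bar = np.cumprod(1.0 - betas)       # cumulative signal retention

x0 = np.random.randn(4, 8)                # stand-in for a data batch
xT, eps = forward_diffuse(x0, T - 1, alpha_bar)
# by t = T the signal coefficient sqrt(abar_T) is tiny: x_T is almost pure noise
print(np.sqrt(alpha_bar[-1]))
```

The "exact reverse" the model learns is then a denoiser trained to predict `eps` from `x_t` and `t`, which is omitted here.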
@Saba_A96
Saba
19 days
Delighted to share that my supervisor @aagrawalAA has been awarded the 2025 Mark Everingham Prize, one of the most prestigious honors in the field! Looking forward to seeing her work continue to inspire.
@aagrawalAA
Aishwarya Agrawal
19 days
I am quite excited to share that our efforts in organizing and running "The VQA series of challenges" have been recognized with the 2025 Mark Everingham Prize -- https://t.co/oZrGEBBU5B for "stimulating a new strand of vision and language research". Thank you to the PAMI TC
0
0
17
@oscmansan
Oscar Mañas @ ICCV
22 days
Attending @ICCVConference in Honolulu this week! I'll be presenting our work on multimodal reward-guided decoding. Come check it out on October 21 (morning), poster #122. If you're around, I'd love to connect and chat about multimodal models and real-time video generation!
@oscmansan
Oscar Mañas @ ICCV
3 months
I'm happy to share that our paper "Controlling Multimodal LLMs via Reward-guided Decoding" has been accepted to #ICCV2025! w/ @proceduralia, @koustuvsinha, @adri_romsor, @michal_drozdzal, and @aagrawalAA. Read more: https://t.co/wIRL9jsAr1 Here's what we did:
0
6
20
@aagrawalAA
Aishwarya Agrawal
21 days
I will be speaking about "Reasoning, data-efficiency and alignment in vision-language models" at the CLVL workshop tomorrow (Oct 20) at ICCV at 9:15am! So stop by if you are interested in these topics, or just want to learn about what my lab is up to! https://t.co/4DrufmJ03Z
@moElhoseiny
Mohamed Elhoseiny
23 days
CLVL 2025: Celebrating a Decade of Vision & Language Innovation! Join us for a reflection on a remarkable decade-long journey for the CLVL workshop series with an amazing set of speakers!
2
5
34
@Saba_A96
Saba
25 days
Exciting opportunity to work with @RajeswarSai on cutting-edge research in video modeling and multimodal reasoning! He's recruiting grad students. Don't miss it!
@RajeswarSai
Sai Rajeswar
25 days
I'm looking forward to co-supervising students in the upcoming academic year at Mila. There is much to explore in the space of action-conditioned video modeling and long-context multimodal reasoning. We are advancing this space, and if this aligns with your interests, please apply below.
0
0
4
@MAghajohari
Milad Aghajohari
1 month
Introducing linear scaling of reasoning: The Markovian Thinker. Reformulate RL so thinking scales with O(n) compute, not O(n^2), with O(1) memory, architecture-agnostic. Train R1-1.5B into a Markovian thinker with a 96K thought budget, ~2x accuracy.
14
202
918
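A toy way to see the claimed O(n) vs O(n^2) scaling (my own illustration; the chunk size and the token-level cost counting are assumptions, not the paper's exact formulation): attention over a length-n generation pays a quadratic total cost, but if reasoning is chunked into fixed-size windows with a bounded carried state, total cost grows linearly with n.

```python
def full_context_cost(n):
    """Token i attends to i previous tokens: sum_i i ~ n^2 / 2."""
    return n * (n + 1) // 2

def markovian_cost(n, chunk=8_192):
    """Context resets every `chunk` tokens, so each chunk pays at most a
    fixed quadratic cost; the total is linear in the number of chunks."""
    n_chunks, rem = divmod(n, chunk)
    return n_chunks * full_context_cost(chunk) + full_context_cost(rem)

# doubling n roughly quadruples full-context cost but exactly doubles
# the chunked (Markovian) cost
print(full_context_cost(196_608) / full_context_cost(98_304))  # ~4
print(markovian_cost(196_608) / markovian_cost(98_304))        # 2.0
```

This is why RL at million-token budgets stays tractable under the chunked formulation: the per-step attention window, and hence the memory, is bounded by the chunk size.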
@a_kazemnejad
Amirhossein Kazemnejad
1 month
It's clear next-gen reasoning LLMs will run for millions of tokens. RL at 1M needs ~100x more compute than at 128K. Our Markovian Thinking keeps compute scaling linear instead. Check out Milad's thread; some of my perspectives below:
@MAghajohari
Milad Aghajohari
1 month
Introducing linear scaling of reasoning: The Markovian Thinker. Reformulate RL so thinking scales with O(n) compute, not O(n^2), with O(1) memory, architecture-agnostic. Train R1-1.5B into a Markovian thinker with a 96K thought budget, ~2x accuracy.
18
94
898
@Saba_A96
Saba
2 months
Exciting news! Our work on "The Promise of RL for Autoregressive Image Editing" has been accepted at NeurIPS 2025! EARL: A simple, scalable RL pipeline for high-quality, controllable edits. Check out the project on GitHub: https://t.co/KpaXflG5uC
github.com: mair-lab/EARL - Editing with Autoregression and RL
@Saba_A96
Saba
3 months
We built a new autoregressive + RL image editing model using a strong verifier, and it beats SOTA diffusion baselines using 5x less data. EARL: a simple, scalable RL pipeline for high-quality, controllable edits. 1/
0
7
18
@_rabiulawal
Rabiul Awal
2 months
Exciting news! Our paper "WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation" is accepted for an oral presentation at EMNLP 2025! WebMMU addresses a critical gap in AI evaluation: how well can models understand and build websites? 1/n
2
18
24
@worldmodel_26
World Modeling Workshop 2026
2 months
Announcing the World Modeling Workshop 2026! When: Feb 4-6, 2026. Where: Mila (Montréal) + Online (free). What: Keynotes, Methods Deep Dive, and Tutorials. https://t.co/WukFtNON3o worldmodel.mila@gmail.com Details below:
6
57
242
@gspandana
Spandana Gella
2 months
Internship @ServiceNowRSRCH to build the next generation of computer-use agents that are safe and secure from malicious attacks. Focus on intervention strategies and defenses to make agents robust against unsafe behavior. Apply here:
0
29
35
@PShravannayak
P Shravan Nayak
3 months
A Hindu wedding without a sacred fire? A Chinese banquet with forks? Do text-to-image models meet cultural expectations, both explicitly stated and implicitly assumed? Excited to share our latest paper on evaluating cultural alignment in T2I models: https://t.co/UCcaWGtqNG
1
23
57
@Saba_A96
Saba
3 months
7/ Huge thanks to the amazing team and everyone who supported us along the way. Grateful for all the collaboration and effort! @_rabiulawal @sikarwar_ank @a_kazemnejad @OOOOLGAluo @joanrod_ai @RajeswarSai @sivareddyg @chrisjpal @benno_krojer @aagrawalAA
1
1
10
@Saba_A96
Saba
3 months
6/ Although diffusion-based approaches were previously seen as the dominant method for image editing, RL on AR models boosts performance, making them competitive with diffusion models while being more data-efficient. EARL shows AR+RL is a promising combination for image editing.
1
1
8
@Saba_A96
Saba
3 months
5/ Moreover, we conduct the first systematic analysis of SFT vs RL for image editing, showing RL post-training excels without paired data for complex edits (counting, spatial, and action changes). SFT alone is insufficient due to the lack of high-quality paired datasets.
1
1
8
@Saba_A96
Saba
3 months
4/ We also explored Chain-of-Thought (CoT) reasoning: adding explanations before edits. While CoT during SFT hurt performance (Emu3 isn't pretrained for reasoning), RL still improved some of these weaker reasoning SFT models.
1
1
8
@Saba_A96
Saba
3 months
3/ EARL combines:
- Autoregressive generation over discrete text+vision tokens (Emu3)
- GRPO for stable RL
- A QWEN-VL-72B verifier for reward
No denoising. No complex pipelines. Just a clean RL setup that works.
1
1
8
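The GRPO ingredient above can be sketched in a few lines (a minimal illustration of group-relative advantages, not the EARL training code; the reward values are made up): for each prompt, a group of candidate edits is sampled, each is scored by the verifier, and the group-normalized score serves as the advantage, so no learned value model is needed.

```python
import statistics

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages: z-score each reward within its group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# hypothetical verifier scores for one group of four sampled edits
advs = grpo_advantages([0.9, 0.4, 0.7, 0.2])
print([round(a, 3) for a in advs])
```

Edits scored above the group mean get positive advantages and are reinforced; those below are suppressed, which is what stabilizes RL without paired edit data.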