Shuvendu Roy

@ShuvenduBikash

Followers
94
Following
1K
Media
25
Statuses
1K

Generalizability in AI | Machine Learning Research Intern @RBCBorealis | Ph.D. Candidate (AI) @queensu | Former: Student Researcher @google; @VectorInst

Toronto, Canada
Joined December 2014
@ShuvenduBikash
Shuvendu Roy
23 hours
RT @omarsar0: A Deep Dive into RL for LLM Reasoning. Provides a roadmap for practitioners applying RL for LLM reasoning. Nice to have some….
0
99
0
@ShuvenduBikash
Shuvendu Roy
6 days
RT @omarsar0: Efficient Agents. This is a great study full of insights on how to build efficient agents. If you are looking to reduce cost….
0
188
0
@ShuvenduBikash
Shuvendu Roy
23 days
RT @TheAITimeline: 🚨This week's top AI/ML research papers: - Mixture-of-Recursions - Scaling Laws for Optimal Data Mixtures - Training Tra….
0
71
0
@ShuvenduBikash
Shuvendu Roy
29 days
RT @omarsar0: One Token to Fool LLM-as-a-Judge. Watch out for this one, devs! Semantically empty tokens, like “Thought process:”, “Solutio….
0
122
0
@ShuvenduBikash
Shuvendu Roy
1 month
RT @Kimi_Moonshot: 🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & Ace….
0
1K
0
@ShuvenduBikash
Shuvendu Roy
2 months
RT @aaronmakelky: @LoicReco Link to guide for those who don’t want screen shots:
0
8
0
@ShuvenduBikash
Shuvendu Roy
2 months
RT @omarsar0: Reinforcement Pre-Training. New pre-training paradigm for LLMs just landed on arXiv! It incentivises effective next-token re….
0
91
0
@ShuvenduBikash
Shuvendu Roy
3 months
RT @xuandongzhao: 🚀 Excited to share the most inspiring work I’ve been part of this year: "Learning to Reason without External Rewards"….
0
513
0
@ShuvenduBikash
Shuvendu Roy
3 months
RT @omarsar0: Understanding Reasoning Capabilities in LLMs. This report shares insights and tips for using reasoning models. Great read fo….
0
199
0
@ShuvenduBikash
Shuvendu Roy
4 months
RT @xhluca: DeepSeek-R1 Thoughtology: Let’s <think> about LLM reasoning. 142-page report diving into the reasoning chains of R1. It spans 9….
0
137
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @finbarrtimbers: a paper I like a lot is the "Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback"….
0
31
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @iScienceLuvr: Interesting, Microsoft recently published a short paper demonstrating RLVR (R1-style training) on medical QA datasets.….
0
44
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @iScienceLuvr: Scaling Laws of Synthetic Data for Language Models. "In this work, we systematically investigate the scaling laws of synt….
0
66
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @y0b1byte: Best explanation of what's going on under the hood in the v3/r1 release.
0
110
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @gm8xx8: DAPO: An Open-Source LLM Reinforcement Learning System at Scale. DAPO is a reinforcement learning algorithm for large-scale LLM….
0
55
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @cwolferesearch: Many recent frontier LLMs like Grok-3 and DeepSeek-R1 use a Mixture-of-Experts (MoE) architecture. To understand how it….
0
114
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @TXhunyuan: 🚀 Introducing Hunyuan-TurboS – the first ultra-large Hybrid-Transformer-Mamba MoE model! Traditional pure Transformer models….
0
222
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @_akhaliq: Alibaba just dropped R1-Omni. Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning
0
222
0
@ShuvenduBikash
Shuvendu Roy
5 months
RT @_akhaliq: Vision-R1. Incentivizing Reasoning Capability in Multimodal Large Language Models
0
88
0