Siva Reddy
@sivareddyg
Followers: 6K
Following: 8K
Media: 119
Statuses: 2K
Assistant Professor @Mila_Quebec @McGillU @ServiceNowRSRCH; Postdoc @StanfordNLP; PhD @EdinburghNLP; Natural Language Processor #NLProc
Montreal, QC, Canada
Joined July 2009
1/5 🚀Apriel-1.6-15B-Thinker: a 15B multimodal reasoner scoring 57 on the Artificial Analysis Intelligence Index - approaching the performance of ~200B-scale frontier models while remaining an order of magnitude smaller. 🧠Model weights: https://t.co/GE22SOIBfT 📄Blog:
9
53
213
We are beyond thrilled to share our first flagship models, Rnj-1 base and instruct 8B parameter models. Rnj-1 is the culmination of 10 months of hard work by a phenomenal team, dedicated to advancing American SOTA OSS AI. Lots of wins with Rnj-1. 1. SWE bench performance close
Today, we’re excited to introduce Rnj-1, @essential_ai's first open model; a world-class 8B base + instruct pair, built with scientific rigor, intentional design, and a belief that the advancement and equitable distribution of AI depend on building in the open. We bring
101
171
2K
Introducing WebArena Verified — an audit of all 812 tasks with robust, offline, stack-agnostic eval, https://t.co/j0rJf1K7wL Noise 🚮 → stronger agents 📈, weaker 📉, verbose ones 📈 with JSON format. New: 📦 ~70% leaner Docker envs 🔥 Hard subset (258) for fast/focused evals
4
11
58
Thrilled to share that @annadgoldie and I are launching @RicursiveAI, a frontier lab enabling recursive self-improvement through AIs that design their own chips. Our vision for transforming chip design began with AlphaChip, an AI for layout optimization used to design four
wsj.com
Founded by ex-Google researchers, the company raised $35 million with backing from Sequoia to automate chip design.
Introducing Ricursive Intelligence, a frontier AI lab enabling a recursive self-improvement loop between AI and the chips that fuel it. Learn more at https://t.co/cSpbrQwwEn
123
137
1K
Excited to attend my first NeurIPS and present my work on multilingual routing in MoEs at @WiMLworkshop! If you’re interested in MoEs or multilinguality, I would love to chat. Feel free to DM!
Mila was proud to sponsor the 2025 @WiMLworkshop held on December 2 during the @NeurIPSConf in San Diego. As part of this initiative, we have provided a scholarship enabling four Mila PhD candidates to attend the Conference as well as the WiML workshop. This support aims to
0
2
16
LLM as a judge has become a dominant way to evaluate how good a model is at solving a task, since it works without a test set and handles cases where answers are not unique. But despite how widely this is used, almost all reported results are highly biased. Excited to share our
45
176
1K
Is there such a thing as too many agents in multi-agent systems? It depends! 🧵 Our work reveals 3 distinct regimes where communication patterns differ dramatically. More on our findings below 👇 (1/7)
1
11
29
🚀Introducing TMLR Beyond PDF! 🎬This is a new, HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images. 🎉Thanks to TMLR Editors in Chief @hugo_larochelle @thegautamkamath @NailaMurray Nihar B. Shah @lcharlin!
11
39
200
Life update: I moved to Silicon Valley to tackle agents' biggest challenges: plasticity and reliability. Today's agents are smart but brittle. They lack plasticity (continual learning and adaptation) and reliability (stable, predictable behavior with bounded failures). These two
40
43
421
I just published "AI Language Technologies are Powerful—But Not Without Limits" https://t.co/81ifJ4gnnK via @CUPAcademic
cambridgeblog.org
Imagine waking up in the morning. You read your emails with the morning coffee and use Gmail’s autocomplete feature to compile the answers. Before leaving the house, you ask Siri for the weather...
0
4
17
🚨 Coming up - Research Connections event on Wednesday, November 26th! Are you interested in building interpretable, trustworthy language models and user-centric AI? Or learning about the biases, safety and cultural alignment of language models? Come meet @lasha_nlp and
2
12
43
We’re hiring PhD students and postdocs on LLM theory and interpretability! Topics: 1️⃣ abilities & limitations of transformers and other architectures; 2️⃣ LLM interpretability; 3️⃣ foundations of LLM reasoning; 4️⃣ foundations of AI safety.
13
93
623
Will be heading to @NeurIPSConf. If you'll be around and are interested in advancing multimodal reasoning or RL environments, or want to vent about ICLR reviews, let's connect☕ 🧩𝗔𝗹𝗶𝗴𝗻𝗩𝗟𝗠: https://t.co/1SGCelYjEO 🎨𝗥𝗟𝗥𝗙: https://t.co/9CpBVRbzfn 🖼️𝗘𝗔𝗥𝗟: https://t.co/n9lA4N2Xbq
3
5
13
🚀 Introducing Apriel-H1: a family of seven 15B hybrid models (Transformer + Mamba) distilled directly from the Apriel-Nemotron-15B-Thinker reasoner. ✅ Navigating the throughput-performance tradeoff with up to 3.4x speedup ✅ 2x speedup without performance loss ✅ Efficient distillation
5
35
110
Introducing Olmo 3 and our entire model flow to build Olmo 3-Think and Olmo3-Instruct. Strong results, big improvements. Massive shoutout to the team who made it happen. Lots of exciting new things come with this release:
Announcing Olmo 3, a leading fully open LM suite built for reasoning, chat, & tool use, and an open model flow—not just the final weights, but the entire training journey. Best fully open 32B reasoning model & best 32B base model. 🧵
6
12
116
@niloofar_mire @sivareddyg @seraphinagt @NishantBalepur Okay, reupload here: https://t.co/7pcRXMu0u3 I learned about Premiere's audio normalization, which I should have known about before, thanks for the impetus!
0
1
7
Amazing test of Gemini 3’s multimodal reasoning capabilities: try generating a threejs voxel art scene using only an image as input Prompt: I have provided an image. Code a beautiful voxel art scene inspired by this image. Write threejs code as a single-page
82
258
3K
Introducing Yutori Navigator
31 years ago, the modern web era began with Netscape Navigator. Today, we’re introducing Yutori Navigator — a web agent that autonomously navigates websites on its own cloud browser to complete tasks for you. Navigator achieves pareto-domination
28
47
247
Jieyu Zhao (@jieyuzhao11) on Personalized AI Agents
CUA agents
-- rely on grounding actions
-- many issues
CoAct
-- mixing GUI action and coding
-- orchestrator has access to coding and GUI operator
-- better than just GUI or coding models
Discovering knowledge deficiencies
Check out the IVADO workshop on Deploying Autonomous Agents: Lessons, Risks and Real-World Impact, happening today until Wednesday in Montreal with an exciting lineup of speakers #Agents #LLMs
https://t.co/VTEOv2kLGO
0
2
11
I had the great pleasure today to speak at the IVADO workshop on "Deploying Autonomous Agents: Lessons, Risks and Real-World Impact" in Montreal 🍁🇨🇦 alongside a brilliant lineup of speakers. A big thanks to the organizers! #LLMs #Agents #safety #Security
Nouha Dziri (@nouhadziri) on LLM to Agent Safety
Capability doesn't mean increased safety
Capable models still seem to be poor at OOD generalization, so easy to bypass safety
WildTeaming
-- large scale jailbreaking using several tactics
-- 262K jailbreaking examples
-- Training
0
6
21