Nanda H Krishna
@nandahkrishna
Followers: 507 · Following: 161 · Media: 7 · Statuses: 41
PhD student at @Mila_Quebec & @UMontreal, @MacHomebrew maintainer.
Chennai & Montréal
Joined May 2016
New preprint! 🧠🤖 How do we build neural decoders that are:
⚡️ fast enough for real-time use
🎯 accurate across diverse tasks
🌍 generalizable to new sessions, subjects, and species?
We present POSSM, a hybrid SSM architecture that optimizes for all three of these axes! 🧵1/7
[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. 📌Predict a learned
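The tweet cuts off here, so the sketch below is only a guess at the shape of the objective, not the paper's actual loss: alongside standard next-token prediction, an auxiliary head regresses each position's hidden state onto a summary of the embeddings of the tokens that follow it. The horizon, the mean-pooled target (the tweet hints the real target is a learned summary), and the MSE loss are all assumptions.

```python
import torch
import torch.nn.functional as F

def fsp_auxiliary_loss(hidden_states, token_embeddings, summary_head, horizon=8):
    """Guessy sketch of a future-summary-prediction auxiliary loss.

    hidden_states:    (batch, seq, d) transformer outputs
    token_embeddings: (batch, seq, d) input embeddings of the same sequence
    summary_head:     module mapping (..., d) -> (..., d), predicts the future summary
    """
    batch, seq, d = hidden_states.shape
    losses = []
    for t in range(seq - horizon):
        # Target: a simple summary (here, the mean) of the next `horizon` token embeddings,
        # detached so the target does not collapse toward the prediction.
        target = token_embeddings[:, t + 1 : t + 1 + horizon].mean(dim=1).detach()
        pred = summary_head(hidden_states[:, t])
        losses.append(F.mse_loss(pred, target))
    return torch.stack(losses).mean()

# Usage (shapes only), combined with the usual next-token cross-entropy:
# total_loss = lm_loss + lambda_fsp * fsp_auxiliary_loss(h, emb, head)
```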
Mila's annual supervision request process is now open to receive MSc and PhD applications for Fall 2026 admission! For more information, visit https://t.co/r01eLcY1P4
Qwen3-4B can match DeepSeek-R1 and o3-mini (high) with ONLY test-time scaling?🤯
Introducing Recursive Self-Aggregation (RSA), a new test-time scaling method:
- parallel + sequential✅
- no verifiers✅
- no scaffolding✅
Then we use aggregation-aware RL to push further!🚀 🧵👇
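Not from the paper, just a rough sketch of how I read the recipe; the prompts, pool sizes, and final selection are assumptions, and llm_generate is a placeholder for whatever model API you use. Parallel: sample a pool of candidate solutions. Sequential: repeatedly ask the model itself to aggregate small groups of candidates into improved ones, with no verifier in the loop.

```python
import random

def llm_generate(prompt: str) -> str:
    """Placeholder for an LLM call (e.g. Qwen3-4B behind any inference API)."""
    raise NotImplementedError

def recursive_self_aggregation(question: str, pool_size: int = 8,
                               group_size: int = 3, rounds: int = 3) -> str:
    # Parallel stage: sample an initial pool of independent candidate solutions.
    pool = [llm_generate(f"Solve the problem:\n{question}") for _ in range(pool_size)]
    # Sequential stage: repeatedly aggregate small groups of candidates into
    # improved candidates; no verifier or scaffolding, just the model itself.
    for _ in range(rounds):
        new_pool = []
        for _ in range(pool_size):
            group = random.sample(pool, group_size)
            joined = "\n\n".join(f"Candidate {i + 1}:\n{c}" for i, c in enumerate(group))
            new_pool.append(llm_generate(
                f"Problem:\n{question}\n\nHere are several candidate solutions:\n{joined}\n\n"
                "Combine their correct ideas and write a single improved solution."
            ))
        pool = new_pool
    return pool[0]  # any surviving candidate; the paper may select differently
```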
Two exciting updates 🚀
1️⃣ POSSM has been accepted to NeurIPS 2025! We'll see you in San Diego 🏖️!
2️⃣ I've officially started my PhD! Very grateful to stay at Mila, and excited to continue working on advancing both deep learning + science! 🧪🧬🧠
Super stoked to share my first first-author paper, which introduces a hybrid architecture for real-time neural decoding. It's been a lot of work, but I'm happy to showcase some very cool results!
🚨Reasoning LLMs are e̵f̵f̵e̵c̵t̵i̵v̵e̵ ̵y̵e̵t̵ inefficient! Large language models (LLMs) now solve multi-step problems by emitting extended chains of thought. During the process, they often re-derive the same intermediate steps across problems, inflating token usage and
🚨 The call for demos is still open, and the deadline is tomorrow! If you have a tool for visualizing large-scale data, pipelines for training foundation models, or BCI demos, we want to see it! Submissions are only 500 words, and it's a great opportunity to showcase your work.
Excited to announce the Foundation Models for the Brain and Body workshop at #NeurIPS2025!🧠 We invite short papers or interactive demos on AI for neural, physiological or behavioral data. Submit by Aug 22 👉 https://t.co/t77lrS2by5
🚨 We are extending the paper submission deadline to Friday, August 29, 11:59 pm AoE. Check our website for the latest updates on the Foundation Models for the Brain and Body workshop #NeurIPS2025 #BrainBodyFM
Excited to be organising the BrainBodyFM Workshop – in spirit, a successor to our #COSYNE Workshop on Neuro-foundation Models – at #NeurIPS2025! Check out the website for more details. 🧠🤖
How do you align your diffusion model with unseen objectives at inference time? Presenting Diffusion Tree Sampling/Search (DTS/DTS*) 🥳 Using MCTS-style search, DTS steadily improves sample quality with compute, matching the best baseline with 5× less compute!
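A rough sketch of an MCTS-style search over the reverse diffusion chain, just to make the idea concrete; this is not the DTS/DTS* algorithm itself, and the denoiser interface, UCT rule, and backup scheme are assumptions. Each tree node holds a partially denoised sample, children are drawn by stochastic denoising steps, and rewards computed on fully denoised leaves are backed up to guide where to expand next.

```python
import math

def tree_search_sampling(denoise_step, reward_fn, x_T, n_steps,
                         n_iterations=64, branching=4, c_uct=1.0):
    """Toy MCTS-style sampler over a reverse diffusion chain (a sketch, not DTS itself).

    denoise_step(x, t) -> one stochastic sample of x at step t-1
    reward_fn(x0)      -> scalar objective on a fully denoised sample
    """
    root = {"x": x_T, "t": n_steps, "children": [], "value": 0.0, "visits": 0}
    best_x, best_r = None, -float("inf")

    for _ in range(n_iterations):
        # 1. Selection: walk down with a UCT-style rule until an expandable node.
        node, path = root, [root]
        while node["t"] > 0 and len(node["children"]) == branching:
            node = max(node["children"],
                       key=lambda ch: ch["value"] / (ch["visits"] + 1e-8)
                       + c_uct * math.sqrt(math.log(node["visits"] + 1) / (ch["visits"] + 1e-8)))
            path.append(node)

        # 2. Expansion: draw one more stochastic denoising step from this node.
        if node["t"] > 0:
            child = {"x": denoise_step(node["x"], node["t"]), "t": node["t"] - 1,
                     "children": [], "value": 0.0, "visits": 0}
            node["children"].append(child)
            node, path = child, path + [child]

        # 3. Rollout: finish denoising without branching and score the sample.
        x, t = node["x"], node["t"]
        while t > 0:
            x, t = denoise_step(x, t), t - 1
        r = float(reward_fn(x))
        if r > best_r:
            best_x, best_r = x, r

        # 4. Backup: propagate the reward along the visited path.
        for n in path:
            n["visits"] += 1
            n["value"] += r

    return best_x, best_r
```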
Stay tuned for the project page and code, coming soon! Link: https://t.co/wD39E5Wk53 A big thank you to my co-authors: @averyryoo*, @XimengMao*, @mehdiazabou, @evadyer, @mattperich, and @g_lajoie_! 🧵7/7
Finally, we show POSSM's performance on speech decoding – a long-context task that can grow expensive for Transformers. In the unidirectional setting, POSSM beats the GRU baseline, achieving a phoneme error rate of 27.3 while being more robust to variation in preprocessing. 🧵6/7
Cross-species transfer! 🐵➡️🧑 We find that POSSM pretrained solely on NHP reaching data achieves SOTA when decoding imagined handwriting in human subjects! This shows the potential of leveraging NHP data to bootstrap human BCI decoding in low-data clinical settings. 🧵5/7
By pretraining on 140 monkey reaching sessions, POSSM effectively transfers to new subjects and tasks, matching or outperforming several baselines (e.g., GRU, POYO, Mamba) across sessions.
✅ High R² across the board
✅ 9× faster inference than Transformers
✅ <5ms latency
🧵4/7
POSSM combines the real-time inference of an RNN with the tokenization, pretraining, and finetuning abilities of a Transformer! Using POYO-style tokenization, we encode spikes in 50ms windows and stream them to a recurrent model (e.g., Mamba, GRU) for rapid predictions. 🧵3/7
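A minimal PyTorch sketch of that hybrid idea, assuming only the general shape described in this thread rather than the authors' code; the pooling scheme, layer sizes, and readout are all stand-ins. Spike tokens from each 50ms window are pooled with cross-attention into a single embedding, which is then streamed through a GRU, one recurrent step per window.

```python
import torch
import torch.nn as nn

class HybridSpikeDecoder(nn.Module):
    """Sketch of a POSSM-like hybrid: attention-based window tokenization + recurrent core."""
    def __init__(self, n_units=256, d_model=128, n_outputs=2):
        super().__init__()
        self.unit_embed = nn.Embedding(n_units, d_model)        # one learned embedding per recorded unit
        self.query = nn.Parameter(torch.randn(1, 1, d_model))   # latent query that pools a window
        self.pool = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)   # streaming recurrent core (GRU or SSM)
        self.readout = nn.Linear(d_model, n_outputs)             # e.g. 2D cursor velocity

    def forward(self, spike_windows):
        # spike_windows: list of (batch, n_spikes_w) unit-id tensors, one entry per 50ms window
        h, outputs = None, []
        for window in spike_windows:
            tokens = self.unit_embed(window)                                      # (batch, n_spikes_w, d_model)
            pooled, _ = self.pool(self.query.expand(tokens.size(0), -1, -1), tokens, tokens)
            out, h = self.rnn(pooled, h)                                          # one recurrent step per window
            outputs.append(self.readout(out))
        return torch.cat(outputs, dim=1)                                          # (batch, n_windows, n_outputs)

# Toy usage: 3 windows of spikes from 256 possible units, batch of 2
decoder = HybridSpikeDecoder()
windows = [torch.randint(0, 256, (2, 40)) for _ in range(3)]
print(decoder(windows).shape)  # torch.Size([2, 3, 2])
```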
The problem with existing decoders?
😔 RNNs are efficient, but rely on rigid, binned input formats – limiting generalization to new neurons or sessions.
😔 Transformers enable generalization via tokenization, but have high computational costs due to the attention mechanism.
🧵2/7
Preprint Alert 🚀 Multi-agent reinforcement learning (MARL) often assumes that agents know when other agents cooperate with them. But for humans, this isn’t always true. For example, Plains Indigenous groups used to leave resources for others to use at effigies called Manitokan. 1/8
The recordings from the 🌐🧠 Neuro Foundation Model workshop are up on the workshop website! Thanks again to our speakers, and everyone who attended. And thanks to the entire team @cole_hurwitz, @nandahkrishna, @averyryoo, @evadyer and @tyrell_turing for making this happen 🙌
How can large-scale models + datasets revolutionize neuroscience 🧠🤖🌐? We are excited to announce our workshop: “Building a foundation model for the brain: datasets, theory, and models” at @CosyneMeeting #COSYNE2025. Join us in Mont-Tremblant, Canada from March 31 - April 1.
Really enjoyed TAing for this!
#COSYNE2025 tutorial by Eva Dyer. Foundations of Transformers in Neuroscience https://t.co/reMkULre8g Materials:
Just a couple days until Cosyne - stop by [3-083] this Saturday and say hi! @nandahkrishna @XimengMao