Anand Gopalakrishnan
@agopal42
Followers: 309 · Following: 930 · Media: 16 · Statuses: 128
Postdoc at @Harvard with @du_yilun and @gershbrain. PhD with @SchmidhuberAI. Previously: Apple MLR, AWS AI Lab. 7. Same handle on 🦋
Cambridge, MA
Joined January 2018
Excited to present "Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery" at #NeurIPS2024! TL;DR: Our model, SynCx, greatly simplifies the inductive biases and training procedures of current state-of-the-art synchrony models. Thread 👇 1/x.
2
41
165
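For readers who want a concrete picture of the idea, here is a minimal NumPy sketch of a recurrent complex-weighted autoencoder in the spirit of the tweet: activations carry a magnitude (feature strength) and a phase (grouping tag), and the map is iterated so phases can settle. The layer sizes, the magnitude clamping, and the number of refinement steps are illustrative assumptions, not SynCx's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed, not the paper's): 64-d input, 32-d hidden code.
d_in, d_hid = 64, 32
W_enc = (rng.normal(size=(d_in, d_hid)) + 1j * rng.normal(size=(d_in, d_hid))) / np.sqrt(d_in)
W_dec = (rng.normal(size=(d_hid, d_in)) + 1j * rng.normal(size=(d_hid, d_in))) / np.sqrt(d_hid)

x_mag = rng.random((8, d_in))              # observed feature magnitudes
z = x_mag * np.exp(1j * 0.0)               # start with zero phase everywhere

for _ in range(5):                         # a few recurrent refinement steps (assumed count)
    h = z @ W_enc                          # complex-weighted encoding
    recon = h @ W_dec                      # complex-weighted decoding
    # Keep the known magnitudes; keep only the *phase* of the reconstruction.
    z = x_mag * np.exp(1j * np.angle(recon))

phases = np.angle(z)                       # per-feature phases after refinement
print(phases.shape)                        # (8, 64)
```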
We found a new way to get language models to reason. 🤯 No RL, no training, no verifiers, no prompting. ❌ With better sampling, base models can achieve single-shot reasoning on par with (or better than!) GRPO while avoiding its characteristic loss in generation diversity.
68
251
2K
[1/4] As you read words in this text, your brain adjusts fixation durations to facilitate comprehension. Inspired by human reading behavior, we propose a supervised objective that trains an LLM to dynamically determine the number of compute steps for each input token.
4
10
25
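As a rough illustration of per-token dynamic compute (not the paper's objective or architecture), the toy PyTorch module below lets a small head predict how many times a shared block is applied to each token. The class name, the step head, and the residual update are all assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class DynamicDepthBlock(nn.Module):
    """Toy sketch: a shared block applied a variable number of times per token,
    with a small head predicting the per-token step count (all names assumed)."""
    def __init__(self, d_model: int, max_steps: int = 4):
        super().__init__()
        self.block = nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                                   nn.Linear(d_model, d_model))
        self.step_head = nn.Linear(d_model, max_steps)   # scores for 1..max_steps
        self.max_steps = max_steps

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq, d_model). Predict how many refinement steps each token gets.
        steps = self.step_head(h).argmax(dim=-1) + 1      # (batch, seq), values in 1..max_steps
        out = h.clone()
        for s in range(1, self.max_steps + 1):
            active = (steps >= s).unsqueeze(-1)           # tokens still computing at step s
            out = torch.where(active, out + self.block(out), out)
        return out

h = torch.randn(2, 7, 64)
print(DynamicDepthBlock(64)(h).shape)  # torch.Size([2, 7, 64])
```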
Why do video models handle motion so poorly? It might be lack of motion equivariance. Very excited to introduce: Flow Equivariant RNNs (FERNNs), the first sequence models to respect symmetries over time. Paper: https://t.co/dkk43PyQe3 Blog: https://t.co/I1gpam1OL8 1/🧵
8
72
398
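One way to make flow equivariance operational: apply a constant-velocity shift (a "flow") to an input video and check whether the model's output shifts the same way. The sketch below uses a stand-in model (a running average), not a FERNN; the `shift` helper, the velocity `v`, and the frame shapes are assumptions for illustration.

```python
import numpy as np

def shift(frames, v):
    """Translate frame t by v*t pixels along the width axis (a constant-velocity flow)."""
    return np.stack([np.roll(f, v * t, axis=-1) for t, f in enumerate(frames)])

def model(frames):
    # Stand-in sequence model: a running average over time. It is not flow-equivariant
    # for v != 0, which is exactly the kind of gap an equivariance check reveals.
    return np.cumsum(frames, axis=0) / np.arange(1, len(frames) + 1)[:, None, None]

T, H, W = 6, 8, 8
x = np.random.default_rng(0).random((T, H, W))
v = 1  # one pixel per frame to the right

lhs = model(shift(x, v))   # flow the input, then run the model
rhs = shift(model(x), v)   # run the model, then flow the output
print("equivariance gap:", np.abs(lhs - rhs).max())
```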
Excited to share our new ICML paper, with co-authors @robert_csordas and @SchmidhuberAI! How can we tell if an LLM is actually "thinking" versus just spitting out memorized or trivial text? Can we detect when a model is doing anything interesting? (Thread below👇)
5
52
203
Your language model is wasting half of its layers just refining probability distributions rather than doing interesting computations. In our paper, we found that the second half of the layers of the Llama 3 models has minimal effect on future computations. 1/6
35
139
1K
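A logit-lens-style probe is one cheap way to look at this kind of claim: decode each layer's hidden state with the final layer norm and unembedding, and measure how far its next-token distribution is from the model's final one. The sketch below uses GPT-2 as a small stand-in for Llama 3 and is not the paper's methodology.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tok("The capital of France is", return_tensors="pt")
with torch.no_grad():
    out = model(**ids, output_hidden_states=True)

final_logp = F.log_softmax(out.logits[0, -1], dim=-1)
# Skip the embedding output (index 0) and the final hidden state (already normed).
for layer, h in enumerate(out.hidden_states[1:-1], start=1):
    # Decode the intermediate hidden state with the final layer norm + unembedding.
    logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
    kl = F.kl_div(F.log_softmax(logits, dim=-1), final_logp,
                  log_target=True, reduction="sum")
    print(f"layer {layer:2d}  KL to final prediction: {kl.item():.3f}")
```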
In the physical world, almost all information is transmitted through traveling waves -- why should it be any different in your neural network? Super excited to share recent work with the brilliant @mozesjacobs: "Traveling Waves Integrate Spatial Information Through Time" 1/14
147
916
7K
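To make "information transmitted through traveling waves" concrete, here is a toy sketch (not the paper's model) in which hidden units sit on a 1-D lattice and follow a discretized wave equation, so an input injected at one site reaches every other site over time. The lattice size, wave speed, and drive are assumed values.

```python
import numpy as np

N, T, c = 64, 80, 0.5            # lattice size, time steps, wave speed (assumed)
u = np.zeros(N)                  # displacement of each hidden unit
u_prev = np.zeros(N)

for t in range(T):
    lap = np.roll(u, 1) + np.roll(u, -1) - 2 * u   # discrete Laplacian (periodic lattice)
    drive = np.zeros(N)
    if t == 0:
        drive[N // 2] = 1.0                        # a single localized input at t = 0
    u_next = 2 * u - u_prev + (c ** 2) * lap + drive
    u_prev, u = u, u_next

# After T steps the initial bump has spread: distant units now "see" the input.
print("units with non-negligible activity:", int((np.abs(u) > 1e-3).sum()), "of", N)
```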
Congratulations to @RichardSSutton and Andy Barto on their Turing award!
29
121
1K
Meet the recipients of the 2024 ACM A.M. Turing Award, Andrew G. Barto and Richard S. Sutton! They are recognized for developing the conceptual and algorithmic foundations of reinforcement learning. Please join us in congratulating the two recipients! https://t.co/GrDfgzW1fL
34
470
2K
BREAKING: Amii Chief Scientific Advisor, Richard S. Sutton, has been awarded the A.M. Turing Award, the highest honour in computer science, alongside Andrew Barto! Read the official @TheOfficialACM announcement: https://t.co/JXDhdEsQv7
#TuringAward #AI #ReinforcementLearning
5
50
234
Brains, Minds and Machines Summer Course 2025. Application deadline: Mar 24, 2025 https://t.co/rExih3y60h See more information here: https://t.co/UV5hLH6HBp
mbl.edu
The goal of this course is to help produce a community of leaders that is equally knowledgeable in neuroscience, cognitive science, and computer science and will lead the scientific understanding of...
0
21
54
Come visit our poster at East Exhibit Hall A-C #3707 today (Thursday), 4:30-7:30pm, to learn how complex-valued NNs perform perceptual grouping. #NeurIPS2024
Excited to present "Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery" at #NeurIPS2024! TL;DR: Our model, SynCx, greatly simplifies the inductive biases and training procedures of current state-of-the-art synchrony models. Thread 👇 1/x.
0
0
10
Interested in JEPA/visual representation learning for diverse downstream tasks like planning and reasoning? Check out "Enhancing JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning" at the @NeurIPSConf SSL Workshop on 12/14. Led by @EtaiLittwin (1/n)
2
7
20
Please check out a dozen 2024 conference papers with my awesome students, postdocs, and collaborators: 3 papers at NeurIPS, 5 at ICML, others at CVPR, ICLR, ICRA: 288. R. Csordas, P. Piekos, K. Irie, J. Schmidhuber. SwitchHead: Accelerating Transformers with Mixture-of-Experts
arxiv.org
Despite many recent works on Mixture of Experts (MoEs) for resource-efficient Transformer language models, existing methods mostly focus on MoEs for feedforward layers. Previous attempts at...
10
119
327
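For context on the MoE ingredient, here is a generic top-k mixture-of-experts layer in PyTorch. It is a plain routing sketch, not SwitchHead's attention-specific design, and the expert shapes, router, and top-k value are assumptions; for clarity it runs every expert densely and masks afterwards, which a real implementation would avoid.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Generic top-k mixture-of-experts layer: a router picks k experts per token
    and mixes their outputs (a sketch, not SwitchHead's mechanism)."""
    def __init__(self, d_model: int, n_experts: int = 4, k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.router = nn.Linear(d_model, n_experts)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.router(x)                         # (batch, seq, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)      # keep the k best experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = (idx == e)                           # where expert e was selected
            if mask.any():
                gate = (weights * mask).sum(dim=-1, keepdim=True)
                out = out + gate * expert(x)            # dense for clarity; sparse in practice
        return out

x = torch.randn(2, 5, 32)
print(TopKMoE(32)(x).shape)  # torch.Size([2, 5, 32])
```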
Paper: https://t.co/fbjgfBF90Z Code: https://t.co/iZndiQpxbZ Joint work with @aleks_stanic @SchmidhuberAI @mc_mozer Hope to see you all at our poster at #NeurIPS2024! 10/x
github.com
Official code repository for NeurIPS 2024 paper "Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery" - agopal42/syncx
0
0
5
Phase synchronization towards objects is more robust in SynCx than in baselines. It successfully separates similarly colored objects, a common failure mode of other synchrony models, which simply rely on color as a shortcut feature for grouping. 9/x
1
0
0
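To illustrate how object masks could be read out once phases have synchronized, here is a toy sketch that clusters per-pixel phases on the unit circle. The synthetic two-object scene, the noise level, and the k-means readout are assumptions for illustration, not SynCx's evaluation procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scene: phases have (roughly) synchronized within each of two objects.
H, W = 16, 16
true_mask = np.zeros((H, W), dtype=int)
true_mask[:, W // 2:] = 1                                  # two "objects": left / right half
phase = np.where(true_mask == 0, 0.3, 2.5) + 0.1 * rng.normal(size=(H, W))

# Cluster on the unit circle (cos, sin) so that phase wrap-around is handled.
feats = np.stack([np.cos(phase), np.sin(phase)], axis=-1).reshape(-1, 2)

def kmeans(X, k=2, iters=20):
    centers = X[[0, len(X) - 1]]                           # simple deterministic init
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.stack([X[assign == j].mean(0) for j in range(k)])
    return assign

pred_mask = kmeans(feats).reshape(H, W)
agreement = max((pred_mask == true_mask).mean(), (pred_mask != true_mask).mean())
print(f"mask agreement (up to label swap): {agreement:.2f}")
```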
SynCx outperforms current state-of-the-art unsupervised synchrony-based models on standard multi-object datasets while using 6-23x fewer parameters than the baselines. 8/x
1
0
1
Unlike current state-of-the-art synchrony models, ours needs no additional inductive biases (gating mechanisms), strong supervision (depth masks), or contrastive training to achieve phase synchronization towards objects in a fully unsupervised way. 7/x
1
0
0