
Nishanth Dikkala
@NishanthDikkala
Followers: 395 · Following: 2K · Media: 5 · Statuses: 136
Research Scientist @ Google Research, Ph.D. Computer Science, MIT.
Mountain View, CA
Joined July 2013
RT @GoogleDeepMind: We’re fully releasing Gemma 3n, which brings powerful multimodal AI capabilities to edge devices 🛠️. Here’s a snapshot…
0 · 449 · 0
Presenting this work @ ICLR tomorrow! Come talk to us about looped transformers and their inductive bias for reasoning tasks. Poster #272: Hall 3 + 2B.
*New ICLR paper* – We introduce a paradigm of *looped models for reasoning*. Main claims:
- Reasoning requires depth (via looping), not necessarily params.
- LLM reasoning predictably scales with more loops.
- Looped models generate “latent thoughts” & can simulate CoT reasoning.
1/n
1 · 2 · 8
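The core claim invites a short sketch: a single weight-tied transformer block applied in a loop gives depth that scales with loop count while the parameter count stays fixed. A minimal PyTorch illustration, with module choices and sizes that are illustrative rather than the paper's actual architecture:

```python
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    """A weight-tied block applied `n_loops` times: depth without extra params."""

    def __init__(self, d_model=256, n_heads=4, n_loops=8):
        super().__init__()
        # One shared block; looping reuses the same weights at every "layer".
        self.block = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.n_loops = n_loops

    def forward(self, x):
        # Each pass refines the hidden state; these intermediate states are
        # the "latent thoughts" the thread compares to chain-of-thought steps.
        for _ in range(self.n_loops):
            x = self.block(x)
        return x

model = LoopedTransformer(n_loops=12)
tokens = torch.randn(2, 16, 256)  # (batch, seq, d_model)
print(model(tokens).shape)        # torch.Size([2, 16, 256])
```

Scaling `n_loops` is then the knob the thread describes: more loops give more effective reasoning depth at the same parameter count.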
RT @kazemi_sm: Check out our latest Gemma models, you'll love them :) Also check out the results on our BIG-Bench Extra Hard benchmark: https…
0 · 4 · 0
💡🚨 Check out our ICLR 2025 paper on the inductive bias looping offers for reasoning tasks!
*New ICLR paper* – We introduce a paradigm of *looped models for reasoning*. Main claims:
- Reasoning requires depth (via looping), not necessarily params.
- LLM reasoning predictably scales with more loops.
- Looped models generate “latent thoughts” & can simulate CoT reasoning.
1/n
0 · 0 · 6
As LLMs get stronger, we need harder benchmarks to continue tracking progress. Check out our new work in this direction.
Is BIG-Bench Hard too easy for your LLM?
We just unleashed BIG-Bench EXTRA Hard (BBEH)! 😈
Every task, harder! Every model, humbled! (Poem credit: Gemini 2.0 Flash)
Massive headroom for progress across various areas in general reasoning 🤯
0 · 2 · 8
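For readers unfamiliar with how such benchmarks are typically scored, here is a minimal exact-match harness. The {"input", "target"} example format follows the BIG-Bench Hard convention and is an assumption here, not BBEH's confirmed schema:

```python
def exact_match_accuracy(examples, model_fn):
    """Score a model on benchmark examples with strict exact-match accuracy.

    `examples` follows a BBH-style convention (assumed, not BBEH's
    confirmed schema): a list of {"input": prompt, "target": answer}.
    `model_fn` is any callable mapping a prompt string to an answer string.
    """
    correct = sum(
        model_fn(ex["input"]).strip() == ex["target"].strip()
        for ex in examples
    )
    return correct / len(examples)

# Usage with a stand-in "model" that always answers "7":
examples = [
    {"input": "What is 3 + 4?", "target": "7"},
    {"input": "What is 5 + 5?", "target": "10"},
]
print(exact_match_accuracy(examples, lambda prompt: "7"))  # 0.5
```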
RT @james_s_bedford: Found out students are using this website to have an AI-generated LeBron James summarise their study material. You can…
0 · 27 · 0
RT @jdeposicion: After apparently the entirety of the Indian population, I've come to also learn that:
* Y'all are wild with names
* Y'all…
0 · 46 · 0
RT @kazemi_sm: [1] ReMI: A Dataset for Reasoning with Multiple Images. Work done with @NishanthDikkala, @ankit_s_anand, @hardy_qr, @BahareF…
0 · 2 · 0
Check out our new work showing that causal language modeling alone is sufficient for a Transformer model to learn to solve Sudokus and other constraint satisfaction problems like Zebra puzzles! Led by @shahkulin98, to appear at @NeurIPSConf 2024!
📢 Excited to announce our new paper (accepted at @NeurIPSConf) showing that causal language modeling alone can teach a 50M-parameter transformer model to solve Sudoku and Zebra puzzles. Paper: A thread 🧵
0 · 1 · 12
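A toy sketch of what "causal language modeling alone" means in this setting: the puzzle's givens and a solver's fill-in order are flattened into one token stream, and the model is trained with the ordinary next-token objective on that stream. The serialization below is hypothetical, not the paper's exact format:

```python
def serialize_sudoku(givens, solution_steps):
    """Flatten a puzzle into one token stream for causal LM training.

    givens: dict {(row, col): digit}; solution_steps: list of (row, col, digit).
    Emits cell-by-cell tokens like "r3 c5 7": givens first, then the solver's
    fill-in order, so the LM learns to predict each next move.
    (Hypothetical schema for illustration, not the paper's actual one.)
    """
    tokens = ["<puzzle>"]
    for (r, c), d in sorted(givens.items()):
        tokens += [f"r{r}", f"c{c}", str(d)]
    tokens.append("<solve>")
    for r, c, d in solution_steps:
        tokens += [f"r{r}", f"c{c}", str(d)]
    tokens.append("<eos>")
    return tokens

seq = serialize_sudoku({(0, 0): 5, (1, 3): 2}, [(0, 1, 3), (2, 2, 8)])
print(" ".join(seq))
# <puzzle> r0 c0 5 r1 c3 2 <solve> r0 c1 3 r2 c2 8 <eos>
```

Under this schema, inference would prompt with the givens plus `<solve>` and decode the remaining cells one token at a time.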
RT @kiranvodrahalli: Happy to share Michelangelo, a long-context reasoning benchmark which measures performance b…
0 · 56 · 0
RT @francoisfleuret: Funny thing is that I am convinced intelligence is distilling system 1 into system 2.
0 · 1 · 0
RT @levelsio: ✨ I made a new site called. 🧳 💨 It's a live ranking of airlines by how much luggage they are losing…
0 · 813 · 0
Check out our new multi-image reasoning benchmark where a model needs to reason using information spread across text and multiple images! (An interesting insight we discover: even the mightiest models struggle to tell time!)
🚨 Benchmark Alert: Multi-Image Reasoning. While newer LLMs can reason across multiple, potentially disparate, information sources, their effectiveness remains uncertain. We introduce ReMI, a benchmark dedicated to measuring reasoning with multiple images interleaved with text.
0 · 0 · 2
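To make "multiple images interleaved with text" concrete, here is a hypothetical example in the spirit of the benchmark; the field names and file names are invented, not ReMI's actual schema:

```python
# Hypothetical interleaved text-and-image example (invented schema, for
# illustration only): string parts are text, dict parts reference images.
example = {
    "question": [
        "What time does the clock in ",
        {"image": "clock_a.png"},
        " show, and how much later is it than the clock in ",
        {"image": "clock_b.png"},
        "?",
    ],
    "answer": "3:15; 45 minutes later",
}

# A model must fuse information across both images and the text to answer;
# per the tweet above, even strong models often misread analog clocks.
for part in example["question"]:
    print(part if isinstance(part, str) else f"[IMAGE: {part['image']}]")
```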
Check out the blog post on our work on efficiently scaling up the embedding dimension of Transformer models! (NeurIPS 2023 Spotlight paper.) (Joint work with Cenk Baykal, Dylan Cutler, Nikhil Ghosh, @rinapy and Xin Wang.)
Introducing AltUp, a method that takes advantage of increasing scale in Transformer networks without increasing the computation cost. It’s easy to implement, widely applicable to Transformer architectures, and requires minimal hyperparameter tuning. More →
0 · 1 · 12
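A simplified sketch of the AltUp ("Alternating Updates") idea as I read it: keep the representation K sub-blocks wide, run the expensive transformer block on only one sub-block per layer, and update the rest with cheap learned predict and correct steps. The specifics below (scalar mixing matrix, per-block gates) are a simplification, not the exact published method:

```python
import torch
import torch.nn as nn

class AltUpLayer(nn.Module):
    """Simplified sketch of an Alternating Updates (AltUp) layer.

    The representation is K sub-blocks of width d_model. Each layer runs
    the expensive transformer block on ONE sub-block and updates the rest
    with cheap learned scalar mixing (predict) and a gated broadcast of
    the activated block's residual (correct).
    """

    def __init__(self, d_model=256, k=2):
        super().__init__()
        self.block = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        # Learned K x K mixing coefficients for the predict step.
        self.predict = nn.Parameter(torch.eye(k) + 0.01 * torch.randn(k, k))
        # Per-sub-block gates for the correct step.
        self.gate = nn.Parameter(torch.ones(k))

    def forward(self, xs, active=0):
        # xs: list of K tensors, each (batch, seq, d_model).
        stacked = torch.stack(xs)                             # (K, B, S, D)
        # Predict: cheap linear combination across sub-blocks.
        pred = torch.einsum("ij,jbsd->ibsd", self.predict, stacked)
        # Activate: run the full transformer block on one sub-block only.
        delta = self.block(pred[active]) - pred[active]
        # Correct: broadcast the activated block's update to every sub-block.
        corrected = pred + self.gate.view(-1, 1, 1, 1) * delta
        return list(corrected.unbind(0))

layer = AltUpLayer()
xs = [torch.randn(2, 16, 256) for _ in range(2)]  # 2x wider state, same compute
print(layer(xs, active=0)[0].shape)               # torch.Size([2, 16, 256])
```

The compute stays roughly that of the narrow model, since only one d_model-wide sub-block passes through attention and the MLP at each layer.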