
Nicolas Zucchet
@NicolasZucchet
Followers: 496 · Following: 743 · Media: 33 · Statuses: 162
PhD student @CSatETH | previously student researcher @GoogleDeepMind | @Polytechnique
Joined December 2017
RT @oswaldjoh: Super happy and proud to share our novel scalable RNN model, the MesaNet! This work builds upon beautiful ideas of 𝗹𝗼𝗰𝗮𝗹𝗹…
0 / 64 / 0
RT @scychan_brains: Emergence in transformers is a real phenomenon! Behaviors and capabilities can appear in models in sudden ways. Emerge…
0 / 42 / 0
RT @orvieto_antonio: We have a new SSM theory paper, just accepted to COLT, revisiting recall properties of linear RNNs. It's surprising…
0 / 40 / 0
RT @AndrewLampinen: Some nice analysis by Nicolas & Francesco of a clear case of emergence, and how to accelerate its acquisition!
0 / 3 / 0
RT @scychan_brains: Smooth, predictable scaling laws are central to our conceptions and forecasts about AI -- but lots of capabilities actua…
0 / 2 / 0
Huge thanks to my amazing coauthors @dngfra @AndrewLampinen @scychan_brains 🙏 Excited to see where this research on emergence and sparse attention leads. Check out the full paper here:
1 / 2 / 17
RT @tyler_m_john: I really like this new op-ed from @DavidDuvenaud on how so many different kinds of pressures could drive towards loss of…
0 / 35 / 0
RT @AndrewLampinen: How do language models generalize from information they learn in-context vs. via finetuning? We show that in-context le…
0 / 148 / 0
RT @orvieto_antonio: This is just a reminder for your NeurIPS experiments: if you are comparing architectures, optimizers, or whatever at a…
0 / 5 / 0
RT @sohamde_: Our new paper sheds light on the process of knowledge acquisition in language models, with implications for: - data curricula…
0 / 6 / 0
RT @K_Ishi_AI: [Translated from Japanese] A paper from Google DeepMind elucidates the knowledge-acquisition process of LLMs. Early in LLM training there is a plateau period in which knowledge acquisition stalls. But during this period the model is in fact attending to specific elements and establishing efficient attention patterns for acquiring knowledge, which is followed by rapid kno…
0 / 257 / 0
Thanks to my co-authors Jörg Bornschein, @scychan_brains, @AndrewLampinen, Razvan Pascanu, and @sohamde_. I couldn't have dreamed of a better team for this collaboration! Check out the full paper for all the technical details.
1 / 0 / 12