Marco Ciccone
@mciccone_AI
Followers: 1K
Following: 20K
Media: 55
Statuses: 1K
Postdoctoral Fellow @VectorInst - Collaborative, Decentralized, Modular ML - Competition chair @NeurIPSConf 2021, 2022, 2023 - PhD @polimi ex @NVIDIA @NNAISENSE
Toronto, Canada
Joined April 2015
🚨 Life update 🚨 I moved to Toronto 🇨🇦 and joined @VectorInst as a Postdoctoral Fellow to work with @colinraffel and his lab on collaborative, decentralized, and modular machine learning to democratize ML model development. Exciting times ahead! 🪿
13
3
106
🚀 Excited to be at #NeurIPS2025 this week! I’ll be presenting our work on distributed and federated optimization. You'll find me on 6th Dec:
- OPT for ML: 20A, 10-11 am
- Reliable ML: 2, 1:30-2:15 pm
If you're working on learning at scale, come find me — happy to chat 🤝
0
2
8
😮 A fully packed room for our Model Merging tutorial at #NeurIPS2025 yesterday! I hope you are all less perplexed about parameter averaging! Thanks to all our panelists @sarahookr @PontiEdoardo @alexandraxron @margs_li @chargoddard and to all participants!
1
0
9
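For anyone still perplexed: the core of parameter averaging is just a weighted mean of aligned checkpoints. A minimal sketch, assuming two fine-tuned models that share the same architecture; the checkpoint names and uniform weights are illustrative, not the tutorial's actual recipe:

```python
import torch

def average_state_dicts(state_dicts, weights=None):
    """Uniform (or weighted) parameter averaging of models with identical architectures."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        # Copy non-float buffers (e.g. integer counters) from the first model instead of averaging.
        if not torch.is_floating_point(state_dicts[0][key]):
            merged[key] = state_dicts[0][key].clone()
            continue
        merged[key] = sum(w * sd[key] for w, sd in zip(weights, state_dicts))
    return merged

# Hypothetical usage: merge two fine-tuned checkpoints of the same base model.
# sd_a = torch.load("finetune_a.pt")
# sd_b = torch.load("finetune_b.pt")
# model.load_state_dict(average_state_dicts([sd_a, sd_b], weights=[0.5, 0.5]))
```

More refined merging methods (task arithmetic, TIES, and friends) change how the parameter deltas are combined, but this weighted average is the baseline they build on.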
🚨 Excited to give a tutorial on Model Merging next week at #NeurIPS2025 in San Diego! Join us on 📅 Tue 2 Dec, 9:30 am - 1 pm PST
Excited to be at @NeurIPSConf next week co-presenting our tutorial: "Model Merging: Theory, Practice, and Applications" 🔥 Proud to do this with my PhD advisor Colin Raffel, our research fellow @mciccone_AI, and an incredible panel of speakers 💙 #NeurIPS2025 #ModelMerging
0
0
8
I am excited to be organizing the 8th Scaling Workshop at @NeurIPSConf this year!
Dec 5-6 | 5-8pm PT | Hard Rock Hotel San Diego
Co-organized by @cerebras, @Mila_Quebec, and @mbzuai
Register:
luma.com
Come join us after NeurIPS for the 8th Scaling Workshop series that started in Oct 2021! We provide a forum for discussing the challenges and advances in…
2
12
68
To put things in perspective on how crazy our field is:
- Yoshua Bengio has reached 1M citations in almost 40 years of career
- The Transformer paper by itself has reached 200K citations in 8 years
0
0
0
I am at a point where I need a feature like "is it me or is it down" specifically for debugging multi-node and multi-GPU communications with NCCL
0
0
0
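In the meantime, the poor man's version of that feature is a tiny all-reduce smoke test run under torchrun (ideally with NCCL_DEBUG=INFO set) before launching the real job. A minimal sketch, assuming torch.distributed with the NCCL backend; the script name and tolerance are illustrative:

```python
# nccl_check.py - run with: torchrun --nnodes=<N> --nproc_per_node=<G> nccl_check.py
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE, LOCAL_RANK and the rendezvous env vars.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank contributes a tensor holding its own rank; after the all-reduce,
    # every rank should see the sum world_size * (world_size - 1) / 2.
    rank, world_size = dist.get_rank(), dist.get_world_size()
    x = torch.full((1,), float(rank), device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)

    expected = world_size * (world_size - 1) / 2
    ok = abs(x.item() - expected) < 1e-6
    print(f"[rank {rank}/{world_size}] all_reduce {'OK' if ok else 'MISMATCH'} "
          f"(got {x.item()}, expected {expected})")

    dist.barrier()
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

If this hangs or mismatches, the problem is the fabric or the NCCL setup, not your training code.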
RT @murefil: This was a great group effort ❤️. Check the thread below! My 2c: we train a 32B coding agent by distilling strong teacher mod…
0
1
0
🚀 Federated Learning (FL) promises collaboration without data sharing. While Cross-Device FL is a success and deployed widely in industry, we don’t see Cross-Silo FL (collaboration between organizations) taking off despite huge demand and interest. Why could this be the case? 🤔
1
12
23
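For context, the mechanism behind that promise is simple: silos exchange model parameters instead of raw data, and a server aggregates them. A toy FedAvg-style sketch of one round (the client sizes and weighting are illustrative, not any production cross-silo stack):

```python
import numpy as np

def federated_averaging_round(client_params, client_sizes):
    """One FedAvg round: each silo trains locally, the server averages
    the returned parameter vectors weighted by local dataset size."""
    total = sum(client_sizes)
    new_params = np.zeros_like(client_params[0])
    for params, n in zip(client_params, client_sizes):
        new_params += (n / total) * params
    return new_params

# Hypothetical round with three silos that never share raw data,
# only their locally trained parameters.
global_params = np.zeros(4)
client_params = [global_params + 0.1 * np.random.randn(4) for _ in range(3)]
global_params = federated_averaging_round(client_params, client_sizes=[1_000, 5_000, 500])
```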
Heading to #COLM2025 in beautiful Montreal 🍁 Excited to discuss distributed learning and modular approaches for mixing and merging specialized language models - Ping me if you are there!
0
0
4
It is so refreshing to see such an example of quality over quantity research in academia. Congrats @deepcohen!
@jasondeanlee @SebastienBubeck @tomgoldsteincs @zicokolter @atalwalkar This is the third, last, and best paper from my PhD. By some metrics, an ML PhD student who writes just three conference papers is "unproductive." But I wouldn't have had it any other way 😉 !
0
0
1
If you want to learn about KFAC, this is the best place to start!
KFAC is everywhere—from optimization to influence functions. While the intuition is simple, implementation is tricky. We (@BalintMucsanyi, @2bys2, @runame_) wrote a ground-up intro with code to help you get it right. 📖 https://t.co/sIQfB1bmsE 💻
0
0
0
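For a taste of why the intuition is simple but the implementation is tricky: KFAC approximates a linear layer's Fisher block as a Kronecker product A ⊗ G of an input-activation covariance and an output-gradient covariance. A toy numpy sketch under that convention, assuming the weight matrix is stored as (out_features, in_features); the damping value and batch handling are illustrative assumptions, not the authors' code:

```python
import numpy as np

def kfac_precondition(grad_W, acts, grad_out, damping=1e-3):
    """KFAC-style preconditioning of a linear layer's weight gradient.

    grad_W:   (out, in)    gradient of the loss w.r.t. the weight matrix
    acts:     (batch, in)  layer inputs a
    grad_out: (batch, out) gradients w.r.t. the layer's pre-activations g

    The Fisher block is approximated as A kron G with A = E[a a^T] and
    G = E[g g^T], so the preconditioned gradient is G^-1 grad_W A^-1
    (damping keeps both factors invertible).
    """
    batch = acts.shape[0]
    A = acts.T @ acts / batch          # (in, in) activation covariance
    G = grad_out.T @ grad_out / batch  # (out, out) gradient covariance
    A += damping * np.eye(A.shape[0])
    G += damping * np.eye(G.shape[0])
    return np.linalg.solve(G, grad_W) @ np.linalg.inv(A)

# Toy usage with random data, just to show the shapes.
rng = np.random.default_rng(0)
acts, grad_out = rng.normal(size=(32, 10)), rng.normal(size=(32, 5))
grad_W = grad_out.T @ acts / 32
precond_grad = kfac_precondition(grad_W, acts, grad_out)
```

The tricky parts the write-up covers live outside this toy: how often to refresh the factors, how to amortize the inverses, and how to handle conv and attention layers.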
💯 This is why we need modular and specialized models instead of generalist ones
GPT-5 is the most significant product release in AI history, but not for the reason you might think. What it signals is that we're moving from the "bigger model, better results" era to something much more nuanced. This is a genuine inflection point. The fact that people call a
0
0
1
If your PhD advisor has a statement like this, you know you have made the right choice. Good job and good luck with your new lab @maksym_andr! "...We are not necessarily interested in getting X papers accepted at NeurIPS/ICML/ICLR. We are interested in making an impact..."
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨 Hiring. I'm looking for multiple PhD students: both those able to start
0
1
13
Despite the impressive output of @gabriberton (super-deserved results), it seems as good a time as any to remind ourselves that PhDs are not about the number of papers and that people should prioritize learning how to conduct research rather than maximizing meaningless metrics.
A few numbers from my PhD:
- 8 first-author top-conference (CVPR/ICCV/ECCV) papers
- 100% acceptance rate per paper
- 80% acceptance rate per submission
- 1 invited long talk at CVPR tutorial
- 5 top-conf demos (acceptance rate 100% vs ~30% average)
- ~2k GitHub stars
1
2
34
Conference networking be like
0
0
1
Not entirely true - a better understanding of the optimization issues of neural networks, residual connections, normalization layers… and in hindsight, ImageNet was clearly showing the way that data is all you need
very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers. we still don't have models that get smarter when we give them eyes
0
0
0
6/ @JoshSouthern13 and I will be at #ICML2025, poster session Tuesday — stop by and chat if you're around! ... I would also be happy to meet up and chat about graphs, (graphs and) LLMs, and how to detect their hallucinations 😳 Feel free to reach out!
0
1
4