Marco Ciccone

@mciccone_AI

Followers
1K
Following
20K
Media
55
Statuses
1K

Postdoctoral Fellow @VectorInst - Collaborative, Decentralized, Modular ML - Competition chair @NeurIPSConf 2021, 2022, 2023 - PhD @polimi ex @NVIDIA @NNAISENSE

Toronto, Canada
Joined April 2015
@mciccone_AI
Marco Ciccone
1 year
🚨 Life update 🚨 I moved to Toronto 🇨🇦and joined @VectorInst as a Postdoctoral Fellow to work with @colinraffel and his lab on collaborative, decentralized, and modular machine learning to democratize ML model development. Exciting times ahead! 🪿
13
3
106
@RickZack96
Riccardo Zaccone @ NeurIPS
9 days
🚀 Excited to be at #NeurIPS2025 this week! I’ll be presenting our work on distributed and federated optimization. You'll find me on 6th Dec: - OPT for ML: 20A, 10-11 am - Reliable ML: 2, 1:30-2:15 pm If you're working on learning at scale, come find me — happy to chat 🤝
0
2
8
@mciccone_AI
Marco Ciccone
8 days
0
0
1
@mciccone_AI
Marco Ciccone
8 days
😮 A fully packed room for our Model Merging tutorial at #NeurIPS2025 yesterday! I hope you are all less perplexed about parameter averaging! Thanks to all our panelists @sarahookr @PontiEdoardo @alexandraxron @margs_li @chargoddard and to all participants!
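The parameter averaging mentioned above — the simplest form of model merging — can be sketched in a few lines. This is a hypothetical toy illustration, not code from the tutorial: the models are plain dicts of parameter lists standing in for real checkpoints, and `merge_by_averaging` is a name of my choosing.

```python
# Hypothetical sketch of parameter averaging (the simplest model-merging
# recipe): element-wise uniform average of each parameter across models.
# Toy dicts of lists stand in for real checkpoint state dicts.

def merge_by_averaging(state_dicts):
    """Uniformly average a list of parameter dicts with identical keys."""
    merged = {}
    for key in state_dicts[0]:
        values = [sd[key] for sd in state_dicts]
        merged[key] = [sum(vs) / len(vs) for vs in zip(*values)]
    return merged

model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}

merged = merge_by_averaging([model_a, model_b])
print(merged)  # {'layer.weight': [2.0, 3.0], 'layer.bias': [1.0]}
```

Real merging methods (task arithmetic, weighted or Fisher-weighted averaging, etc.) refine this uniform average, but they share the same per-parameter structure.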
@sarahookr
Sara Hooker
9 days
A lunch merge.
1
0
9
@mciccone_AI
Marco Ciccone
13 days
🚨 Excited to give a tutorial on Model Merging next week at #NeurIPS2025 in San Diego! Join us on 📅Tue 2 Dec 9.30 am - 1 pm PST
@Malikeh5
Malikeh Ehghaghi
14 days
Excited to be at @NeurIPSConf next week co-presenting our tutorial: "Model Merging: Theory, Practice, and Applications" 🔥 Proud to do this with my PhD advisor Colin Raffel, our research fellow @mciccone_AI, and an incredible panel of speakers 💙 #NeurIPS2025 #ModelMerging
0
0
8
@dmsobol
Daria Soboleva ✈️ NeurIPS
22 days
I am excited to be organizing the 8th scaling workshop at @NeurIPSConf this year! Dec 5-6 | 5-8pm PT | Hard Rock Hotel San Diego Co-organized by @cerebras, @Mila_Quebec, and @mbzuai Register:
luma.com
Come join us after NeurIPS for the 8th Scaling Workshop series that started in Oct 2021! We provide a forum for discussing the challenges and advances in…
2
12
68
@mciccone_AI
Marco Ciccone
1 month
To put things in perspective on how crazy our field is: - Yoshua Bengio has reached 1M citations in almost 40 years of career - The Transformer paper by itself has reached 200K citations in 8 years
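A quick back-of-the-envelope on the round numbers in that post (the figures are taken at face value from the post, not independently checked):

```python
# Citations per year implied by the round numbers above.
bengio_citations, bengio_years = 1_000_000, 40
transformer_citations, transformer_years = 200_000, 8

bengio_rate = bengio_citations / bengio_years                 # per year
transformer_rate = transformer_citations / transformer_years  # per year

print(bengio_rate, transformer_rate)  # 25000.0 25000.0
```

With these round numbers, a single paper's citation rate matches the average rate of an entire 40-year career — which is the point of the comparison.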
0
0
0
@mciccone_AI
Marco Ciccone
2 months
I am at a point where I need a feature like "is it me or is it down" specifically for debugging multi-node and multi-GPU communications with NCCL
0
0
0
@mciccone_AI
Marco Ciccone
2 months
RT @murefil: This was a great group effort ❤️. Check the thread below! My 2c: we train a 32B coding agent by distilling strong teacher mod…
0
1
0
@chhaviyadav_
Chhavi Yadav
2 months
🚀 Federated Learning (FL) promises collaboration without data sharing. While Cross-Device FL is a success and deployed widely in industry, we don’t see Cross-Silo FL (collaboration between organizations) taking off despite huge demand and interest. Why could this be the case? 🤔
1
12
23
@mciccone_AI
Marco Ciccone
2 months
Heading to #COLM2025 in the beautiful Montreal 🍁 Excited to discuss distributed learning and modular approaches for mixing and merging specialized language models - Ping me if you are there!
0
0
4
@mciccone_AI
Marco Ciccone
2 months
It is so refreshing to see such an example of quality over quantity research in academia. Congrats @deepcohen!
@deepcohen
Jeremy Cohen
2 months
@jasondeanlee @SebastienBubeck @tomgoldsteincs @zicokolter @atalwalkar This is the third, last, and best paper from my PhD. By some metrics, an ML PhD student who writes just three conference papers is "unproductive." But I wouldn't have had it any other way 😉 !
0
0
1
@mciccone_AI
Marco Ciccone
4 months
If you want to learn about KFAC this is the best place to start!
@f_dangel
Felix Dangel
4 months
KFAC is everywhere—from optimization to influence functions. While the intuition is simple, implementation is tricky. We (@BalintMucsanyi, @2bys2 ,@runame_) wrote a ground-up intro with code to help you get it right. 📖 https://t.co/sIQfB1bmsE 💻
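The structural idea behind KFAC is that a layer's large curvature (Fisher) block is approximated as a Kronecker product of two small factors. Here is a minimal pure-Python sketch of that structure — the factor values and names are illustrative assumptions of mine, not taken from the linked intro, and real KFAC estimates the factors from activation and gradient statistics, which this toy does not do:

```python
# Toy illustration of the structure KFAC exploits: a large curvature block
# approximated as kron(A, G), where A and G are small per-layer factors.
# Real KFAC estimates A from input activations and G from output gradients.

def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    out = [[0.0] * (cols_a * cols_b) for _ in range(rows_a * rows_b)]
    for i in range(rows_a):
        for j in range(cols_a):
            for p in range(rows_b):
                for q in range(cols_b):
                    out[i * rows_b + p][j * cols_b + q] = a[i][j] * b[p][q]
    return out

A = [[1.0, 2.0], [3.0, 4.0]]  # toy activation-side factor
G = [[0.0, 1.0], [1.0, 0.0]]  # toy gradient-side factor

F_approx = kron(A, G)  # 4x4 approximation of the curvature block
print(F_approx[0])     # [0.0, 1.0, 0.0, 2.0]
```

The payoff is storage and inversion cost: for an n×n factor A and m×m factor G, you keep n² + m² numbers instead of the full (nm)² block, and the inverse factorizes as kron(A, G)⁻¹ = kron(A⁻¹, G⁻¹).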
0
0
0
@mciccone_AI
Marco Ciccone
4 months
💯 This is why we need modular and specialized models instead of generalist ones
@douwekiela
Douwe Kiela
4 months
GPT-5 is the most significant product release in AI history, but not for the reason you might think. What it signals is that we're moving from the "bigger model, better results" era to something much more nuanced. This is a genuine inflection point. The fact that people call a
0
0
1
@mciccone_AI
Marco Ciccone
4 months
If your PhD advisor has a statement like this, you know you have made the right choice. Good job and good luck with your new lab @maksym_andr! "...We are not necessarily interested in getting X papers accepted at NeurIPS/ICML/ICLR. We are interested in making an impact..."
@maksym_andr
Maksym Andriushchenko
4 months
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨 Hiring. I'm looking for multiple PhD students: both those able to start
0
1
13
@mciccone_AI
Marco Ciccone
4 months
Despite the impressive output of @gabriberton (super-deserved results), it seems like as good a time as any to remind ourselves that PhDs are not about the number of papers and that people should prioritize learning how to conduct research rather than maximizing meaningless metrics.
@gabriberton
Gabriele Berton
5 months
A few numbers from my PhD: 8 first-author top-conference (CVPR/ICCV/ECCV) papers 100% acceptance rate per paper 80% acceptance rate per submission 1 invited long talk at CVPR tutorial 5 top-conf demos (acceptance rate 100% vs ~30% average) ~2k GitHub stars
1
2
34
@mciccone_AI
Marco Ciccone
4 months
Conference networking be like
0
0
1
@mciccone_AI
Marco Ciccone
5 months
Not entirely true - better understanding of optimization issues of neural networks, residual connections, normalization layers… and in hindsight, ImageNet was clearly showing the way that data is all you need
@jxmnop
dr. jack morris
5 months
very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers we still don't have models that get smarter when we give them eyes
0
0
0
@mciccone_AI
Marco Ciccone
5 months
Nice parallel!
@TheyCallMeMr_
Saurabh Dash
5 months
ML labs seem to have quite the parallels with F1 teams. Off the top of my head:
0
0
1
@mciccone_AI
Marco Ciccone
5 months
Dan’s talk was a masterclass — Go watch the recording. Super clear, packed with results, and genuinely one of the most well-delivered talks I’ve seen in a while.
@danbusbridge
Dan Busbridge
5 months
Happening in 30 minutes in West Ballroom A - looking forward to sharing our work on Distillation Scaling Laws!
1
1
7
@ffabffrasca
Fabrizio Frasca
5 months
6/ @JoshSouthern13 and I will be at #ICML2025, poster session Tuesday — stop by and chat if you're around! ... I would also be happy to meet up and chat about graphs, (graphs and) LLMs, and how to detect their hallucinations 😳 Feel free to reach out!
0
1
4