
Stefan Horoi
@stefanhoroi
Followers: 48
Following: 9
Media: 7
Statuses: 13
PhD student at @UMontreal and @Mila_Quebec, currently working on model merging and representation comparison.
Montréal, Québec
Joined December 2016
🔎 Do better expert models always lead to better model merging & MoErging? And how does expert training (duration) affect model upcycling? We tackle these questions in our recent work: “Less is More: Undertraining Experts Improves Model Upcycling”. 🧵1/N
@gkdziugaite @ebelilov @mrguywolf We thank @Google, @Mila_Quebec, @CRSNG_NSERC, FRQNT and CIFAR for their generous research funding and support! 🧵9/N
This is joint work with @gkdziugaite, @ebelilov and @mrguywolf! 📜 Read our arXiv preprint here: Contact us with any questions or comments, or simply drop them below 👇🏻. We’d love to hear your thoughts! 🧵8/N
arxiv.org
Modern deep learning is increasingly characterized by the use of open-weight foundation models that can be fine-tuned on specialized datasets. This has led to a proliferation of expert models and...
RT @benjamintherien: How do MoE transformers, like DeepSeek, behave under distribution shifts? Do their routers collapse? Can they still ma…
Very excited to present our paper "Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis" at @icmlconf 2024! Come see our poster tomorrow, Wed., July 24th, 1:30-3pm. Paper: Code: @Mila_Quebec #ICML2024
My most sincere thanks to the Schulich Foundation, Mr. Seymour Schulich, and the Université de Montréal! #2017SLSquad