Firat Oncel
@firatoncell
Followers: 8
Following: 1
Media: 1
Statuses: 4
human being, cs phd student @concordia @Mila_Quebec, prev @ituvisionlab @itu1773
Montreal
Joined October 2024
@beyzaermis @mirco_ravanelli @CemSubakan @CgtyYldz We observed a similar trend across different model families and model sizes.
@beyzaermis @mirco_ravanelli @CemSubakan @CgtyYldz We ran our experiments with the GPT-2 model family, OLMo-1B, and LLaMA-7B. We found that additional pretraining does not always help, and that the dataset similarity measures we introduce can help guide that decision.
Excited to share my first PhD paper! Why does additional pretraining sometimes hurt rather than help? In this paper, we explore how LLMs behave under additional adaptation. We dive deep into the impact of pretraining data similarity and offer insights for future adaptations.