Mehmet Onurcan Kaya
@monurcan55
Followers: 7 | Following: 15 | Media: 0 | Statuses: 9
PhD Student @DTU_Compute | Prev. MSc/BSc @metu_eee
Copenhagen
Joined September 2025
Our paper ImageChain (with @IngoZiegler & @delliott) was accepted at #WACV2026! We explore how multimodal LLMs reason over sequences of images. I’ll present it at the @_LXAI Workshop @NeurIPS 🇲🇽 (Nov 30 ~10:45 Mex City). Come chat if you’re there! 🫶 📄 https://t.co/iQdaNZcDsn
0
8
19
Add tokens to an LLM without retraining the whole model. We introduce Token Distillation: attention-aware input embeddings for new tokens that match the model’s original behavior. How does it work? Check out the thread!
1
5
17
@IngoZiegler will present a synthetic data generation framework that rewrites real retrieved documents into task-specific finetuning examples. CRAFT is more stable than existing techniques like SelfInstruct and EvolInstruct across several tasks. Paper: https://t.co/8CBshW6wwv
1
3
5
@ilker_kesen will present PIXEL-M4, a multilingually pretrained PIXEL model that outperforms previous monolingual models. M4 uses the same architecture and pretraining setup as previous work, but multi-script pretraining improves performance. Paper: https://t.co/bim6SIRmjF
0
2
8
Looking forward to talking about Efficient Test-Time Scaling for Small Vision-Language Models at the University of Waterloo in the Davis Center 3301 at 4:30pm today. This is joint work with @monurcan55 and @dim_p_papa
https://t.co/Zx0YJ3HAtY
0
8
31
Riise et al., "Visual Autoregressive Models Beat Diffusion Models on Inference Time Scaling". Beam search with verifiers for autoregressive image generators.
1
17
157
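For readers unfamiliar with the idea in the post above, here is a minimal sketch of beam search steered by a verifier. It is not the paper's implementation: `propose`, `verifier`, and `verifier_beam_search` are hypothetical names, and the toy integer tokens stand in for whatever an autoregressive image generator actually emits.

```python
from typing import Callable, List, Sequence, Tuple

Seq = Tuple[int, ...]


def verifier_beam_search(
    propose: Callable[[Seq], Sequence[int]],  # hypothetical: candidate next tokens
    verifier: Callable[[Seq], float],         # hypothetical: higher score is better
    steps: int,
    beam_width: int,
) -> Seq:
    """Grow sequences step by step, keeping the partial sequences
    that the verifier scores highest."""
    beams: List[Seq] = [()]
    for _ in range(steps):
        # Expand every surviving beam with every proposed token.
        candidates = [beam + (tok,) for beam in beams for tok in propose(beam)]
        # Rank the expansions by verifier score and keep the top beam_width.
        candidates.sort(key=verifier, reverse=True)
        beams = candidates[:beam_width]
    return beams[0]


# Toy usage (assumed setup): tokens are 0..2 and the verifier
# prefers sequences whose sum is close to 5.
best = verifier_beam_search(
    propose=lambda prefix: [0, 1, 2],
    verifier=lambda seq: -abs(sum(seq) - 5),
    steps=3,
    beam_width=2,
)
```

A wider beam spends more verifier calls per step in exchange for a better chance of keeping a good partial sequence alive, which is the inference-time scaling knob in this kind of search.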
Efficient Test-Time Scaling for Small Vision-Language Models 🌐 Project Page: https://t.co/uQMnoUuIN2 📄 Paper: https://t.co/mCIKM0ymQr 💻 Code:
0
0
3
✨ Test-Time Augmentation: token-level aggregation of augmented inputs ⚡ Test-Time Adaptation: lightweight adaptation via pseudolabels ✅ Consistent gains on 9 benchmarks ✅ Runs on consumer GPUs ➡️ Resource-efficient, practical, and generalizable
1
0
3
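One way to picture "token-level aggregation of augmented inputs" is sketched below, under a loud assumption: the aggregation here is a plain average of next-token logits across augmented views, which may differ from what the paper does. `next_token_logits`, `aggregate_next_token`, and `decode_with_augmentation` are hypothetical names, and the model is a stand-in callable rather than a real vision-language model.

```python
from typing import Callable, List, Sequence


def aggregate_next_token(logits_per_view: Sequence[Sequence[float]]) -> int:
    """Average the logits over all views and return the argmax token id.
    Averaging is an assumption made for illustration."""
    n_views = len(logits_per_view)
    vocab_size = len(logits_per_view[0])
    mean = [
        sum(view[i] for view in logits_per_view) / n_views
        for i in range(vocab_size)
    ]
    return max(range(vocab_size), key=mean.__getitem__)


def decode_with_augmentation(
    views: Sequence[object],  # augmented versions of one input
    next_token_logits: Callable[[object, List[int]], Sequence[float]],  # hypothetical model call
    max_new_tokens: int,
    eos_id: int,
) -> List[int]:
    """Greedy decoding where every step aggregates over all views,
    so the views share a single generated prefix."""
    generated: List[int] = []
    for _ in range(max_new_tokens):
        per_view = [next_token_logits(view, generated) for view in views]
        token = aggregate_next_token(per_view)
        if token == eos_id:
            break
        generated.append(token)
    return generated


# Toy usage (assumed setup): two of three views vote for token 1,
# and the stand-in model emits EOS (id 2) after two tokens.
def toy_model(view: float, prefix: List[int]) -> List[float]:
    if len(prefix) >= 2:
        return [0.0, 0.0, 5.0]
    return [1.0 - view, view, 0.0]


tokens = decode_with_augmentation([0.0, 1.0, 1.0], toy_model, max_new_tokens=10, eos_id=2)
```

Aggregating at each token, rather than voting over finished answers, keeps all views on the same prefix, so a single view's early mistake cannot derail the whole output.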
Excited to share our latest work!🎉 We propose an efficient test-time scaling framework for small vision-language models via Test-Time Augmentation and Test-Time Adaptation. https://t.co/uQMnoUvgCA
@delliott @dim_p_papa
#TestTimeScaling #VLM #Multimodal #LLM #DeepLearning #AI
monurcan.github.io
We propose two efficient and effective methods improving multimodal small language models at test-time: TTAug (input augmentation + token-level aggregation) and TTAdapt (parameter adaptation via...
1
2
6