
Yusen Zhang
@YusenZhangNLP
Followers: 412
Following: 287
Media: 28
Statuses: 111
PhD Candidate @PennStateEECS | NLP Lab @NLP_PennState #NLProc | Prev Research Intern @MSFTResearch, @AmazonScience, @GoogleAI
State College, PA
Joined November 2022
🚀 How Far Are VLMs from Effective High-Resolution Image Understanding? 👉 We found: Still far. 🆕 Introducing HRScene Benchmark: 📸 25 Real-world Scenes + 🧪 2 Diagnostic NIAH Tests. 🏙️ 8 Categories: Daily, Paper, Urban Planning, etc. 🖼️ Resolution: 1,024 × 1,024 ➡️ 35,503 ×
RT @RyoKamoi: Our paper VisOnlyQA has been accepted to @COLM_conf #COLM2025! See you in Montreal 🍁. We find that even recent Vision Language…
HRScene got accepted at #ICCV2025! HRScene is a novel unified benchmark for high-resolution image understanding with 25 scenes and 2 NIAH tests. Home page: (Sorry, EvalAI submission is not working currently.) My PhD research began with long text
RT @GptMaestro: Vision Language Models display a peculiar blind spot: their ability to process image content declines in a U-shaped pattern…
RT @jackqqwang: NeuroGen: We explored a training-free idea, using prompts to guide large models to generate neural net parameters for downst…
RT @RyoKamoi: 📢 New paper! FoVer enhances PRMs for step-level verification of LLM reasoning w/o human annotation 🚀 We synthesize training d…
RT @iScienceLuvr: HRScene: How Far Are VLMs from Effective High-Resolution Image Understanding? "we introduce HRScene, a novel unified ben…
RT @ruizhang_nlp: As Vision Language Models treat images as tokens, high-resolution images create long sequences, similar to long-context c…
RT @vipul_1011: Ever wondered how much you can trust a benchmark? We did too - so we built SMART to make them smarter! I will be presenti…
RT @snigdhac25: Want to learn about fairness in summarization? @HaoyuanLi9 will present our work on fairness in multidocument summarization…
RT @ruizhang_nlp: This work is led by my first PhD student Yusen @YusenZhangNLP, who is graduating soon and actively seeking a postdoc posi…
RT @ruizhang_nlp: 🎉 Our paper on fairness of multidoc summarization has received an SAC award at NAACL 2025! 🥳 We appreciate the recognition…
I will be at NAACL this week. Feel free to find me if you have any thoughts on this project or any other research topics!
🚀 How Far Are VLMs from Effective High-Resolution Image Understanding? 👉 We found: Still far. 🆕 Introducing HRScene Benchmark: 📸 25 Real-world Scenes + 🧪 2 Diagnostic NIAH Tests. 🏙️ 8 Categories: Daily, Paper, Urban Planning, etc. 🖼️ Resolution: 1,024 × 1,024 ➡️ 35,503 ×
✍️ Authors: Yusen Zhang @YusenZhangNLP, Wenliang Zheng, Aashrith Madasu, Peng Shi, Ryo Kamoi @RyoKamoi, Hao Zhou @hao_zhh, Zhuoyang Zou, Shu Zhao, Sarkar Snigdha Sarathi Das @sarkarssdas, Vipul Gupta @vipul_1011, Xiaoxin Lu, Nan Zhang @NanZhangNLP, Ranran Haoran Zhang.