Jaden Park Profile
Jaden Park

@_jadenpark

Followers
84
Following
227
Media
3
Statuses
30

CS Ph.D. student @UWMadison, intern @AdobeResearch; foundation models | prev. intern: @Krafton_AI

Madison, WI
Joined October 2023
@_jadenpark
Jaden Park
3 months
Me: memorize past exams 📚💯 Also me: fail on a slight tweak 🤦‍♂️🤦‍♂️ Turns out, we can use the same method to detect contaminated VLMs! 🧵(1/10) - Project Page: https://t.co/ue1GybD4fm
1
10
27
@_jadenpark
Jaden Park
5 days
Excited to share that our work on detecting data contamination in VLMs has been accepted to #ICLR2026! In v2 of our paper, we add - Detecting contamination with paraphrased data. - Detecting contamination in free-form QA. To learn more: https://t.co/RtybGkLOOU See you in Rio 🇧🇷
0
0
16
@Kangwook_Lee
Kangwook Lee
2 months
LLM as a judge has become a dominant way to evaluate how good a model is at solving a task, since it works without a test set and handles cases where answers are not unique. But despite how widely this is used, almost all reported results are highly biased. Excited to share our
48
177
1K
@_jadenpark
Jaden Park
3 months
This is my first project at @UWMadison, with the following fantastic collaborators: @MuCai7 @fengyao1909 @shangjingbo Soochahn Lee @yong_jae_lee. If you have any questions, feedback, or new ideas, I'd be more than happy to discuss! 🧵(10/10)
0
0
3
@_jadenpark
Jaden Park
3 months
We also perform extensive ablation studies: (1) using real-world counterfactuals instead of synthetic perturbations (2) detecting contamination during pre-training (3) model sizes and much more. If this interests you, please check out our work: https://t.co/qnRKFscTdC 🧵(9/10)
arxiv.org
Recent advances in Vision-Language Models (VLMs) have achieved state-of-the-art performance on numerous benchmark tasks. However, the use of internet-scale, often proprietary, pretraining corpora...
1
0
0
@_jadenpark
Jaden Park
3 months
The contaminated models we test were 'adversarially/realistically' contaminated (i.e. for one epoch only!) but we were able to detect all contaminated models for varying epochs, training strategy and model type. 🧵(8/10)
1
0
0
@_jadenpark
Jaden Park
3 months
Our pipeline leads to questions with similar or easier difficulty (we discuss why in the paper). Models that can truly reason should achieve higher performance. This is the case for clean models. However, all contaminated models show a performance drop, as dramatic as -45%. 🧵(7/10)
1
0
0
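A minimal sketch of the signal described in (7/10): score a model on the original questions and on their perturbed counterparts, then look at the accuracy gap. The function names, exact-match scoring, and the zero threshold are all illustrative assumptions, not the paper's implementation.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("need at least one example")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def perturbation_gap(
    orig_preds: Sequence[str],
    orig_answers: Sequence[str],
    pert_preds: Sequence[str],
    pert_answers: Sequence[str],
) -> float:
    """Accuracy on perturbed pairs minus accuracy on the original pairs."""
    return accuracy(pert_preds, pert_answers) - accuracy(orig_preds, orig_answers)


def looks_contaminated(gap: float, threshold: float = 0.0) -> bool:
    """Flag a model whose accuracy drops after perturbation.

    The threshold is a hypothetical knob chosen for illustration only.
    """
    return gap < threshold
```

Under these assumptions, a model that scores perfectly on the original benchmark but loses accuracy on perturbed questions of similar or easier difficulty gets a negative gap and is flagged, while a model that holds or improves its score is not.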
@_jadenpark
Jaden Park
3 months
What is our method then? We create a semantically perturbed version of the image-question pair, where the original image composition is kept intact using ControlNet, but modified so that the answer needs to change. 🧵(6/10)
1
0
0
@_jadenpark
Jaden Park
3 months
More specifically, simply modifying the questions or the image is not enough and shows inconsistent behavior that even directly contradicts the core assumptions of the approaches. 🧵(5/10)
1
0
0
@_jadenpark
Jaden Park
3 months
Existing contamination detection methods were developed for LLMs, so a natural question is: do they work for VLMs? To test this, we utilize VQA benchmarks with strict visual dependence, and verify that all existing algorithms fail to satisfy most of the requirements! 🧵(4/10)
1
0
0
@_jadenpark
Jaden Park
3 months
For a detection algorithm to be useful in real-world scenarios: (1) it should work without knowing which models are contaminated (2) it needs to be robust to different training strategies (e.g. LoRA) (3) models that are contaminated more should have stronger signals. 🧵(3/10)
1
0
0
@_jadenpark
Jaden Park
3 months
We propose Multi-modal Semantic Perturbation for detecting data contamination in VLMs. To the best of our knowledge, this is the first detection algorithm that is (1) practical (2) reliable and (3) consistent! 🧵(2/10)
1
0
0
@jon_ghoh
Jongho Park
3 months
Love this. We also saw benefits of replacing position encodings with NoPE + Mamba in ICL tasks for Mamba-Attn hybrids in
arxiv.org
State-space models (SSMs), such as Mamba (Gu & Dao, 2023), have been proposed as alternatives to Transformer networks in language modeling, by incorporating gating, convolutions, and...
@teortaxesTex
Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
3 months
Very confident decision
0
1
4
@fredsala
Fred Sala
4 months
Super excited to present our new work on hybrid architecture models, getting the best of Transformers and SSMs like Mamba, at #COLM2025! Come chat with @nick11roberts at poster session 2 on Tuesday. Thread below! (1)
2
28
70
@huang43602
ZeyiHuang
9 months
🚨 Our new paper: VisualToolAgent (VisTA) 🚨 Visual agents learn to use tools, with no prompts or supervision! ✅ RL via GRPO ✅ Decoupled agent/reasoner (e.g. GPT-4o) ✅ Strong OoD generalization 📊 ChartQA, Geometry3K, BlindTest, MathVerse 🔗 https://t.co/HDcnGImOUQ 🧵👇
2
5
12
@_jadenpark
Jaden Park
10 months
Excited to share that I will be at @AdobeResearch in San Jose, CA as a Research Scientist Intern under @vdeschaintre @michi_fischer @iliyang and @Krishnakusin ! Looking forward to my first California experience! I would love to connect and catch up with anyone in the area :)
1
0
6
@ErnestRyu
Ernest Ryu
10 months
Public service announcement: Multimodal LLMs are really bad at understanding images with *precision*. https://t.co/X83vFAcmCR A thread 🧵: 1/13.
@lukeprog
Luke Muehlhauser
10 months
Tyler Cowen: "I've seen enough, I'm calling it, o3 is AGI" Meanwhile, o3 in response to the first prompt I give it:
1
11
51