
Junting Pan @ICCV 2025
@junting9
Followers: 928 · Following: 2K · Media: 12 · Statuses: 136
Research Scientist @ Apple AIML | Prev: PhD@MMLab CUHK, Intern @AIatMeta (FAIR) and @samsungresearch. Working on foundation models.
Joined December 2013
Our MathVision benchmark has been accepted to the NeurIPS 2024 Datasets and Benchmarks Track! We show a notable performance gap between current LMMs and human performance on simple math problems with visual context. Dataset: https://t.co/vhCqwCHwSU Paper: https://t.co/GB2nyWxoGp
The Foundation Model Team @🍎Apple AI/ML is looking for a Research Intern (flexible start date) to work on Multimodal LLMs and Vision-Language. Interested? DM me to learn more!
Last year at @Apple MLR, we published a number of interesting papers, including AIM, AIMv2, and scaling laws for sparsity, native multimodal models, and data mixing. Today the team has open-sourced the training codebase we used for conducting this research! https://t.co/WNvOWMkgm3
github.com
Large multi-modal models (L3M) pre-training (apple/ml-l3m).
In this report we describe the 2025 Apple Foundation Models ("AFM"). We also introduce the new Foundation Models framework, which gives app developers direct access to the on-device AFM model. https://t.co/nEbtxuGrjD
machinelearning.apple.com
We introduce two multilingual, multimodal foundation language models that power Apple Intelligence features across Apple devices and…
Our computer vision textbook is now available for free online here: https://t.co/ERy2Spc7c2 We are working on adding some interactive components like search and (beta) integration with LLMs. Hope this is useful, and feel free to submit GitHub issues to help us improve the text!
visionbook.mit.edu
🌟Thrilled to share that SAM 2 received a Best Paper Honorable Mention at #ICLR2025, one of 6 papers recognized out of 11,000+ submissions! 👏This project was the result of amazing work by an exceptional team at @AIatMeta FAIR: @vgabeur, @YuanTingHu1, @RonghangHu,
Honorable Mentions:
Data Shapley in One Training Run. Jiachen T. Wang, et al.
SAM 2: Segment Anything in Images and Videos. Nikhila Ravi, et al.
Faster Cascades via Speculative Decoding. Harikrishna Narasimhan, et al.
I am looking for strong PhD interns to join Apple MLR in late 2024 or early 2025! Topics will broadly be around training large-scale diffusion/flow-matching models, and you'll be in the Bay Area (Cupertino/SF). Apply here: https://t.co/5gKIBnK6oP. [1/5]
Looking for a 2025 summer research intern in the Foundation Model Team at Apple AI/ML, with a focus on Multimodal LLMs / Vision-Language. PhD preferred. Apply through https://t.co/m243cnfXay and also email your resume to haoxuanyou@gmail.com! 😊
So much fun at the #AdobeMAX sneak! As a researcher, this is by far the biggest stage I have ever stepped on. Turns out it's easier for an introvert to interact with a 10k+ audience than with one person :p Grateful for all the applause, energy, and support!
I am presenting at #AdobeMAX next week! Get a sneak peek at our latest research on image composition and relighting on Oct 15th at the MAX sneak session (5:30 to 7 pm EST). Online registration (free): https://t.co/d2VCC0rFzO
SAM2 is truly amazing! I am so proud to have been a part of this incredible team!
Introducing Meta Segment Anything Model 2 (SAM 2) — the first unified model for real-time, promptable object segmentation in images & videos. SAM 2 is available today under Apache 2.0 so that anyone can use it to build their own experiences Details ➡️ https://t.co/eTTDpxI60h
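Since the post above notes that SAM 2 is released under Apache 2.0 for anyone to build on, here is a minimal sketch of promptable image segmentation with it, assuming the open-source `sam2` package from the SAM 2 release; the checkpoint path, config name, image file, and point prompt below are placeholders rather than anything taken from the post.

```python
# Minimal sketch: prompt SAM 2 with a single foreground point on one image.
# Assumes the open-source `sam2` package; paths and coordinates are placeholders.
import numpy as np
import torch
from PIL import Image

from sam2.build_sam import build_sam2
from sam2.sam2_image_predictor import SAM2ImagePredictor

CHECKPOINT = "./checkpoints/sam2_hiera_large.pt"  # placeholder: downloaded weights
MODEL_CFG = "sam2_hiera_l.yaml"                   # placeholder: matching model config

predictor = SAM2ImagePredictor(build_sam2(MODEL_CFG, CHECKPOINT))
image = np.array(Image.open("example.jpg").convert("RGB"))  # placeholder image

with torch.inference_mode():
    predictor.set_image(image)
    # One positive click at (x=500, y=375); label 1 marks it as foreground.
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[500, 375]]),
        point_labels=np.array([1]),
        multimask_output=True,
    )

best_mask = masks[scores.argmax()]  # keep the highest-scoring mask proposal
```

The same predictor also accepts box prompts, and the package ships a video predictor for propagating masks across frames; this snippet only illustrates the simplest image case.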
🥁 Llama3 is out 🥁 8B and 70B models available today. 8k context length. Trained with 15 trillion tokens on a custom-built 24k GPU cluster. Great performance on various benchmarks, with Llama3-8B doing better than Llama2-70B in some cases. More versions are coming over the next
📢📢 I am looking for a student researcher to work with me and my colleagues at Google DeepMind Zürich on vision-language research. It will be a fully onsite, 24-week position in Switzerland. Reach out to me (xzhai@google.com) if interested. Bonus: amazing view🏔️👇
Glad to share that our project Relightful Harmonization: Lighting-aware portrait background replacement has been accepted to #CVPR2024. 🧵 Project page: https://t.co/XRl3tWBzVw Preprint: https://t.co/xMZspSdvxx
Muffin or Chihuahua in a multipanel image? Most people can tell, but GPT-4V struggles! Contrary to the popular belief that only experts can outperform (Multimodal) LLMs, average humans often prove more capable. Our Multipanel VQA study reveals this gap, where human accuracy
Distinguish muffins from chihuahuas in a multipanel web screenshot? No problem for humans (99% accuracy), but hard for Large Vision-Language Models (LVLMs) (39-72% accuracy)! To find out how LVLMs do and what affects their ability regarding multipanel image understanding, we
#CUHK has learned with deep sorrow about the passing of Prof Tang Xiaoou of the Department of Information Engineering. Prof Tang joined CUHK in 1998 and was one of the most influential scientists working in the AI field. Details: https://t.co/lgBOc98iPA