
Jieyu Zhang @ CVPR
@JieyuZhang20
Followers
662
Following
243
Media
12
Statuses
97
PhD student @uwcse | Undergrad @IllinoisCS | Intern @allen_ai @MSFTResearch @SFResearch | Apple Scholar in AI/ML (‘24) | Data-centric AI/ML
Seattle, WA
Joined November 2022
Excited to share my intern project at Salesforce Research! Huge thanks to everyone on the team!!.
🔬🔬🔬Introducing ProVision: A new system for transforming images into verified instruction data for multimodal language models (MLMs) at massive scale! .Scene graphs + programmatic synthesis generate 10M+ diverse, automated Q&A pairs. Fully verifiable. Training MLMs? Dive in:
0
16
81
Tokenization kickstarts every Transformer pipeline—shaping how models digest data. Our latest work introduces semantic, grounded video tokenization, leveraging objectness cues to boost efficiency and performance of video understanding models.
Having trouble dealing with the excessive token number when processing a video? Check out our paper that is accepted by ICCV 2025 with an average score of 5.5! We tokenize video with tokens grounded in trajectories of all objects rather than fix-sized patches. Trained with a
0
2
15
RT @ManlingLi_: Can VLMs build Spatial Mental Models like humans?. Reasoning from limited views?.Reasoning from partial observations?.Reaso….
0
58
0
Calling all #CVPR2025 attendees!. Join us at the SynData4CV Workshop at @CVPR (Jun 11 full day at Grand C2, starting at 9am) to learn more about recent advancements in synthetic data for CV!. Explore more:
syndata4cv.github.io
[“CVPR 2025 Workshop”, “June 11th, 2025, Grand C2”, “Nashville, TN, United States”]
The 2nd Synthetic Data for Computer Vision workshop at @CVPR! We had a wonderful time last year, and we want to build on that success by fostering fresh insights into synthetic data for CV. Join us!. We welcome submissions! Please consider submitting your work! (deadline: March.
0
8
14
RT @ehsanik: I’ll be at CVPR giving two talks at Synthetic Data for Computer Vision Workshop and 3D Scene Understanding for Vision. I’m s….
0
5
0
RT @ajratner: Agentic AI will transform every enterprise–but only if agents are trusted experts. The key: Evaluation & tuning on specializ….
0
76
0
Nice done! Stay tuned for the model weights 😀.
Tool-using LLMs can learn to reason—without reasoning traces. 🔥 We present Nemotron-Research-Tool-N1, a family of tool-using reasoning LLMs trained entirely via rule-based reinforcement learning—no reasoning supervision, no distillation. 📄 Paper: 💻
0
0
10
RT @ManlingLi_: We are very excited announcing our MLL lab!. We are looking for collaborators on RAGEN, VAGEN, Chain-of-experts, T*, LongVi….
0
50
0
RT @ShaokunZhang1: (ICML 2025 Spotlight-Top 2.6%) Multi-agent LLM systems still fail—but who caused it, and when?. 🔥 We introduce Who&When,….
0
37
0
We are recruiting reviewers for our workshop! Please fill out this form if you're interested!
docs.google.com
We are inviting self-nominations for Reviewers for the 2nd Synthetic Data for Computer Vision Workshop @ CVPR 2025. You should have published papers in top computer vision or machine learning...
The 2nd Synthetic Data for Computer Vision workshop at @CVPR! We had a wonderful time last year, and we want to build on that success by fostering fresh insights into synthetic data for CV. Join us!. We welcome submissions! Please consider submitting your work! (deadline: March.
0
3
6
The 2nd Synthetic Data for Computer Vision workshop at @CVPR! We had a wonderful time last year, and we want to build on that success by fostering fresh insights into synthetic data for CV. Join us!. We welcome submissions! Please consider submitting your work! (deadline: March.
3
9
25
RT @allen_ai: We took our most efficient model and made an open-source iOS app📱but why?. As phones get faster, more AI will happen on devic….
0
107
0
RT @VentureBeat: Breaking the data bottleneck: Salesforce's ProVision speeds multimodal AI training with image scene graphs .
venturebeat.com
Salesforce is using structured representation of image semantics to power programs that synthesize instruction datasets for AI training.
0
6
0
Thank you!! @VentureBeat and @MarkTechPost.
We're so excited to see @VentureBeat and @MarkTechPost cover ProVision! We're tackling the visual instruction data challenge with scene graphs + human-written programs, already seeing 3-8% improvements across benchmarks. The real win? A more #OpenSourced, reproducible approach.
0
1
8
RT @linxins2: 🚀 Excited to share our new paper!. We show how instruction templates hugely impact Multimodal Language Models performance and….
0
18
0
RT @twelve_labs: The webinar recording with @JieyuZhang20, Yiqi Zhong, and @MuCai7 is up!. Watch here: 📺. They disc….
0
1
0
RT @zixianma02: Here’re some examples of 🌮TACO🌮.Find more examples on our:.🌐Website: 💻Demo: .
0
18
0
RT @SFResearch: 🌮 Introducing 🌮 TACO - our new family of multimodal action models that combine reasoning with real-world actions to solve c….
0
58
0