Yue Yang

@YueYangAI

Followers: 590 · Following: 159 · Media: 31 · Statuses: 109

Research scientist @allen_ai | PhD @upennnlp | Vision and Language

Joined July 2018
@realliyifei
Li S. Yifei
2 months
How well can LLMs & deep research systems synthesize long-form answers to *thousands of research queries across diverse domains*? Excited to announce 🎓📖 ResearchQA: a large-scale benchmark to evaluate long-form scholarly question answering at scale across 75 fields, using
1 reply · 24 reposts · 61 likes
@allen_ai
Ai2
3 months
🤖✨ What if models that take action in the physical world could think through your instructions? Meet MolmoAct, our new fully open Action Reasoning Model (ARM) that does just that. 🧵
15 replies · 81 reposts · 341 likes
@jeffrey_ch0
Jeffrey (Young-Min) Cho
6 months
🤖💬 Herding instincts… in AIs? Yes, even LLMs can follow the crowd! • 📉 Conformity ↑ when agents lack confidence but trust peers • 🧠 Presentation format shapes peer influence • 🎯 Controlled herding can boost collaboration outcomes 👉 Read more: https://t.co/Ym0rtKyVzH
0 replies · 8 reposts · 13 likes
@YueYangAI
Yue Yang
6 months
Successfully defended my PhD thesis and got hooded this week! Thanks to all the friends who supported me throughout this incredible journey! Excited to join PRIOR at @allen_ai next and continue exploring open vision-language research!
16 replies · 4 reposts · 155 likes
@YueYangAI
Yue Yang
6 months
🎉 CoSyn is accepted to ACL 2025!
@YueYangAI
Yue Yang
9 months
We share Code-Guided Synthetic Data Generation: using LLM-generated code to create multimodal datasets for text-rich images, such as charts📊, documents📄, etc., to enhance Vision-Language Models. Website: https://t.co/9IQ4CgeKMF Dataset: https://t.co/yiERrZup8X Paper:
0 replies · 0 reposts · 7 likes
@jeffrey_ch0
Jeffrey (Young-Min) Cho
7 months
#NAACL2025 How can we compare cultural differences with social media data at scale? Our work uses lexica to annotate X 🇺🇸 & Weibo 🇨🇳 posts with valence (😄☹️) & arousal (🔥❄️) scores, revealing cross-cultural differences in emotional expression. https://t.co/2tNFceO9GD
aclanthology.org
Young Min Cho, Dandan Pang, Stuti Thapa, Garrick Sherman, Lyle Ungar, Louis Tay, Sharath Chandra Guntuku. Findings of the Association for Computational Linguistics: NAACL 2025. 2025.
0 replies · 4 reposts · 13 likes
@AnnieFeng6
Yu Feng
7 months
#ICLR2025 Oral LLMs often struggle with reliable and consistent decisions under uncertainty 😵‍💫 — largely because they can't reliably estimate the probability of each choice. We propose BIRD 🐦, a framework that significantly enhances LLM decision making under uncertainty. BIRD
2 replies · 40 reposts · 259 likes
@artemispng
Artemis Panagopoulou
9 months
Exciting news! 🎉 Our paper “ViUniT: Visual Unit Tests for More Robust Visual Programming” got accepted at #CVPR2025
@SFResearch
Salesforce AI Research
9 months
🎉Just Announced: "ViUniT: Visual Unit Tests for More Robust Visual Programming" has been accepted at #CVPR2025! Paper Link: https://t.co/nbLc1yq991 Project Page: https://t.co/rH9Z9uMMKC Researcher’s walk-through 👇 In collaboration with @UPenn, we introduce ViUniT, a framework
0 replies · 2 reposts · 17 likes
@codezakh
Zaid Khan
9 months
✨ Introducing MutaGReP (Mutation-guided Grounded Repository Plan Search) - an approach that uses LLM-guided tree search to find realizable plans that are grounded in a target codebase without executing any code! Ever wanted to provide an entire repo containing 100s of 1000s of
1 reply · 39 reposts · 87 likes
@YueYangAI
Yue Yang
9 months
This work is done during my great summer internship at @allen_ai with my awesome collaborators: Ajay Patel, @mattdeitke, @tanmay2099, @LucaWeihs, @drewmikehead, @yatskar, Chris Callison-Burch, @RanjayKrishna, @anikembhavi, Christopher Clark.
0 replies · 0 reposts · 4 likes
@YueYangAI
Yue Yang
9 months
We also show we can create synthetic pointing data to improve the click accuracy of VLMs in GUI agent tasks. On the ScreenSpot click prediction benchmark, our model trained on synthetic pointing data can outperform existing methods with much less training data.
1 reply · 0 reposts · 5 likes
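A synthetic pointing example pairs a rendered screenshot and a click instruction with target coordinates; because the image is produced from code, the position of each element is known at generation time, so the click target can be labeled automatically. A minimal, purely illustrative record layout in Python (field names are assumptions, not the released dataset schema):

# Purely illustrative pointing-data record; field names are assumptions,
# not the released dataset schema.

from dataclasses import dataclass, asdict


@dataclass
class PointingExample:
    image_path: str   # code-rendered screenshot
    instruction: str  # natural-language click instruction
    x: float          # normalized horizontal click target in [0, 1]
    y: float          # normalized vertical click target in [0, 1]


example = PointingExample(
    image_path="synthetic/screens/0001.png",
    instruction="Click the 'Submit' button at the bottom of the form.",
    x=0.48,
    y=0.91,
)
print(asdict(example))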
@YueYangAI
Yue Yang
9 months
We notice open VLMs struggle with novel out-of-domain tasks like interpreting nutrition labels. However, CoSyn’s controllable data generation can create targeted synthetic data for task-specific fine-tuning, achieving strong zero-shot performance with significantly less data.
1 reply · 0 reposts · 4 likes
@YueYangAI
Yue Yang
9 months
On 7 text-rich benchmarks (e.g., ChartQA, DocVQA), our model trained on synthetic data outperforms competitive open and proprietary VLMs. Our zero-shot model, trained without benchmark examples, beats most baselines, proving the generalizability of training on synthetic data.
2 replies · 0 reposts · 4 likes
@YueYangAI
Yue Yang
9 months
CoSyn uses code as the intermediate representation to build synthetic multimodal datasets. We prompt a text-only LLM to generate code that renders images, and then we use code as context to create instruction-tuning data, such as QA pairs, for fine-tuning VLMs.
1 reply · 0 reposts · 5 likes
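As a rough illustration of the loop described in the tweet above, here is a minimal Python sketch of code-guided synthetic data generation: a text-only LLM writes rendering code, the code is executed to produce the image, and the same code then serves as context for generating a QA pair. The call_llm() stub and all other names are placeholders for illustration, not the released CoSyn implementation.

# Sketch of code-guided synthetic data generation in the spirit of CoSyn.
# Not the released pipeline: call_llm() is a stub and all names are illustrative.

import matplotlib
matplotlib.use("Agg")  # render headlessly, no display needed


def call_llm(prompt: str) -> str:
    """Placeholder for a text-only LLM call; returns canned outputs so the sketch runs."""
    if "Python code" in prompt:
        return (
            "import matplotlib.pyplot as plt\n"
            "fig, ax = plt.subplots()\n"
            "ax.bar(['2021', '2022', '2023'], [12, 18, 25])\n"
            "ax.set_title('Annual Revenue (USD M)')\n"
        )
    return "Q: Which year had the highest revenue?\nA: 2023, with 25 million USD."


def render_image(code: str, out_path: str) -> None:
    # Execute the LLM-generated rendering code in a fresh namespace and save the
    # resulting figure; real use would sandbox this step.
    namespace: dict = {}
    exec(code, namespace)
    namespace["fig"].savefig(out_path, dpi=150)


def synthesize_example(topic: str, out_path: str) -> dict:
    # Step 1: the LLM writes code that renders a text-rich image (here, a chart).
    code = call_llm(f"Write Python code (matplotlib) that renders a chart about {topic}.")
    # Step 2: execute the code to obtain the image.
    render_image(code, out_path)
    # Step 3: reuse the code as context to generate instruction-tuning data (a QA pair).
    qa = call_llm(f"Given this rendering code:\n{code}\nWrite a question-answer pair about the chart.")
    return {"image": out_path, "qa": qa}


if __name__ == "__main__":
    print(synthesize_example("annual revenue", "chart.png"))

In the actual framework, the many rendering tools and pipelines mentioned in the tweets above stand in for the single matplotlib path sketched here.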
@YueYangAI
Yue Yang
9 months
Our CoSyn framework integrates 11 rendering tools for 20 robust generation pipelines, which support the creation of diverse text-rich images, including charts, documents, diagrams, tables, and even music sheets 🎼, and many more!
1 reply · 0 reposts · 5 likes
@YueYangAI
Yue Yang
9 months
We share Code-Guided Synthetic Data Generation: using LLM-generated code to create multimodal datasets for text-rich images, such as charts📊, documents📄, etc., to enhance Vision-Language Models. Website: https://t.co/9IQ4CgeKMF Dataset: https://t.co/yiERrZup8X Paper:
6 replies · 48 reposts · 196 likes
@LongLeRobot
Long Le
10 months
Articulate Anything has just been accepted to @iclr_conf #ICLR2025! Looking forward to seeing everyone in Singapore 🇸🇬 🙀❤️!
@LongLeRobot
Long Le
11 months
📦 Can frontier AI transform ANY physical object from ANY input modality into a high-quality digital twin that also MOVES? Excited to share our work, Articulate-Anything, exploring how large vision-language models (VLMs) can bridge the gap between the physical and digital
3 replies · 8 reposts · 44 likes
@Ai2Prior
Prior @ AI2
1 year
📢Applications are open for summer'25 internships at the PRIOR (computer vision) team @allen_ai: Come join us in building large-scale models for: 📸 Open-source Vision-Language Models 💻 Multimodal Web Agents 🤖 Embodied AI + Robotics 🌎 Planet Monitoring Apply by December
1 reply · 13 reposts · 47 likes
@cmalaviya11
Chaitanya Malaviya
1 year
Excited to share ✨ Contextualized Evaluations ✨! Benchmarks like Chatbot Arena contain underspecified queries, which can lead to arbitrary eval judgments. What happens if we provide evaluators with context (e.g., who's the user, what's their intent) when judging LM outputs? 🧵↓
2 replies · 31 reposts · 122 likes
@veronica3207
Veronica Qing Lyu
1 year
🤔What model explanation method should you use? How to ensure it reflects the model’s true reasoning? 🌟 In our CL survey, Towards Faithful Model Explanation in NLP, we review 110+ explainability methods through the lens of faithfulness. Check out my presentation at #EMNLP2024!
1 reply · 8 reposts · 33 likes