Yue Yang
@YueYangAI
Followers: 590
Following: 159
Media: 31
Statuses: 109
Research scientist @allen_ai | PhD @upennnlp | Vision and Language
Joined July 2018
How well can LLMs & deep research systems synthesize long-form answers to *thousands of research queries across diverse domains*? Excited to announce 🎓📖 ResearchQA: a large-scale benchmark to evaluate long-form scholarly question answering across 75 fields, using
1
24
61
🤖✨ What if models that take action in the physical world could think through your instructions? Meet MolmoAct, our new fully open Action Reasoning Model (ARM) that does just that. 🧵
15
81
341
🤖💬 Herding instincts… in AIs? Yes, even LLMs can follow the crowd!
• 📉 Conformity ↑ when agents lack confidence but trust peers
• 🧠 Presentation format shapes peer influence
• 🎯 Controlled herding can boost collaboration outcomes
👉 Read more: https://t.co/Ym0rtKyVzH
0
8
13
Successfully defended my PhD thesis and got hooded this week! Thanks to all the friends who supported me throughout this incredible journey! Excited to join PRIOR at @allen_ai next and continue exploring open vision-language research!
16
4
155
🎉CoSyn is accepted by ACL2025!
0
0
7
#NAACL2025 How to compare cultural differences with social media data at scale? Our work uses lexica to annotate X 🇺🇸 & Weibo 🇨🇳 posts with valence (😄☹️) & arousal (🔥❄️) scores, revealing cross-cultural differences in emotional expression. https://t.co/2tNFceO9GD
Young Min Cho, Dandan Pang, Stuti Thapa, Garrick Sherman, Lyle Ungar, Louis Tay, Sharath Chandra Guntuku. Findings of the Association for Computational Linguistics: NAACL 2025. 2025.
0
4
13
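To make the lexicon approach above concrete, here is a toy sketch of lexicon-based valence/arousal scoring; the miniature LEXICON, its weights, and the tokenizer are illustrative assumptions, not the paper's actual lexica or preprocessing.

```python
# Toy lexicon-based affect scoring (illustrative only; the paper's lexica,
# weights, and preprocessing will differ).
import re

# Hypothetical miniature lexicon: word -> (valence, arousal), both in [-1, 1].
LEXICON = {
    "happy": (0.8, 0.5),
    "calm": (0.6, -0.7),
    "angry": (-0.7, 0.8),
    "sad": (-0.6, -0.4),
}

def score_post(text: str):
    """Average valence/arousal over lexicon words found in the post."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[t] for t in tokens if t in LEXICON]
    if not hits:
        return None  # no lexicon coverage for this post
    valence = sum(v for v, _ in hits) / len(hits)
    arousal = sum(a for _, a in hits) / len(hits)
    return valence, arousal

print(score_post("So happy and calm this morning"))  # positive valence, mild arousal
print(score_post("angry about the traffic again"))   # negative valence, high arousal
```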
#ICLR2025 Oral: LLMs often struggle to make reliable and consistent decisions under uncertainty 😵💫, largely because they can't reliably estimate the probability of each choice. We propose BIRD 🐦, a framework that significantly enhances LLM decision making under uncertainty. BIRD
2
40
259
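For context on the problem BIRD targets, here is the usual shortcut for per-choice probabilities, a softmax over option log-likelihoods, which the post argues is unreliable under uncertainty; `option_logprob` is a hypothetical placeholder, and BIRD itself is not sketched here.

```python
# The common shortcut: turn per-option log-likelihoods into a choice
# distribution with a softmax. The post argues such direct estimates are
# unreliable under uncertainty; BIRD (not shown here) addresses this.
import math

def option_logprob(context: str, option: str) -> float:
    """Hypothetical placeholder: total log-prob the LLM assigns to `option` given `context`."""
    raise NotImplementedError("wire up to your LLM's scoring API")

def choice_distribution(context: str, options: list[str]) -> dict[str, float]:
    logps = [option_logprob(context, o) for o in options]
    m = max(logps)
    exps = [math.exp(lp - m) for lp in logps]  # numerically stable softmax
    z = sum(exps)
    return {o: e / z for o, e in zip(options, exps)}
```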
Exciting news! 🎉 Our paper “ViUniT: Visual Unit Tests for More Robust Visual Programming” got accepted at #CVPR2025
🎉Just Announced: "ViUniT: Visual Unit Tests for More Robust Visual Programming" has been accepted at #CVPR2025! Paper Link: https://t.co/nbLc1yq991 Project Page: https://t.co/rH9Z9uMMKC Researcher’s walk-through 👇 In collaboration with @UPenn, we introduce ViUniT, a framework
0
2
17
✨ Introducing MutaGReP (Mutation-guided Grounded Repository Plan Search) - an approach that uses LLM-guided tree search to find realizable plans that are grounded in a target codebase without executing any code! Ever wanted to provide an entire repo containing 100s of 1000s of
1
39
87
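A rough sketch of what LLM-guided, execution-free plan search over a repository could look like, based only on the description in the announcement above; `propose_mutations`, `score_plan`, and `repo_index` are hypothetical placeholders, not MutaGReP's actual interfaces.

```python
# Hypothetical best-first search over plan mutations, grounded in a code
# index and never executing repository code (placeholders throughout).
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class PlanNode:
    neg_score: float  # heapq is a min-heap, so store the negated score
    steps: list = field(compare=False, default_factory=list)    # natural-language plan steps
    symbols: list = field(compare=False, default_factory=list)  # repo symbols grounding each step

def propose_mutations(node: PlanNode, query: str) -> list:
    """Placeholder: ask an LLM to add/edit/refine the plan's steps for `query`."""
    raise NotImplementedError

def score_plan(node: PlanNode, repo_index) -> float:
    """Placeholder: how well the plan's steps map onto real symbols in the repo index."""
    raise NotImplementedError

def plan_search(query: str, repo_index, budget: int = 50) -> PlanNode:
    root = PlanNode(neg_score=0.0)
    frontier, best = [root], root
    while frontier and budget > 0:
        node = heapq.heappop(frontier)
        for child in propose_mutations(node, query):
            child.neg_score = -score_plan(child, repo_index)
            heapq.heappush(frontier, child)
            best = min(best, child)  # lower neg_score == better-grounded plan
            budget -= 1
    return best
```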
This work was done during my great summer internship at @allen_ai with my awesome collaborators: Ajay Patel, @mattdeitke, @tanmay2099, @LucaWeihs, @drewmikehead, @yatskar, Chris Callison-Burch, @RanjayKrishna, @anikembhavi, Christopher Clark.
0
0
4
We also show we can create synthetic pointing data to improve the click accuracy of VLMs in GUI agent tasks. On the ScreenSpot click prediction benchmark, our model trained on synthetic pointing data can outperform existing methods with much less training data.
1
0
5
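As a toy illustration of why code-rendered UIs yield pointing data cheaply (an assumed setup, not the authors' pipeline): when the layout is drawn programmatically, each element's pixel coordinates are known, so (instruction, click point) pairs come for free.

```python
# Draw a fake UI with Pillow; because we place the buttons ourselves, the
# ground-truth click point for each instruction is known exactly.
from PIL import Image, ImageDraw

def render_buttons(labels, size=(640, 400)):
    """Return a synthetic screenshot plus (instruction, point) supervision."""
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    targets = []
    for i, label in enumerate(labels):
        x0, y0 = 40, 40 + i * 70
        x1, y1 = x0 + 200, y0 + 50
        draw.rectangle([x0, y0, x1, y1], outline="black", width=2)
        draw.text((x0 + 10, y0 + 15), label, fill="black")
        targets.append({"instruction": f"Click the '{label}' button",
                        "point": ((x0 + x1) // 2, (y0 + y1) // 2)})  # button center
    return img, targets

img, data = render_buttons(["Submit", "Cancel", "Settings"])
img.save("synthetic_ui.png")
print(data[0])  # {'instruction': "Click the 'Submit' button", 'point': (140, 65)}
```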
We notice open VLMs struggle with novel out-of-domain tasks like interpreting nutrition labels. However, CoSyn’s controllable data generation can create targeted synthetic data for task-specific fine-tuning, achieving strong zero-shot performance with significantly less data.
1
0
4
On 7 text-rich benchmarks (e.g., ChartQA, DocVQA), our model trained on synthetic data outperforms competitive open and proprietary VLMs. Our zero-shot model, trained without benchmark examples, beats most baselines, demonstrating the generalizability of training on synthetic data.
2
0
4
CoSyn uses code as the intermediate representation to build synthetic multimodal datasets. We prompt a text-only LLM to generate code that renders images, and then we use the code as context to create instruction-tuning data, such as QA pairs, for fine-tuning VLMs.
1
0
5
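A minimal sketch of the code-as-intermediate-representation pipeline described above, under assumed interfaces: `call_llm` is a placeholder for any text-only LLM API, and the prompts are illustrative rather than CoSyn's actual ones.

```python
# Code-as-intermediate-representation, in miniature: generate rendering code,
# execute it to get an image, then reuse the code as context for QA pairs.
import subprocess
import textwrap

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to a text-only LLM and return its reply."""
    raise NotImplementedError("wire up to your LLM provider of choice")

def generate_chart_example(topic: str) -> dict:
    # 1) Ask the LLM for self-contained plotting code that renders a text-rich image.
    code = call_llm(
        f"Write self-contained matplotlib code that saves a bar chart about "
        f"'{topic}' to chart.png, with realistic labels and values."
    )
    # 2) Execute the generated code to render the image.
    with open("render_chart.py", "w") as f:
        f.write(code)
    subprocess.run(["python", "render_chart.py"], check=True)
    # 3) Use the *code* (not the pixels) as context to write instruction-tuning
    #    data, since the code fully specifies what the rendered chart contains.
    qa_text = call_llm(
        "Given the plotting code below, write 3 question-answer pairs that a "
        "person could answer by looking only at the rendered chart.\n\n"
        + textwrap.indent(code, "    ")
    )
    return {"image": "chart.png", "code": code, "qa": qa_text}
```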
Our CoSyn framework integrates 11 rendering tools for 20 robust generation pipelines, which support the creation of diverse text-rich images: charts, documents, diagrams, tables, even music sheets 🎼, and many more!
1
0
5
We share Code-Guided Synthetic Data Generation: using LLM-generated code to create multimodal datasets for text-rich images, such as charts📊, documents📄, etc., to enhance Vision-Language Models. Website: https://t.co/9IQ4CgeKMF Dataset: https://t.co/yiERrZup8X Paper:
6
48
196
Articulate Anything has just been accepted to @iclr_conf #ICLR2025! Looking forward to seeing everyone in Singapore 🇸🇬 🙀❤️!
📦 Can frontier AI transform ANY physical object from ANY input modality into a high-quality digital twin that also MOVES? Excited to share our work, Articulate-Anything, exploring how large vision-language models (VLMs) can bridge the gap between the physical and digital
3
8
44
📢 Applications are open for summer '25 internships at the PRIOR (computer vision) team @allen_ai. Come join us in building large-scale models for:
📸 Open-source Vision-Language Models
💻 Multimodal Web Agents
🤖 Embodied AI + Robotics
🌎 Planet Monitoring
Apply by December
1
13
47
Excited to share ✨ Contextualized Evaluations ✨! Benchmarks like Chatbot Arena contain underspecified queries, which can lead to arbitrary eval judgments. What happens if we provide evaluators with context (e.g., who's the user, what's their intent) when judging LM outputs? 🧵↓
2
31
122
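A minimal sketch of what a contextualized pairwise judge prompt could look like, illustrative only and not the paper's exact protocol; the context is passed as explicit question-answer pairs about the user and their intent.

```python
# Build a judge prompt that includes follow-up context about the user and
# their intent alongside the underspecified query (names and wording are
# illustrative, not the paper's prompts).
def build_judge_prompt(query: str, answer_a: str, answer_b: str,
                       context: dict | None = None) -> str:
    context_block = ""
    if context:
        lines = [f"- {q} {a}" for q, a in context.items()]
        context_block = "Context about the request:\n" + "\n".join(lines) + "\n\n"
    return (
        f"{context_block}"
        f"Query: {query}\n\n"
        f"Response A:\n{answer_a}\n\n"
        f"Response B:\n{answer_b}\n\n"
        "Considering the context above (if any), which response better serves "
        "this user? Answer 'A' or 'B' and briefly justify."
    )

print(build_judge_prompt(
    "Explain quantum entanglement",
    "<candidate answer A>", "<candidate answer B>",
    context={"Who is the user?": "a high-school student",
             "What is their intent?": "an intuitive, math-light overview"},
))
```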
🤔What model explanation method should you use? How to ensure it reflects the model’s true reasoning? 🌟 In our CL survey, Towards Faithful Model Explanation in NLP, we review 110+ explainability methods through the lens of faithfulness. Check out my presentation at #EMNLP2024!
1
8
33