Yujia Qin @TsingYoga X Profile

Yujia Qin

@TsingYoga

Followers

5K

Following

249

Media

73

Statuses

334

ByteDancer, Agent, THU (16-20 BS in EE, 20-24 PhD in CS)

Beijing

Joined February 2019

Yujia Qin

@TsingYoga

3 months

Introducing UI-TARS-1.5, a vision-language model that beats OpenAI Operator and Claude 3.7 on GUI Agent and Game Agent tasks. We've open-sourced a small version of the model for research purposes; more details can be found in our blog. TARS learns solely from a screen, but

18

48

206

Yujia Qin

@TsingYoga

3 days

RT @deedydas: China's Gaokao is the biggest exam in the world: 13M test takers and 9hrs. ~0.02% make it to the top uni, Tsinghua. As of thi…

0

204

0

Yujia Qin

@TsingYoga

8 days

Meet Agent TARS Beta, based on Seed1.5-VL.

Agent TARS

@agent_tars

8 days

Since we have released a brand new Agent TARS CLI based on Seed1.5-VL, see , we have to say goodbye to the old Agent TARS Desktop

2

22

Yujia Qin

@TsingYoga

11 days

RT @_ulivz: Introducing Agent TARS Beta, a brand new and more powerful Agent TARS! - Agent TARS CLI - Browser Agent driven by Seed-1.5-VL…

0

16

0

Yujia Qin

@TsingYoga

12 days

RT @_zhaoheh_: 🚀 UI-TARS Desktop v0.2.1 is now live! Free Remote Computer & Browser Operator are ready to roll: no setup, just click and go🎁…

0

5

0

Yujia Qin

@TsingYoga

12 days

RT @_ulivz: We have a major release coming up

0

7

0

Yujia Qin

@TsingYoga

18 days

RT @_jasonwei: One way of thinking about what AI will automate first is via the “description-execution gap”: how much harder is it to descr…

0

45

0

Yujia Qin

@TsingYoga

22 days

RT @sainingxie: Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for…

0

61

0

Yujia Qin

@TsingYoga

1 month

RT @ysu_nlp: @DimitrisPapail I’d argue that computer use, in principle, is much harder than math/coding for current AI. the digital world e…

0

3

0

Yujia Qin

@TsingYoga

1 month

Guess it's the first open-source multi-turn e2e RL for GUI Agents from academia, and it's based on UI-TARS-1.5-7B. If you want to study multimodal Agent RL, it is a good starting point.

9

67

437

Yujia Qin

@TsingYoga

1 month

The cua community is really amazing!!

Dhruv

@dhruv2038

1 month

We now have local computer-use! M3 Pro 18GB running both UI-TARS-1.5-7B-6bit and a macOS Sequoia VM entirely locally using MLX and c/ua at ~30 seconds/action. Do it yourself here:

0

1

6

Yujia Qin

@TsingYoga

1 month

Interesting to know the previous operator is based on 4o, not even o1. OpenAI is shifting from reasoning models (o3) to agent models (operator, codex, and deepresearch), with gradual integration of agent data streams from multiple teams—evident in GAIA’s jump from 12.3 to 62.2.

OpenAI

@OpenAI

1 month

Operator 🤝 OpenAI o3. Operator in ChatGPT has been updated with our latest reasoning model.

3

61

Yujia Qin

@TsingYoga

2 months

RT @giffmana: OK, ByteDance Seed is now firmly a top tier lab in my mind. Congrats on many solid works recently, continuously publishing an…

0

39

0

Yujia Qin

@TsingYoga

2 months

Interesting findings on the behavior emerging from unified multimodal scaling.

AK

@_akhaliq

2 months

ByteDance just dropped BAGEL on Hugging Face. The Open-Source Unified Multimodal Model

0

3

27