Yujia Qin

@TsingYoga

Followers: 5K
Following: 249
Media: 73
Statuses: 334

ByteDancer, Agent, THU (16-20 BS in EE, 20-24 PhD in CS)

Beijing
Joined February 2019
@TsingYoga
Yujia Qin
3 months
Introducing UI-TARS-1.5, a vision-language model that beats OpenAI Operator and Claude 3.7 on GUI Agent and Game Agent tasks. We've open-sourced a small-size version model for research purposes, more details can be found in our blog. TARS learns solely from a screen, but
18
48
206
@TsingYoga
Yujia Qin
3 days
RT @deedydas: China's Gaokao is the biggest exam in the world: 13M test takers and 9hrs. ~0.02% make it to the top uni, Tsinghua. As of thi…
0
204
0
@TsingYoga
Yujia Qin
8 days
Meet Agent TARS Beta, based on Seed1.5-VL.
@agent_tars
Agent TARS
8 days
Since we have released a brand new Agent TARS CLI based on Seed1.5-VL, we have to say goodbye to the old Agent TARS Desktop
2
2
22
@TsingYoga
Yujia Qin
11 days
RT @_ulivz: Introducing Agent TARS Beta, a brand new and more powerful Agent TARS! - Agent TARS CLI - Browser Agent driven by Seed-1.5-VL…
0
16
0
@TsingYoga
Yujia Qin
12 days
RT @_zhaoheh_: 🚀 UI-TARS Desktop v0.2.1 is now live! Free Remote Computer & Browser Operator are ready to roll - no setup, just click and go🎁…
0
5
0
@TsingYoga
Yujia Qin
12 days
RT @_ulivz: We have a major release coming up
0
7
0
@TsingYoga
Yujia Qin
18 days
RT @_jasonwei: One way of thinking about what AI will automate first is via the “description-execution gap”: how much harder is it to descr…
0
45
0
@TsingYoga
Yujia Qin
22 days
RT @sainingxie: Had a great time at this CVPR community-building workshop---lots of fun discussions and some really important insights for…
0
61
0
@TsingYoga
Yujia Qin
1 month
RT @ysu_nlp: @DimitrisPapail I’d argue that computer use, in principle, is much harder than math/coding for current AI. the digital world e…
0
3
0
@TsingYoga
Yujia Qin
1 month
Guess it's the first open-source multi-turn e2e RL for GUI Agents from academia, and it's based on UI-TARS-1.5-7B. If you want to study multimodal Agent RL, it is a good starting point~
9
67
437
@TsingYoga
Yujia Qin
1 month
The cua community is really amazing!!
@dhruv2038
Dhruv
1 month
We now have local computer-use! M3 Pro 18GB running both UI-TARS-1.5-7B-6bit and a macOS Sequoia VM entirely locally using MLX and c/ua at ~30 seconds/action. Do it yourself here:
0
1
6
@TsingYoga
Yujia Qin
1 month
Interesting to know the previous operator is based on 4o, not even o1. OpenAI is shifting from reasoning models (o3) to agent models (operator, codex, and deepresearch), with gradual integration of agent data streams from multiple teams—evident in GAIA’s jump from 12.3 to 62.2.
@OpenAI
OpenAI
1 month
Operator 🤝 OpenAI o3. Operator in ChatGPT has been updated with our latest reasoning model.
3
3
61
@TsingYoga
Yujia Qin
2 months
RT @giffmana: OK, ByteDance Seed is now firmly a top tier lab in my mind. Congrats on many solid works recently, continuously publishing an…
0
39
0
@TsingYoga
Yujia Qin
2 months
Interesting findings on the behavior emerging from unified multimodal scaling.
@_akhaliq
AK
2 months
ByteDance just dropped BAGEL on Hugging Face. The Open-Source Unified Multimodal Model
0
3
27