Xiao Liu (Shaw) @ShawLiu12 X Profile

Xiao Liu (Shaw)

@ShawLiu12

Followers: 537

Following: 352

Media: 19

Statuses: 115

PhD @Tsinghua @THUKEG. Developing P-Tuning, ChatGLM, AgentBench, and AutoGLM. 📖 Sharing paper digests on LLMs.

Joined October 2020

Xiao Liu (Shaw)

@ShawLiu12

1 year

🚨 Thrilled to present VisualAgentBench (VAB) with @yugu_nlp and Tianjie, where we enable both TRAINING & TESTING of visual foundation agents across 5 different environments! In all, 17 large multimodal models (LMMs) are tested. Find our paper, data, and more insights below 👇

1

16

48

Xiao Liu (Shaw)

@ShawLiu12

2 days

RT @Trinkle23897: Finally. OAI internally talked about releasing open-source model since 2022 and we got close a few times since then. No…

0

4

0

Xiao Liu (Shaw)

@ShawLiu12

2 days

RT @Trinkle23897: Harmony format is finally open-sourced. I still remember 3 years ago (before ChatGPT release) @shengjia_zhao, Daniel and…

github.com

Renderer for the harmony response format to be used with gpt-oss - openai/harmony

0

156

0

Xiao Liu (Shaw)

@ShawLiu12

3 days

RT @lmarena_ai: 🔥 BREAKING: @Zai_org’s GLM-4.5 enters the top-5 in Arena! With 4K+ community votes, it now ranks #5 Overall in the Text Are…

0

26

0

Xiao Liu (Shaw)

@ShawLiu12

9 days

RT @sam_paech: GLM-4.5 gets a very strong result on EQ-Bench & Longform Writing. In creative writing it's a litt…

0

23

0

Xiao Liu (Shaw)

@ShawLiu12

11 days

RT @casper_hansen_: o3 competitor: GLM 4.5 by Zhipu AI
- hybrid reasoning model (on by default)
- trained on 15T tokens
- 128k context, 96k…

0

105

0

Xiao Liu (Shaw)

@ShawLiu12

11 days

Best open model ever, try it now.

Z.ai

@Zai_org

11 days

Introducing GLM-4.5 and GLM-4.5 Air: new flagship models designed to unify frontier reasoning, coding, and agentic capabilities.

GLM-4.5: 355B total / 32B active parameters
GLM-4.5-Air: 106B total / 12B active parameters

API Pricing (per 1M tokens):
GLM-4.5: $0.6 Input / $2.2

0

4

Xiao Liu (Shaw)

@ShawLiu12

6 months

#Meta researchers have unveiled MLGym-Bench, the most comprehensive framework yet for evaluating the intelligence of LLMs in AI research. First-ever ML gym environment spanning CV, NLP, RL & game theory with 13 diverse tasks. Even GPT-4o & Claude-3.5 struggle with true

1

9

31

Xiao Liu (Shaw)

@ShawLiu12

6 months

RT @teortaxesTex: First time I see a Xiaomi AI paper. Natural focus on mobile GUI flows. Waiting for their first work with Luo Fuli on boar…

0

2

0

Xiao Liu (Shaw)

@ShawLiu12

6 months

🔥 Top Chinese smartphone maker #Xiaomi unveils ReachAgent, a mobile AI agent framework. 🚀 Boosts step-level IoU & accuracy by rethinking how agents handle GUI tasks. Breaking tasks into subtasks + a 2-stage process = smarter, faster results! 🧠📱 #AI #LLM #AgenticAI #AGI

2

15

54

Xiao Liu (Shaw)

@ShawLiu12

6 months

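RT @CunxiangWang: Honored to have taken part from start to finish in the birth of the new Zhipu GLM, which ranks in the Top 9 on lmarena. It would have been even better if the results had come out a few days earlier, though 🤣. Honored to fully participate in the birth of the new Zhipu GLM,…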

0

2

0

Xiao Liu (Shaw)

@ShawLiu12

6 months

#Apple uses RL to boost a 3.2B LLM phone-use agent to outperform #OpenAI o1 by 9%. The work focuses on the poor performance of IDAs in executing complex tasks, especially in digital environments that require multi-step interactions and state management. It addresses the

0

6

Xiao Liu (Shaw)

@ShawLiu12

6 months

Diving into the world of LLM agents! 🚀 Starting today, I'll share insights from the newest and sharpest papers I read. The agentic AI wave is rising, and 2025-2026 will be game-changing. Let’s explore, learn, and shape the future together! 🔥 #LLM #AgenticAI

0

1

26

Xiao Liu (Shaw)

@ShawLiu12

7 months

RT @rohanpaul_ai: Self-play with tree-search helps LLMs learn instruction-following capability. SPAR introduces a self-play framework tha…

0

24

0

Xiao Liu (Shaw)

@ShawLiu12

8 months

Congrats!

1

0

1

Xiao Liu (Shaw)

@ShawLiu12

8 months

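RT @aigclink: At today's OpenDay, Zhipu released its all-new Agent product series: the upgraded AutoGLM, AutoGLM-Web, and GLM-PC, targeting phones, browsers, and computers respectively, enabling AI to intelligently operate a range of devices. 1. The upgraded AutoGLM can essentially complete complex tasks of more than 50 steps and supports cross-…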

0

20

0

Xiao Liu (Shaw)

@ShawLiu12

9 months

RT @dair_ai: The Top ML Papers of the Week (Nov 4 - 10):
- WebRL
- Magentic-One
- Personalization of LLMs
- Survey of Small Language Model…

0

63

0

Xiao Liu (Shaw)

@ShawLiu12

9 months

RT @dair_ai: 8) WebRL - proposes a self-evolving online curriculum RL framework to bridge the gap between open and proprietary LLM-based w…

0

3

0

Xiao Liu (Shaw)

@ShawLiu12

9 months

RT @ChatGLM: 🚀 Introducing AutoGLM! A new milestone in the ChatGLM family, AutoGLM is here to enable foundation agents for autonomous contr…

arxiv.org

We present AutoGLM, a new series in the ChatGLM family, designed to serve as foundation agents for autonomous control of digital devices through Graphical User Interfaces (GUIs). While foundation...

0

7

0

Xiao Liu (Shaw)

@ShawLiu12

9 months

RT @omarsar0: Proposes a self-evolving online curriculum RL framework to bridge the gap between open and proprietary LLM-based web agents. …

0

63

0

Xiao Liu (Shaw)

@ShawLiu12

9 months

RT @papers_anon: AutoGLM: Autonomous Foundation Agents for GUIs. Focuses on Web Browser and Android as the representative GUI scenarios. Fo…

0

1

0