Leon Liangyu Chen @cliangyu_ X Profile

Leon Liangyu Chen

@cliangyu_

Followers

853

Following

1K

Media

24

Statuses

207

PhD student @stanfordailab, intern @Meta Superintelligence Labs

Stanford, CA

Joined June 2017

Leon Liangyu Chen

@cliangyu_

13 days

RT @fengyao1909: Failing on 𝐥𝐚𝐫𝐠𝐞-𝐬𝐜𝐚𝐥𝐞 𝐑𝐋 with VeRL? ⚠️ Mixing inference backend (𝐯𝐋𝐋𝐌/𝐒𝐆𝐋𝐚𝐧𝐠) with training backends (𝐅𝐒𝐃𝐏/𝐌𝐞𝐠𝐚𝐭𝐫𝐨𝐧) 𝐬𝐞𝐜…

0

108

0

Leon Liangyu Chen

@cliangyu_

13 days

RT @gneubig: Summary of GPT-OSS architectural innovations: 1. sliding window attention (ref: ) 2. mixture of expert…

arxiv.org

Deploying Large Language Models (LLMs) in streaming applications such as multi-round dialogue, where long interactions are expected, is urgently needed but poses two major challenges. Firstly,...

0

360

0

Leon Liangyu Chen

@cliangyu_

14 days

RT @Trinkle23897: Harmony format is finally open-sourced. I still remember 3 years ago (before ChatGPT release) @shengjia_zhao, Daniel and…

github.com

Renderer for the harmony response format to be used with gpt-oss - openai/harmony

0

157

0

Leon Liangyu Chen

@cliangyu_

22 days

Claude Code sometimes writes print statements to "mock" something and then claims it is implemented. Interestingly, it knows this is wrong but still does it occasionally. I don't think this behaviour is reward hacking. What can it be?

2

0

3

Leon Liangyu Chen

@cliangyu_

1 month

RT @Cohere_Labs: We’re excited to share that our work “Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement” w…

0

4

0

Leon Liangyu Chen

@cliangyu_

1 month

RT @sedrickkeh2: 📢📢📢 Releasing OpenThinker3-1.5B, the top-performing SFT-only model at the 1B scale! 🚀 OpenThinker3-1.5B is a smaller vers…

0

31

0

Leon Liangyu Chen

@cliangyu_

1 month

RT @alexgshaw: Evaluating agents on benchmarks is a pain. Each benchmark comes with its own harness, scoring scripts, and environments and…

0

22

0

Leon Liangyu Chen

@cliangyu_

1 month

DevOps issues beyond SWE, such as environment configuration, cause a lot of headaches. We build Terminal-Bench for LLMs to hillclimb on, solving intricate CLI problems.

Mike A. Merrill

@Mike_A_Merrill

1 month

Terminal-Bench and @warpdotdev @zachlloydtweets in TechCrunch today :) (link in replies). I firmly believe that the future of LLM-Computer interaction is through something that looks like a terminal interface. Great to see this picking up steam.

0

5

Leon Liangyu Chen

@cliangyu_

1 month

RT @Yuchenj_UW: @karpathy @tszzl Very insightful. The new training paradigm (can be called “lesson-based learning”) can be a self-supervis…

0

8

0

Leon Liangyu Chen

@cliangyu_

1 month

RT @Mike_A_Merrill: It's great to see Terminal-Bench on the Kimi K2 model card. We love open source models, and just made it even easier to…

0

4

0

Leon Liangyu Chen

@cliangyu_

1 month

RT @agihippo: Academia should pivot to just being a school to train people to do research focusing on engineering and being able to work /…

0

1

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @LongTonyLian: Excited to share that Describe Anything has been accepted at ICCV 2025! 🎉 Describe Anything Model (DAM) is a powerful Mu…

0

26

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @andykonwinski: Today, I’m launching a deeply personal project. I’m betting $100M that we can help computer scientists create more upsid…

0

121

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @YutongBAI1002: What would a World Model look like if we start from a real embodied agent acting in the real world? It has to have: 1)…

0

127

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @MercatJean: We evaluated more than 1000 reasoning LLMs on 12 reasoning-focused benchmarks and made fascinating observations about cross…

0

19

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @Zoya_ai: Reality is cooked. No one can spot AI videos anymore. 10 wild examples: 1. Kitty Olympics, looks 100% real, but it’s fully AI…

0

5K

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @xuandongzhao: 🔥 Computer-use agents are a rising trend in AI research & industry. The problem? Current datasets are expensive and unsc…

0

1

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @shah_suket: The most underrated talk in all of YC Startup School was by @Jacob_Heller. The 3 steps to build in AI B2B from the CEO of C…

0

61

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @snowmaker: Thought provoking points from @karpathy at AI Startup School today. Made me see LLMs in a whole new light.

0

158

0

Leon Liangyu Chen

@cliangyu_

2 months

RT @Mike_A_Merrill: Many agents (Claude Code, Codex CLI) interact with the terminal to do valuable tasks, but do they currently work well e…

0

64

0