Terry Yue Zhuo @terryyuezhuo X Profile

Terry Yue Zhuo

@terryyuezhuo

Followers: 2K

Following: 9K

Media: 149

Statuses: 1K

@BigCodeProject-{⚔️Arena, 📊Bench} | Going Stealth | @codelm_tutorial EMNLP’25

https://t.co/sSwKkiT7KH

Joined May 2020

Terry Yue Zhuo

@terryyuezhuo

2 months

It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.

BigCode

@BigCodeProject

2 months

Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.

11

37

129

Terry Yue Zhuo

@terryyuezhuo

19 days

Basically that’s what I’ve been working on.

Anthropic

@AnthropicAI

19 days

We disrupted a highly sophisticated AI-led espionage campaign. The attack targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We assess with high confidence that the threat actor was a Chinese state-sponsored group.

0

2

19

Terry Yue Zhuo

@terryyuezhuo

29 days

When models get stronger, the scaffoldings will be more simplified.

0

1

Terry Yue Zhuo

@terryyuezhuo

29 days

As models become more accessible, it’s impossible to completely prevent malicious use, so systems need to anticipate how ppl might use those models to attack them.

0

2

Terry Yue Zhuo

@terryyuezhuo

1 month

Let’s do a survey of those good and bad surveys.

will brown

@willccbb

2 months

can we please be serious

0

3

Terry Yue Zhuo

@terryyuezhuo

1 month

Every benchmark can have a live version.

0

7

Terry Yue Zhuo

@terryyuezhuo

2 months

@qinkai1028 @Zai_org

huggingface.co

0

1

Terry Yue Zhuo

@terryyuezhuo

2 months

GLM-4.6 is now live on BigCodeArena. Shout-out to @qinkai1028 and the whole @Zai_org team for this great model!

Terry Yue Zhuo

@terryyuezhuo

2 months

It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.

1

2

26

AK

@_akhaliq

2 months

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

1

10

62

Tanishq Abraham @ NeurIPS

@iScienceLuvr

2 months

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution "we introduce BigCodeArena, an open human evaluation platform for code generation backed by a comprehensive and on-the-fly execution environment. Built on top of Chatbot Arena, BigCodeArena

3

6

58

Terry Yue Zhuo

@terryyuezhuo

2 months

Vibe coding is small in scope but big in impact. It’s how you learn what actually feels good to build. Those quick React prototypes are just a small part of the bigger picture. Think bigger.

0

7

Terry Yue Zhuo

@terryyuezhuo

2 months

Just a classic one, but with PyGame this time 😛 "A ball bouncing inside a spinning hexagon, with the full control"

BigCode

@BigCodeProject

2 months

Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.

0

2

7

Terry Yue Zhuo

@terryyuezhuo

2 months

https://t.co/u1AwUBVbMS

Terry Yue Zhuo

@terryyuezhuo

2 months

Just a classic one, but with PyGame this time 😛 "A ball bouncing inside a spinning hexagon, with the full control"

0

3

Wasi Ahmad

@ahmadwasi

2 months

Try BigCodeArena with different frontier models!

Terry Yue Zhuo

@terryyuezhuo

2 months

It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.

0

1

Terry Yue Zhuo

@terryyuezhuo

2 months

@karpathy should really see this😎

0

3

Terry Yue Zhuo

@terryyuezhuo

2 months

cc @altryne @ivanfioravanti @qinkai1028 @olafgeibig who were curious about this. Sorry for the delay!

2

0

3

Noah Ziems

@NoahZiems

2 months

Excited to have helped out on BigCodeArena led by @terryyuezhuo !

Terry Yue Zhuo

@terryyuezhuo

2 months

It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.

1

2

18

Terry Yue Zhuo

@terryyuezhuo

2 months

Special thanks to @abidlabs @clefourrier from @huggingface team, @mlejva from @e2b, @hyperbolic_labs team, and @Alibaba_Qwen team!🤗

1

0

3

BigCode

@BigCodeProject

2 months

Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.

4

29

79