Terry Yue Zhuo

@terryyuezhuo

Followers
2K
Following
9K
Media
149
Statuses
1K

@BigCodeProject-{⚔️Arena, 📊Bench} | Going Stealth | @codelm_tutorial EMNLP’25

Joined May 2020
@terryyuezhuo
Terry Yue Zhuo
2 months
It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.
@BigCodeProject
BigCode
2 months
Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.
11
37
129
@terryyuezhuo
Terry Yue Zhuo
19 days
Basically that’s what I’ve been working on.
@AnthropicAI
Anthropic
19 days
We disrupted a highly sophisticated AI-led espionage campaign. The attack targeted large tech companies, financial institutions, chemical manufacturing companies, and government agencies. We assess with high confidence that the threat actor was a Chinese state-sponsored group.
0
2
19
@terryyuezhuo
Terry Yue Zhuo
29 days
As models get stronger, the scaffolding around them will get simpler.
0
0
1
@terryyuezhuo
Terry Yue Zhuo
29 days
As models become more accessible, it’s impossible to completely prevent malicious use, so systems need to anticipate how ppl might use those models to attack them.
0
0
2
@terryyuezhuo
Terry Yue Zhuo
1 month
Let’s do a survey of those good and bad surveys.
@willccbb
will brown
2 months
can we please be serious
0
0
3
@terryyuezhuo
Terry Yue Zhuo
1 month
Every benchmark can have a live version.
0
0
7
@terryyuezhuo
Terry Yue Zhuo
2 months
GLM-4.6 is now live on BigCodeArena. Shout-out to @qinkai1028 and the whole @Zai_org team for this great model!
@terryyuezhuo
Terry Yue Zhuo
2 months
It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.
1
2
26
@_akhaliq
AK
2 months
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
1
10
62
@iScienceLuvr
Tanishq Abraham @ NeurIPS
2 months
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution "we introduce BigCodeArena, an open human evaluation platform for code generation backed by a comprehensive and on-the-fly execution environment. Built on top of Chatbot Arena, BigCodeArena
3
6
58
@terryyuezhuo
Terry Yue Zhuo
2 months
Vibe coding is small in scope but big in impact. It’s how you learn what actually feels good to build. Those quick React prototypes are just a small part of the bigger picture. Think bigger.
0
0
7
@terryyuezhuo
Terry Yue Zhuo
2 months
Just a classic one, but with PyGame this time 😛 "A ball bouncing inside a spinning hexagon, with the full control"
@BigCodeProject
BigCode
2 months
Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.
0
2
7
@terryyuezhuo
Terry Yue Zhuo
2 months
@terryyuezhuo
Terry Yue Zhuo
2 months
Just a classic one, but with PyGame this time 😛 "A ball bouncing inside a spinning hexagon, with the full control"
0
0
3
@ahmadwasi
Wasi Ahmad
2 months
Try BigCodeArena with different frontier models!
@terryyuezhuo
Terry Yue Zhuo
2 months
It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.
0
1
1
@terryyuezhuo
Terry Yue Zhuo
2 months
@karpathy should really see this😎
0
0
3
@terryyuezhuo
Terry Yue Zhuo
2 months
cc @altryne @ivanfioravanti @qinkai1028 @olafgeibig who were curious about this. Sorry for the delay!
2
0
3
@NoahZiems
Noah Ziems
2 months
Excited to have helped out on BigCodeArena led by @terryyuezhuo !
@terryyuezhuo
Terry Yue Zhuo
2 months
It’s so much fun working with the other 39 community members on this project! Start to try out various frontier models in BigCodeArena today.
1
2
18
@terryyuezhuo
Terry Yue Zhuo
2 months
Special thanks to @abidlabs @clefourrier from @huggingface team, @mlejva from @e2b, @hyperbolic_labs team, and @Alibaba_Qwen team!🤗
1
0
3
@BigCodeProject
BigCode
2 months
Introducing BigCodeArena, a human-in-the-loop platform for evaluating code through execution. Unlike current open evaluation platforms that collect human preferences on text, it enables interaction with runnable code to assess functionality and quality across any language.
4
29
79