Michael Zhang

@michaelrzhang

Followers
2K
Following
1K
Media
50
Statuses
426

PhD student doing machine learning / neural networks research @UofT @VectorInst. Prev: @UCBerkeley. Journey before destination.

Toronto
Joined August 2017
@michaelrzhang
Michael Zhang
19 days
Life update: I've recently moved to Boston and started a job @AmazonScience! I'm excited to explore - please share local recs and let me know if you want to grab coffee! (picture: White Mountains, NH)
14
2
192
@michaelrzhang
Michael Zhang
1 month
RT @phil_fradkin: The news is out! We're starting Blank Bio to build a computational toolkit assisted with RNA foundation models. If you wa…
0
25
0
@michaelrzhang
Michael Zhang
5 months
RT @alexalbert__: We wrote up what we've learned about using Claude Code internally at Anthropic. Here are the most effective patterns we'…
0
561
0
@michaelrzhang
Michael Zhang
6 months
RT @AndrewYNg: Some people today are discouraging others from learning programming on the grounds AI will automate it. This advice will be…
0
3K
0
@michaelrzhang
Michael Zhang
6 months
RT @emollick: The past 18 months have seen the most rapid change in human written communication ever. By September 2024, 18% of financial…
0
282
0
@michaelrzhang
Michael Zhang
6 months
RT @Shalev_lif: Hot off the Servers 🔥💻 --- we’ve found a new approach for scaling test-time compute! Multi-Agent Verification (MAV) scales…
0
49
0
@michaelrzhang
Michael Zhang
6 months
RT @alexalbert__: One of the things we've been most impressed by internally at Anthropic is Claude 3.7 Sonnet's one-shot code generation ab…
0
190
0
@michaelrzhang
Michael Zhang
7 months
87 and 97 both getting golden goals is poetic.
0
1
7
@michaelrzhang
Michael Zhang
7 months
RT @JJWatt: It’s just incredible how much of a home run 4 Nations has been for the NHL and hockey in general. Friends who never watched a…
0
4K
0
@michaelrzhang
Michael Zhang
7 months
RT @danbusbridge: Reading "Distilling Knowledge in a Neural Network" left me fascinated and wondering: "If I want a small, capable model,…
arxiv.org
We propose a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the student and teacher. Our findings mitigate the risks...
0
149
0
@michaelrzhang
Michael Zhang
7 months
don't need AGI for this one
0
0
6
@michaelrzhang
Michael Zhang
7 months
RT @IlyaAbyzov: Inspired by @karpathy and the idea of using games to compare LLMs, I've built a version of the game Codenames where differe…
0
226
0
@michaelrzhang
Michael Zhang
7 months
progress on math has been so fast. I remember how impressive Minerva was when it came out. now you need to be good at competitive math to evaluate these latest models.
2
0
8
@michaelrzhang
Michael Zhang
7 months
RT @emollick: Been waiting for someone to test this and see if it really works - can multiple AI agents fact-checking each other reduce hal…
0
250
0
@michaelrzhang
Michael Zhang
7 months
RT @Hesamation: a cool diagram (bottom half) of how deepseek r1's GRPO works from the trl 🤗 library
0
184
0
@michaelrzhang
Michael Zhang
7 months
RT @karpathy: TinyZero reproduction of R1-Zero. "experience the Aha moment yourself for < $30". Given a base model, the RL finetuning can b…
0
403
0
@michaelrzhang
Michael Zhang
8 months
RT @nsrg_shah: I'm looking for a postdoc to work on algorithmic fairness, AI alignment, cooperative AI, AI safety, and related topics from…
0
26
0
@michaelrzhang
Michael Zhang
8 months
RT @SeunghyunSEO7: The concept of critical batch size is quite simple. Let’s assume we have a training dataset with 1M tokens. If we use a…
0
87
0
@michaelrzhang
Michael Zhang
8 months
RT @karpathy: I still do this most days and I think it works great. My morning brain (right after 1hr exercise and 1 coffee) is quite eager…
0
832
0