Tianzhe Chu

@TianzheC

Followers: 277
Following: 255
Media: 18
Statuses: 80

Now @hkudatascience. Previously @ShanghaiTechUni, visited @UCBerkeley.

Berkeley, CA
Joined September 2022
@TianzheC
Tianzhe Chu
6 months
[1/n] 🧐@deepseek_ai #DeepSeekR1 has shown the power of RL without SFT. But what does RL learn differently than SFT? Our answer is: 📉SFT Memorizes, RL Generalizes.📈
tianzhechu.com
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
6
43
178
@TianzheC
Tianzhe Chu
25 days
RT @Kai__He: 🚀 Introducing UniRelight, a general-purpose relighting framework powered by video diffusion models. 🌟UniRelight jointly model…
0
43
0
@TianzheC
Tianzhe Chu
25 days
RT @twominutepapers: NVIDIA’s AI watched 150,000 videos… and learned to relight scenes incredibly well! No game engine. No 3D software. And…
0
13
0
@TianzheC
Tianzhe Chu
3 months
RT @danielyehhh: ❗️❗️ Can MLLMs understand scenes from multiple camera viewpoints — like humans? 🧭 We introduce All-Angles Bench — 2,100+…
0
27
0
@TianzheC
Tianzhe Chu
3 months
#ICML2025 +1 🤔 Seems nobody really cares about paper admission except my mom, who asks about the status every few weeks.
@TianzheC
Tianzhe Chu
3 months
RT @druv_pai: I'm at ICLR this week! I'll be presenting ToST, a (provably) computationally efficient high-performance deep architecture der…
0
16
0
@TianzheC
Tianzhe Chu
3 months
Will be at ICLR 2025! No paper. No plan. With camera. V me $5 and you can get an edited portrait plus an ig follower. Tariffed 245% if you pay by Zelle
1
0
28
@TianzheC
Tianzhe Chu
3 months
RT @TongPetersb: We're open-sourcing the training code for MetaMorph! MetaMorph offers a lightweight framework for turning LLMs into unif…
0
39
0
@TianzheC
Tianzhe Chu
4 months
Won’t use a model that rejected me twice anymore! @AIatMeta. Let’s go Qwen
0
0
11
@TianzheC
Tianzhe Chu
4 months
RT @TongPetersb: Vision models have been smaller than language models; what if we scale them up? Introducing Web-SSL: A family of billion-…
0
86
0
@TianzheC
Tianzhe Chu
4 months
RT @DavidJFan: Can visual SSL match CLIP on VQA? Yes! We show with controlled experiments that visual SSL can be competitive even on OCR/C…
0
95
0
@TianzheC
Tianzhe Chu
4 months
RT @docmilanfar: Statistical "degrees of freedom" (df) is in general not the same as "the # of parameters." The df for any 1-1 (‘image-to-…
0
61
0
@TianzheC
Tianzhe Chu
4 months
RT @AnthropicAI: New Anthropic research: Tracing the thoughts of a large language model. We built a "microscope" to inspect what happens i…
0
1K
0
@TianzheC
Tianzhe Chu
4 months
RT @kchonyc: it feels like @ylecun is going through his decades-old ideas and re-introducing them one at a time 😂 was the optimal alpha =…
0
43
0
@TianzheC
Tianzhe Chu
4 months
RT @_lewtun: Definitive proof that Google Search is unbiased
0
11
0
@TianzheC
Tianzhe Chu
4 months
RT @ToruO_O: Sim2Real RL for Vision-Based Dexterous Manipulation on Humanoids. TLDR - we train a humanoid robot wi…
0
65
0
@TianzheC
Tianzhe Chu
4 months
👨‍🍳AI experts in my list, is it reasonable to stop the subscription of Poe (where I mostly use Claude 3.7) and switch to grok to save money?
1
0
2
@TianzheC
Tianzhe Chu
5 months
Huge congrats to @simon_zhai and long live your trolls!
@YiMaTweets
Yi Ma
5 months
Just attended the dissertation talk by one of my phd students at Berkeley, Simon Zhai. He is joining Deepmind after graduation. Congratulations!
0
0
10
@TianzheC
Tianzhe Chu
5 months
RT @alec_helbling: Create heatmaps that localize text concepts in generated videos. We discovered that our approach, ConceptAttention, ca…
0
66
0
@TianzheC
Tianzhe Chu
5 months
RT @YiMaTweets: Our new work ToST is now available as an ICLR'25 spotlight: At least one thing DeepSeek taught us i…
robinwu218.github.io
ToST is a transformer architecture with linear-time attention that is both performant and interpretable, derived from principled compression objectives.
0
45
0