
Kaizhao Liang
@KyleLiang5
Followers: 620
Following: 10K
Media: 301
Statuses: 4K
Class of 2020 @IllinoisCDS. 5 years at @SambaNovaAI. Grad student @UTCompSci since 2023. Interested in new optimizers and neural architectures.
Austin, Texas
Joined December 2018
TLDR: 1⃣ line modification, satisfaction (theoretically and empirically) guaranteed 😀😀😀. Core idea: 🚨 Do not update if you are not sure. 👨‍💻🤗📚 @cranialxix @lqiang67 @Tim38463182
12
37
258
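A minimal sketch of what "do not update if you are not sure" could look like in code, assuming the uncertainty test is a per-coordinate sign agreement between the proposed update and the current gradient. The function name `cautious_step`, the rescaling by the mask mean, and the toy values are illustrative assumptions, not taken from the post.

```python
import numpy as np

def cautious_step(param, grad, update, lr):
    """Apply `update` only on coordinates where it agrees in sign with `grad`.

    `update` is whatever the base optimizer proposes (for example, a momentum
    buffer). Names and the rescaling rule are assumptions made for this sketch.
    """
    # The one-line change: keep coordinates where update and gradient agree.
    mask = (update * grad > 0).astype(param.dtype)
    # Assumption: rescale so the average step size is roughly preserved.
    mask = mask / max(mask.mean(), 1e-3)
    return param - lr * update * mask

# Toy values (hypothetical). Momentum disagrees with the gradient on index 1.
param = np.array([1.0, -2.0, 0.5, 3.0])
grad = np.array([0.2, -0.1, 0.4, -0.3])
momentum = np.array([0.1, 0.3, 0.2, -0.5])

new_param = cautious_step(param, grad, momentum, lr=0.1)
```

In this toy run the coordinate at index 1 is left unchanged because momentum and gradient point in opposite directions there, while the other three coordinates take a slightly larger step.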
RT @nateliason: Waymo is a godsend for working parents. Need to hand off the baby between meetings? Put them in a Waymo and send them to…
0
1K
0
Just realized that cautious is basically a hard version of hypergradient LR scheduling 🤨. Instead of the last batch's gradient, it uses momentum 😛.
Fun fact: When you're using a good learning rate, the gradient should be almost *perpendicular* to direction of the last step (opposite of the intuition of many gradient descent diagrams that make it look like the gradient is following a smooth path). You can derive this by
0
0
3
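The perpendicularity claim in the quoted post can be checked numerically. The sketch below uses one standard setting, exact line search on a quadratic, which is not necessarily the derivation the truncated post goes on to give. The matrix `A`, the starting point `x`, and the helper `cosine_after_step` are hypothetical choices for illustration.

```python
import numpy as np

# Hypothetical symmetric positive-definite Hessian for f(x) = 0.5 * x^T A x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
x = np.array([1.0, -1.0])

def grad(point):
    return A @ point

def cosine_after_step(eta):
    """Cosine between the step just taken and the gradient at the new point."""
    g = grad(x)
    step = -eta * g
    g_next = grad(x + step)
    return (g_next @ step) / (np.linalg.norm(g_next) * np.linalg.norm(step))

g0 = grad(x)
eta_exact = (g0 @ g0) / (g0 @ A @ g0)  # step size that minimizes f along -g0

cos_exact = cosine_after_step(eta_exact)       # new gradient is orthogonal to the step
cos_small = cosine_after_step(eta_exact / 10)  # step too small: gradient still opposes the step
```

With the exact step the cosine is zero up to rounding. With a step ten times smaller it is negative, meaning the new gradient still points largely along the old one and a larger step would have helped.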
RT @rosmine_b: Fun fact: When you're using a good learning rate, the gradient should be almost *perpendicular* to direction of the last ste…
0
9
0
Knew this would be coming in some shape or form…
Some news: We're building the next big thing — the first-ever AI-only social video app, built on a highly expressive human video model. Over the past few weeks, we’ve been testing it in private beta. Now, we’re opening early access: download the iOS app to join the waitlist, or.
0
0
1
RT @itsPaulAi: Wait NVIDIA has just released new SOTA open source models?! Available in 4 sizes 1.5B, 7B, 14B and 32B that you can run 100…
0
168
0
RT @eliebakouch: We've just released 100+ intermediate checkpoints and our training logs from SmolLM3-3B training. We hope this can be use…
0
59
0
RT @MaziyarPanahi: Perfect Sunday: I just used Kimi-K2 by @Kimi_Moonshot to vibe code a @Gradio app! 🔥 You can use "Anycoder" Space by @_…
0
17
0
RT @LakerNewhouse: [1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stabil…
0
78
0
RT @Gradio: Mixture of Experts? No no, Roundtable of Experts! Introducing Consilium - When AI Models Debate Around a Table! Results are…
0
15
0
RT @VentureBeat: Meet AnyCoder, a new Kimi K2-powered tool for fast prototyping and deploying web apps
venturebeat.com
For novice developers or even those with expertise who want to spin up a new project fast, AnyCoder seems like a great place to start.
0
14
0
RT @_akhaliq: You can install anycoder as a Progressive Web App on your device. Visit and in the footer click set…
0
11
0
In hindsight, would have been cooked if they did not exit in time.
character.ai
Chat with millions of AI Characters on the #1 AI chat app. Where will your next adventure take you?
0
0
1
I am showing this to everyone that says #AI is a bubble 🫧. Also, is this what 200K GPUs can enable? 🧐🧐🧐
0
0
0