Shengjia Zhao Profile
Shengjia Zhao

@shengjia_zhao

Followers
5,416
Following
226
Media
12
Statuses
255

Research Scientist @ OpenAI. Formerly PhD @ Stanford. I like training models. All opinions my own.

Joined November 2016
@shengjia_zhao
Shengjia Zhao
7 months
OpenAI is nothing without its people
19
43
715
@shengjia_zhao
Shengjia Zhao
7 months
We must be living in a simulation that glitched. The madness is unbelievable
24
5
324
@shengjia_zhao
Shengjia Zhao
5 years
Found this amazing and comprehensive summary of variance reduction methods (chapters 8, 9, and 10). Also happy to share the cheat sheet I made comparing several major methods (not guaranteed correct).
Tweet media one
5
91
331
@shengjia_zhao
Shengjia Zhao
7 months
It's amazing what a small team with limited GPUs can do in such a short amount of time. Congrats!
@pika_labs
Pika
7 months
Introducing Pika 1.0, the idea-to-video platform that brings your creativity to life. Create and edit your videos with AI. Rolling out to new users on web and discord, starting today. Sign up at
1K
5K
26K
7
15
276
@shengjia_zhao
Shengjia Zhao
7 months
❤️
@sama
Sam Altman
7 months
i love the openai team so much
5K
4K
73K
5
9
136
@shengjia_zhao
Shengjia Zhao
3 years
🚨[New Paper] Should you trust calibrated predictions for high-stakes decisions? Check out to see what calibration actually means for decision-making, and a new decision-tailored calibration notion. With @mikekimbackward , Roshni, @tengyuma , @StefanoErmon
Tweet media one
2
19
122
@shengjia_zhao
Shengjia Zhao
7 months
Breaking news: the board asked my dog to be the interim CEO and it declined.
Tweet media one
0
1
98
@shengjia_zhao
Shengjia Zhao
7 months
What a ride! Time to get back to building cool things
@OpenAI
OpenAI
7 months
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo. We are collaborating to figure out the details. Thank you so much for your patience through this.
6K
13K
67K
3
3
99
@shengjia_zhao
Shengjia Zhao
7 months
❤️
@bradlightcap
Brad Lightcap
7 months
OpenAI is nothing without its people
200
249
4K
1
1
82
@shengjia_zhao
Shengjia Zhao
7 months
@gdb
Greg Brockman
7 months
We are going to build something new & it will be incredible. Initial leadership (more soon): @merettm @sidorszymon @aleks_madry @sama @gdb The mission continues.
770
2K
23K
1
1
61
@shengjia_zhao
Shengjia Zhao
4 months
Congrats to the team! This is truly on the next level.
@OpenAI
OpenAI
4 months
Introducing Sora, our text-to-video model. Sora can create videos of up to 60 seconds featuring highly detailed scenes, complex camera motion, and multiple characters with vibrant emotions. Prompt: “Beautiful, snowy
10K
33K
140K
1
1
58
@shengjia_zhao
Shengjia Zhao
3 years
How can you guarantee the correctness of each individual prediction? New work with @StefanoErmon (AISTATS'21 oral) provides a new perspective on this age-old dilemma based on ideas like insurance and game theory. Blog: Arxiv:
Tweet media one
0
10
51
@shengjia_zhao
Shengjia Zhao
7 months
Tweet media one
4
0
48
@shengjia_zhao
Shengjia Zhao
6 months
Time to become a believer in small models again
3
1
38
@shengjia_zhao
Shengjia Zhao
4 years
An individually calibrated forecaster is (almost) impossible to learn from finite datasets, but it becomes possible if we learn randomized forecasters. We show how to do this in our ICML 2020 paper with @tengyuma @StefanoErmon .
Tweet media one
0
8
38
@shengjia_zhao
Shengjia Zhao
7 months
@xf1280 Kind of difficult to sleep on such a day no?
2
0
25
@shengjia_zhao
Shengjia Zhao
6 years
A mixture of Gaussians can achieve much better log-likelihood than a GAN. @adityagrover_ Maybe log-likelihood needs modification in the context of GANs? It would be interesting to find something gentler on disjoint supports and easy to estimate.
@RogerGrosse
Roger Grosse
6 years
New evaluation metrics are great, but please, please also measure the metrics we already understand, like test log-likelihood!
1
15
68
1
7
27
@shengjia_zhao
Shengjia Zhao
7 months
@williamlegate Yes! This is what people did not dare to say in public
1
2
24
@shengjia_zhao
Shengjia Zhao
3 years
🚨 If a provider predicts that a vaccine is 95% effective for you, should you trust the 95% when making decisions? We show how to enable decisions with complete confidence. AISTATS oral happening in 1 hour Blog
Tweet media one
2
6
25
@shengjia_zhao
Shengjia Zhao
7 months
@DrJimFan LLMs still feel magical so maybe this time it’s different
4
0
21
@shengjia_zhao
Shengjia Zhao
6 months
lol this is how future SOTAs need to be presented
@noway421
Ilia Sidorenko
6 months
@abacaj marketing suggestion:
Tweet media one
14
30
480
0
1
22
@shengjia_zhao
Shengjia Zhao
5 years
Fixed some typos
Tweet media one
0
5
20
@shengjia_zhao
Shengjia Zhao
6 years
Really like this paper. Optimizing parameters over a random subspace as a measure of the effective complexity of a task w.r.t. a network architecture. A simple but very effective strategy with lots of good insights! via @YouTube
0
8
22
@shengjia_zhao
Shengjia Zhao
7 months
A cat that's progressively more fat
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
18
@shengjia_zhao
Shengjia Zhao
7 months
Everything is fine!
@OpenAI
OpenAI
7 months
ChatGPT with voice is now available to all free users. Download the app on your phone and tap the headphones icon to start a conversation. Sound on 🔊
2K
3K
17K
0
0
18
@shengjia_zhao
Shengjia Zhao
7 months
lol that is so funny
@_smileyball
Rui Shu
7 months
OpenAI is nothing without its bobas and chicken nuggets
3
4
70
0
0
18
@shengjia_zhao
Shengjia Zhao
6 years
Getting reassured that feeling stupid is at least a good sign XD
@DrPaulDWilliams
Prof Paul Williams
6 years
The importance of stupidity in scientific research! I show this brilliant essay to all my new PhD students. It contains some excellent advice on how to handle – and even learn to love – the feeling of being constantly immersed in the unknown.
Tweet media one
22
989
2K
0
2
16
@shengjia_zhao
Shengjia Zhao
6 years
Really nice summary of the huge number (20?) of methods to evaluate GANs. Free lunch theorem for GANs: for any new GAN model there is some metric it improves.
0
4
17
@shengjia_zhao
Shengjia Zhao
7 months
Yes
@emollick
Ethan Mollick
7 months
I have the growing feeling that the entire OpenAI debacle is going to be much less about four dimensional strategic chess and more about human mistakes, errors, confusion & conflicting motives. As it almost always is.
46
68
947
0
0
14
@shengjia_zhao
Shengjia Zhao
6 years
Found this excellent paper on algorithmic information theory. As light a read as it gets for this topic. Really like the conceptual satisfaction of a beautiful theory defining the fundamental meaning of structure and intelligence (despite being incomputable).
0
5
15
@shengjia_zhao
Shengjia Zhao
6 years
Wow this is like finding a treasure box. Amazed at the number of great lectures to be found nowhere else
3
1
15
@shengjia_zhao
Shengjia Zhao
11 months
@MIT_CSAIL @zhaofeng_wu This feels more like an alignment problem than a lack of underlying abilities. It's like showing a person a new language and asking them to immediately do math in it. What if you finetune the model on 1M tokens of the new language?
4
0
12
@shengjia_zhao
Shengjia Zhao
6 years
The best explanation and summary of normalizing flows I have seen. Highly recommend.
0
3
12
@shengjia_zhao
Shengjia Zhao
5 years
Feels like I just bought tickets for a theme park. There are sooo many pricing options and discount packages, very confusing! What is the best deal you got? #AAAI2019
Tweet media one
Tweet media two
1
0
10
@shengjia_zhao
Shengjia Zhao
11 months
Evaluation will become the hardest part of LLMs :)
@douwekiela
Douwe Kiela
11 months
Progress in AI continues to outpace benchmarks. Check out this new plot, inspired by @DynabenchAI , that shows just how quickly it's happening. Read more about it here:
Tweet media one
6
28
115
1
0
10
@shengjia_zhao
Shengjia Zhao
6 years
Really good summary of the controversy surrounding the nature of probability/randomness. Interesting because the goal of ML is to "model structure instead of randomness", but what that means is still up for debate (esp. in unsupervised learning).
0
4
10
@shengjia_zhao
Shengjia Zhao
7 years
Only reading the first couple of chapters fundamentally changed how I view probability and statistics. Highly recommended
1
4
9
@shengjia_zhao
Shengjia Zhao
6 years
Very enjoyable introduction
@brandondamos
Brandon Amos
6 years
Attention Solves Your TSP by W. Kool and M. Welling This paper has a Kool intro. Paper: @PyTorch Code:
Tweet media one
2
80
317
0
2
8
@shengjia_zhao
Shengjia Zhao
1 year
GPT-4 has finally landed!
@OpenAI
OpenAI
1 year
Announcing GPT-4, a large multimodal model, with our best-ever results on capabilities and alignment:
2K
17K
64K
2
0
8
@shengjia_zhao
Shengjia Zhao
7 months
@sama 🥳🥳🥳
0
0
6
@shengjia_zhao
Shengjia Zhao
6 years
I wonder what the applications of these classical papers are in light of modern generative models, which handle complex inputs but do not assume much more structure than a vector space in the latent space.
0
2
5
@shengjia_zhao
Shengjia Zhao
7 years
@TheShubhanshu @ermonste Thanks for the question! MMD does not require reparameterization because it is likelihood-free optimization, so q(z|x) need not be a distribution (but it can be).
1
1
5
@shengjia_zhao
Shengjia Zhao
7 months
@_jasonwei @hwchung27 Not enough independents
0
0
5
@shengjia_zhao
Shengjia Zhao
6 years
oops wrong address
0
0
5
@shengjia_zhao
Shengjia Zhao
11 months
I'm quite sure that GPT-4 has not degraded over time in the cognitive evals (exams, MMLU, etc). The problem is that they only test prime numbers and not composite numbers. The Mar model always says prime, and the June model always says not prime, but you would get 100% on their
@DrJimFan
Jim Fan
11 months
Many of us practitioners have felt that GPT-4 degrades over time. It's now corroborated by a recent study. But why does GPT-4 degrade, and what can we learn from it? Here're my thoughts: ▸ Safety vs helpfulness tradeoff: the paper shows that GPT-4 Jun version is "safer" than
Tweet media one
114
430
2K
0
0
4
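The evaluation flaw described in the tweet above can be made concrete with a small hypothetical sketch (not from the original study): if a benchmark's test set contains only primes, a trivial "always prime" model scores 100% and an "always composite" model scores 0%, so the scores say nothing about whether the model can actually test primality.

```python
# Hypothetical illustration of the evaluation flaw: a test set containing
# only primes cannot distinguish real capability from a constant answer.

def always_prime(n: int) -> bool:
    # Trivial model: claims every number is prime.
    return True

def always_composite(n: int) -> bool:
    # Trivial model: claims every number is composite.
    return False

def accuracy(model, numbers, labels):
    # Fraction of inputs where the model's answer matches the label.
    return sum(model(n) == y for n, y in zip(numbers, labels)) / len(numbers)

primes = [2, 3, 5, 7, 11, 13]          # a prime-only test set
labels = [True] * len(primes)

print(accuracy(always_prime, primes, labels))      # 1.0 — looks perfect
print(accuracy(always_composite, primes, labels))  # 0.0 — looks broken

# A balanced test set exposes both trivial models as uninformative:
mixed = [2, 4, 5, 9, 11, 15]
mixed_labels = [True, False, True, False, True, False]
print(accuracy(always_prime, mixed, mixed_labels))  # 0.5
```

On the skewed test set the two constant models bracket the score range, which is exactly the Mar-vs-June artifact described above.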
@shengjia_zhao
Shengjia Zhao
7 months
And that is how the universe was born from a bowl of spicy ramen.
@venturetwins
Justine Moore
7 months
Obsessed with the new “make it more” trend on ChatGPT. You generate an image of something, and then keep asking for it to be MORE. For example - spicy ramen getting progressively spicier 🔥 (from u/dulipat)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
441
3K
29K
0
0
4
@shengjia_zhao
Shengjia Zhao
1 year
Amazing work!
@realDanFu
Dan Fu
1 year
This sentiment is exactly right - and why we've been working to increase sequence length in our lab for the past two years! From FlashAttention, to S4, H3, Hyena, and more - check out our blog post putting this line of work into context: More below: 1/n
4
41
241
0
0
3
@shengjia_zhao
Shengjia Zhao
6 years
Very nice summary of problems with evaluating generative models
@goodfellow_ian
Ian Goodfellow
6 years
Thread on how to review papers about generic improvements to GANs
9
226
642
0
0
2
@shengjia_zhao
Shengjia Zhao
7 years
By the no-free-lunch theorem, all models rely on inductive bias. The current biases of DL are good only for some tasks. That doesn't seem like a fault of DL though.
@fchollet
François Chollet
7 years
My quick write-up on the limitations of deep learning: It's meant as an intro to tomorrow's post on the future of DL
19
308
691
0
0
2
@shengjia_zhao
Shengjia Zhao
11 months
0
0
2
@shengjia_zhao
Shengjia Zhao
6 years
@poolio Thanks for pointing that out! I guess they have a somewhat different motivation but derived the same model. It would be nice if both papers can mention each other.
0
1
2
@shengjia_zhao
Shengjia Zhao
7 years
@TheShubhanshu @ermonste @PyTorch Thanks! Added a link to your repo in our post if that is okay. Updated version should appear soon.
1
0
2
@shengjia_zhao
Shengjia Zhao
3 years
Congrats Aditya!
@adityagrover_
Aditya Grover
3 years
Thrilled to share that my PhD dissertation won the ACM SIGKDD Dissertation Award for "outstanding work in data science and machine learning". Thanks to everyone involved, especially my advisor @StefanoErmon & @StanfordAILab !
56
44
797
1
0
2
@shengjia_zhao
Shengjia Zhao
7 years
Sounds like a probably approximately optimal strategy!
@volokuleshov
Volodymyr Kuleshov 🇺🇦
7 years
ICLR18 is going to be in Vancouver next year and will now have double blind open reviews (great idea!). CfP here:
0
0
8
0
0
2
@shengjia_zhao
Shengjia Zhao
7 months
@rabrg This is the best place to digest all that's happened
0
0
1
@shengjia_zhao
Shengjia Zhao
5 years
@evrenguney I guess I would consider it stratification. Each dimension is partitioned into bins, and each bin contains a fixed number of samples.
0
0
1
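The binning scheme described in the reply above can be sketched as follows (a hypothetical reconstruction, not the paper's actual code): along one dimension, sort the samples and slice them into equal-size chunks, so every bin holds the same number of samples (quantile bins rather than equal-width bins).

```python
# Minimal sketch of equal-count (stratified) binning along one dimension.
import random

random.seed(0)
samples = [random.random() for _ in range(1000)]

n_bins = 10
per_bin = len(samples) // n_bins  # fixed number of samples per bin

# Sorting and slicing into equal-size chunks gives equal-mass bins;
# the bin boundaries are then empirical quantiles of the data.
ordered = sorted(samples)
bins = [ordered[i * per_bin:(i + 1) * per_bin] for i in range(n_bins)]

assert all(len(b) == per_bin for b in bins)

# Lower edge of each bin, plus the overall maximum, recovers the edges.
edges = [b[0] for b in bins] + [ordered[-1]]
```

For multiple dimensions, the same partition would be applied independently per dimension, as the reply describes.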
@shengjia_zhao
Shengjia Zhao
7 years
@arishabh8 @ermonste Thanks! There is a good exposition in Gretton et al. 2007 referenced in our blog.
0
0
1
@shengjia_zhao
Shengjia Zhao
7 years
@baaadas Congrats!
0
0
1
@shengjia_zhao
Shengjia Zhao
7 years
Maybe this is why GANs work, since it is virtually impossible to model the distribution of natural images, e.g. chaotic patterns.
@pfau
David Pfau
7 years
Dunno how many times I can say this - GANs aren't really learning the distribution of data - just making pretty pictures.
7
32
92
0
0
1
@shengjia_zhao
Shengjia Zhao
7 months
Amazing! Congrats on the new adventure!
@willieneis
Willie Neiswanger
7 months
Excited to share that I will join @USC as an Asst. Professor of Computer Science in Jan 2024—and I’m recruiting students for my new lab! 📣 Come work at the intersection of machine learning, decision making, generative AI, and AI-for-science. More info:
Tweet media one
Tweet media two
Tweet media three
27
33
391
0
0
1