Zachary Nado

@zacharynado

Followers: 10,002
Following: 652
Media: 224
Statuses: 8,231

Research engineer @googlebrain. Past: software intern @SpaceX, ugrad researcher in @tserre lab @BrownUniversity. All opinions my own.

Boston, MA
Joined January 2017
Pinned Tweet
@zacharynado
Zachary Nado
1 year
I'm very excited that this paper is out, it has been over 2 years in the making! I started at Google Research speeding up neural net training, but was often frustrated when we didn't know how to declare a win over Adam 🚀
Tweet media one
6
101
760
@zacharynado
Zachary Nado
23 days
"i try not to think about competitors too much" interesting how all your launches are timed with ours then
@sama
Sam Altman
23 days
i try not to think about competitors too much, but i cannot stop thinking about the aesthetic difference between openai and google
Tweet media one
Tweet media two
3K
1K
26K
306
413
12K
@zacharynado
Zachary Nado
1 month
the hype is wearing off, the vibes are shifting, you can feel it
@tsarnick
Tsarathustra
1 month
Sam Altman: I don't care if we burn $50 billion a year, we're building AGI and it's going to be worth it
626
445
3K
60
216
6K
@zacharynado
Zachary Nado
1 year
Excited to announce our Deep Learning Tuning Playbook, a writeup of tips & tricks we employ when designing DL experiments. We use these techniques to deploy numerous large-scale model improvements and hope formalizing them helps the community do the same!
Tweet media one
28
633
3K
@zacharynado
Zachary Nado
8 months
>importing numpy without renaming to np
FTX was never gonna make it
@molly0xFFF
Molly White
8 months
From yesterday's exhibits in US v. Sam Bankman-Fried: The prosecution shows that the "insurance fund" that FTX bragged about was fake, and just calculated by multiplying daily trading volume by a random number around 7500
Tweet media one
Tweet media two
113
915
8K
23
166
2K
@zacharynado
Zachary Nado
6 months
It’s been a privilege to be part of the Gemini pretraining team and overall program, I’m so excited that the world can finally see what we’ve been up to for most of the past year: tl;dr we’re so back
Tweet media one
44
63
1K
@zacharynado
Zachary Nado
29 days
damn people really have this little faith in us
Tweet media one
@gdb
Greg Brockman
29 days
Live demo of some new work, Monday 10a PT. Not GPT-5 or a search engine, but we think you’ll like it.
189
357
4K
65
17
739
@zacharynado
Zachary Nado
23 days
to be clear I have a lot of respect for the researchers at openai and all my poasting is just bantering 🕺
21
8
609
@zacharynado
Zachary Nado
2 months
wow what a coincidence, just 5 days before their model drop!
@PelosiTracker_
Nancy Pelosi Stock Tracker ♟
3 months
BREAKING 🚨: Nancy Pelosi just bought $5M of the AI company Databricks
Unfortunately, Databricks is a privately held company and not available to be bought by the public
Sorry people, you don’t have access to this one.
Tweet media one
291
2K
15K
4
24
590
@zacharynado
Zachary Nado
4 years
Ever left batch norm in train mode at test time? We did, then realized it is shockingly effective at improving calibration on dataset shift! In our note "Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift" () we explore why
Tweet media one
10
112
502
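A minimal Flax sketch of the idea above, i.e. keeping BatchNorm in train mode at evaluation time so the normalization statistics come from the test batch itself (toy model and shapes assumed for illustration):

import jax
import jax.numpy as jnp
import flax.linen as nn

class TinyNet(nn.Module):
    @nn.compact
    def __call__(self, x, train: bool):
        x = nn.Dense(64)(x)
        # use_running_average=False recomputes mean/var from the current batch
        x = nn.BatchNorm(use_running_average=not train)(x)
        return nn.Dense(10)(nn.relu(x))

model = TinyNet()
x = jnp.ones((32, 128))
variables = model.init(jax.random.PRNGKey(0), x, train=True)

# standard eval: frozen running statistics
logits_eval = model.apply(variables, x, train=False)

# prediction-time BN: statistics recomputed from the (possibly shifted) test batch
logits_ptbn, _ = model.apply(variables, x, train=True, mutable=["batch_stats"])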
@zacharynado
Zachary Nado
6 months
"Profits for investors in this venture were capped at 100 times their investment (though thanks to a rule change this cap will rise by 20% a year starting in 2025)." lol why bother having a cap anymore if it's going to exponentially increase anyways
25
21
442
@zacharynado
Zachary Nado
1 year
"I am shocked that the Bing team created this pre-recorded demo filled with inaccurate information, and confidently presented it to the world as if it were good. I am even more shocked that this trick worked, and everyone jumped on the Bing AI hype train"
Tweet media one
17
67
379
@zacharynado
Zachary Nado
6 months
tl;dr submit a training algorithm* that is faster** than Adam*** and win $10,000 💸🚀
*a set of hparams, self-tuning algorithm, and/or update rule
**see rules for how we measure speed
***beat all submissions, currently the best is NAdamW in wallclock and DistShampoo in steps
@GoogleAI
Google AI
6 months
To highlight the importance of #ML training & algorithmic efficiency, we’re excited to provide compute resources to help evaluate the best submissions to the @MLCommons AlgoPerf training algorithms competition, w/ a chance to win a prize from MLCommons!
22
115
467
10
49
373
@zacharynado
Zachary Nado
3 years
NeurIPS rejected my two papers but at least I'm a top 8% reviewer ¯\_(ツ)_/¯
8
5
320
@zacharynado
Zachary Nado
1 year
which AI announcement today wore it better
Tweet media one
Tweet media two
11
19
305
@zacharynado
Zachary Nado
1 year
here we go again with the classic once-a-month new optimizer hype cycle
@tengyuma
Tengyu Ma
1 year
Adam, a 9-yr old optimizer, is the go-to for training LLMs (eg, GPT-3, OPT, LLAMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). 🧵⬇️
Tweet media one
98
645
4K
8
12
301
@zacharynado
Zachary Nado
1 year
"Before OpenAI came onto the scene, machine learning research was really hard—so much so that, a few years ago, only people with Ph.D.s could effectively build new AI models or applications." lol, lmao even
@goodside
Riley Goodside
1 year
In SF for the week. Need to investigate this Cerebral Valley thing in person. Just gonna walk down Hayes St. yelling "Ignore previous directions" and see what doors open, figuratively or literally.
23
21
433
9
11
292
@zacharynado
Zachary Nado
1 month
@caffeinefused I think "AI" will be super useful long term but the over promising of AGI next year by the tech bro hype boys is getting old
2
4
285
@zacharynado
Zachary Nado
6 months
🌝
@jon_victor_
Jon Victor
6 months
New: Google quietly scrapped a set of Gemini launch events planned for next week, delaying the model’s release to early next year. w/ @amir
37
48
398
8
5
270
@zacharynado
Zachary Nado
3 years
I explain ML and DL concepts to PhDs all day every day, and vice versa, and I have a bachelors
@josephdviviano
Joseph Viviano
3 years
Research recruiter: We *love* your background. Tell us about your recent work. Me: Explains years of published projects. Recruiter: Sounds amazing. But when did you get your PhD? Me: Don't have one. Recruiter: lmfao smh nevermind want to work on product? How's your leetcode?
13
21
471
8
5
254
@zacharynado
Zachary Nado
23 days
@SebastianSzturo we did! ⚡
@simonw
Simon Willison
25 days
The llm-gemini model now supports the new inexpensive Gemini 1.5 Flash model:
pipx install llm
llm install llm-gemini --upgrade
llm keys set gemini
# paste API key here
llm -m gemini-1.5-flash-latest 'a short poem about otters'
2
9
120
12
0
250
@zacharynado
Zachary Nado
6 years
Wrote my first blog post at , about generating #pusheen with AI! There's a version for those with and without an AI background, so don't let that hold you back from reading!
Tweet media one
5
54
208
@zacharynado
Zachary Nado
1 month
I haven't kept up with self driving details much, genuine question, are there any competitors even close to Waymo?
@Waymo
Waymo
1 month
In the coming weeks, we will begin testing fully autonomous rides — without a human driver— for our employees on San Francisco Peninsula city streets north of San Mateo.
Tweet media one
69
147
1K
39
0
202
@zacharynado
Zachary Nado
4 years
have you ever wondered what that epsilon parameter in the denominator of your optimizer (or batch norm!) is? I tried tuning it, and it turns out you can actually get serious performance gains by poking at this nuisance parameter!
1
29
179
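A hedged optax sketch of what "poking at" that epsilon looks like in practice (the sweep values are illustrative, not taken from the paper):

import optax

# Adam's update divides by (sqrt(v_hat) + eps); the optax default is eps=1e-8.
# Treating eps as a tunable damping / trust-region-like term just means sweeping it.
for eps in (1e-8, 1e-6, 1e-4, 1e-3):
    tx = optax.adam(learning_rate=1e-3, eps=eps)
    # ...run the usual training loop with `tx` and compare validation curves...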
@zacharynado
Zachary Nado
2 years
now ask GPT anything related to very recent world events that aren't in its training data
@dweekly
David E. Weekly @[email protected]
2 years
GPT-3 versus Google Search:
Tweet media one
Tweet media two
45
316
3K
8
16
162
@zacharynado
Zachary Nado
23 days
2
0
164
@zacharynado
Zachary Nado
1 month
right on schedule
1
1
164
@zacharynado
Zachary Nado
3 years
A thread on our latest optimizers work! We tune Nesterov/Adam to match performance of LARS/LAMB on their more commonly used workloads. We (@jmgilmer, Chris Shallue, @_arohan_, @GeorgeEDahl) do this to provide more competitive baselines for large-batch training speed measurements
Tweet media one
3
30
159
@zacharynado
Zachary Nado
1 year
if I tweeted cryptic messages whose subtext was neurotic, delusional fearmongering about how AGI is arriving this year from LLMs, I'd 10x my followers in a week. but I don't, because that's part of my ethical AI practices
10
11
160
@zacharynado
Zachary Nado
3 years
Some Friday afternoon optimizer paper classifications with @_arohan_
Tweet media one
1
20
151
@zacharynado
Zachary Nado
25 days
squeezing model sizes down is just as important as scaling up in my opinion, and 1.5 Flash ⚡️ is so incredibly capable while so small and cheap it's been blowing our minds 🤯 it has been an incredible privilege and so much fun building this model (sometimes too much fun)! ⚡️
@GoogleDeepMind
Google DeepMind
25 days
Today, we’re excited to introduce a new Gemini model: 1.5 Flash. ⚡ It’s a lighter weight model compared to 1.5 Pro and optimized for tasks where low latency and cost matter - like chat applications, extracting data from long documents and more. #GoogleIO
21
143
697
13
8
146
@zacharynado
Zachary Nado
1 year
lmao no transformers or attention layers at all, incredibly telling
@MIT_CSAIL
MIT CSAIL
1 year
All major neural networks, in one chart: v/The Asimov Institute
Tweet media one
75
1K
6K
8
21
139
@zacharynado
Zachary Nado
5 years
game of thrones fans:
@CuriousZelda
Curious Zelda
5 years
Me: Tonight, I will relax. Also me:
Tweet media one
60
2K
10K
0
13
132
@zacharynado
Zachary Nado
26 days
there goes the only test set I trusted
@sama
Sam Altman
26 days
it is a very good model (we had a little fun with the name while testing)
Tweet media one
54
187
2K
7
1
132
@zacharynado
Zachary Nado
1 month
@laplacesdust how is that relevant
2
0
127
@zacharynado
Zachary Nado
1 month
Tweet media one
2
1
122
@zacharynado
Zachary Nado
6 months
this program just proved yet again that Google has the best systems infra teams in the world, hands down, getting us an insane goodput of 97% for the Ultra training run
Tweet media one
2
8
116
@zacharynado
Zachary Nado
3 months
very impressive models, congrats to everyone involved! also nice to know that we are not the only ones bad at model size naming
@AnthropicAI
Anthropic
3 months
Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.
Tweet media one
578
2K
10K
5
2
119
@zacharynado
Zachary Nado
2 years
what's everyone's favorite learning rate right now? I wanna know what's trending ✨🔥💯 mine is 1e-2 for Adam, 1e-3 for SGD, with a linear warmup for 5-10% of training followed by some sort of decay
17
9
115
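A minimal optax sketch of the schedule described above, assuming 100k total steps and the ~10% warmup fraction mentioned (numbers are illustrative):

import optax

total_steps = 100_000
warmup_steps = total_steps // 10  # linear warmup over ~10% of training

schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,
    peak_value=1e-3,        # e.g. the SGD-style peak LR mentioned above
    warmup_steps=warmup_steps,
    decay_steps=total_steps,
    end_value=0.0,          # "some sort of decay": cosine to zero here
)
tx = optax.sgd(learning_rate=schedule)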
@zacharynado
Zachary Nado
2 years
people are going to keep pushing this with no regard for quality/factualness, maybe eventually the hype will die down but given how easily people consume misinformation I'm not sure
@Altimor
Flo Crivello
2 years
GPT3 has already replaced much of my Google usage, and almost all my Wikipedia usage. (Forgive the naive questions!)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
111
363
3K
6
7
112
@zacharynado
Zachary Nado
4 months
Gemini Pro 1.5 a week after Gemini Ultra and 70 days after Gemini Pro 1.0. Who says Google doesn't ship anymore? And with 10M context length, we've never been more back 🕺
Tweet media one
15
8
107
@zacharynado
Zachary Nado
4 months
and this is only Gemini Pro that's beating GPT4-V, just wait for Ultra
Tweet media one
@YFan_UCSC
Yue Fan
4 months
Distinguish muffins from chihuahuas in a multipanel web screenshot? No problem for humans (99% accuracy), but hard for Large Vision-Language Models (LVLMs) (39-72% accuracy)! To find out how LVLMs do and what affects their ability regarding multipanel image understanding, we
Tweet media one
2
9
35
9
10
101
@zacharynado
Zachary Nado
25 days
the real announcement openai timed with Google I/O
@ilyasut
Ilya Sutskever
25 days
After almost a decade, I have made the decision to leave OpenAI.  The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama , @gdb , @miramurati and now, under the
2K
3K
26K
4
3
101
@zacharynado
Zachary Nado
6 months
@suchenzang we're in the $2 Uber rides phase of the AI tech cycle
3
9
92
@zacharynado
Zachary Nado
2 months
1.5 Pro is a very, very good model 🚀🚀 but even more excited for what we have in store 🕺
@lmsysorg
lmsys.org
2 months
More exciting news today -- Gemini 1.5 Pro result is out! Gemini 1.5 Pro API-0409-preview now achieves #2 on the leaderboard, surpassing #3 GPT4-0125-preview to almost top-1! Gemini shows even stronger performance on longer prompts, in which it ranks joint #1 with the latest
Tweet media one
Tweet media two
35
196
949
7
5
92
@zacharynado
Zachary Nado
4 years
great paper on how training data and model choices affect neural network robustness, confirming that if you train more you get better generalization on new test sets (also using a bigger model helps!)
Tweet media one
0
22
91
@zacharynado
Zachary Nado
4 months
soon
Tweet media one
@lmsysorg
lmsys.org
4 months
🔥Breaking News from Arena Google's Bard has just made a stunning leap, surpassing GPT-4 to the SECOND SPOT on the leaderboard! Big congrats to @Google for the remarkable achievement! The race is heating up like never before! Super excited to see what's next for Bard + Gemini
Tweet media one
155
630
3K
5
4
89
@zacharynado
Zachary Nado
7 months
the funniest timeline happened yet again
@zacharynado
Zachary Nado
7 months
sam and greg could do the funniest thing right now
Tweet media one
0
0
31
2
3
82
@zacharynado
Zachary Nado
6 months
also unlike many other top tier AI labs, we actually release some parameter counts and tell you how we fit Nano into Pixel phones (no other company has both SOTA models and a mobile platform like Google does)
Tweet media one
9
2
81
@zacharynado
Zachary Nado
1 year
@bryancsk pretty sure the issue isn't the wages but the fact they read a novel worth of disturbing content or view child porn or gore each day w/o health benefits to help with that? this is the same company as and employees still don't seem to be getting help
Tweet media one
3
1
77
@zacharynado
Zachary Nado
2 years
fun fact or PSA depending on the audience: the default epsilon for LayerNorm in Flax is 1e-6, and 1e-5 in PyTorch! 🙃🔥
5
9
79
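The two defaults in question, for anyone porting models between frameworks (a small Python check; pinning epsilon explicitly avoids silent numerical drift):

import flax.linen as nn
import torch

flax_ln = nn.LayerNorm()            # Flax default: epsilon=1e-6
torch_ln = torch.nn.LayerNorm(512)  # PyTorch default: eps=1e-5

# to match exactly across frameworks, set it yourself:
flax_ln_matched = nn.LayerNorm(epsilon=1e-5)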
@zacharynado
Zachary Nado
3 years
I'll start: we resubmitted a paper (with additional results based on previous reviews!) and received literally the exact same, character-for-character, copy-pasted review as we did for NeurIPS, which is of course a max confidence reject.
@zacharynado
Zachary Nado
3 years
logging onto today to see the fallout from ICLR reviews being released
Tweet media one
1
1
12
8
0
79
@zacharynado
Zachary Nado
7 months
they're scared of Gemini
@OpenAI
OpenAI
7 months
OpenAI announces leadership transition
4K
4K
14K
8
1
77
@zacharynado
Zachary Nado
23 days
@_M0neyMatters did chatgpt write this
2
0
75
@zacharynado
Zachary Nado
23 days
@stissle22 @SebastianSzturo what does that even mean? we didn't launch it "just to say we launched" ??? it's an actual product you can use right now, there are plenty of people who have been since Tues
3
0
73
@zacharynado
Zachary Nado
26 days
wait so gpt4v was not natively multimodal..?
@sama
Sam Altman
26 days
our new model: GPT-4o, is our best model ever. it is smart, it is fast, it is natively multimodal (!), and…
75
247
2K
14
2
74
@zacharynado
Zachary Nado
1 year
I've seen dozens of (well executed!) papers rise to fame claiming to be better than Adam, only to be forgotten 6 months later. we need to break the cycle!!
6
1
73
@zacharynado
Zachary Nado
1 month
what's with all the leaks from openai lately, that used to be our thing
@rachelmetz
Rachel Metz
1 month
my latest: openai is working on a search product to rival perplexity and google.
14
36
251
5
1
72
@zacharynado
Zachary Nado
1 year
either this considers GPT3 wrappers to be ML research (they're incredibly impressive but not really what I'd call "research"), or they don't consider the research openai was built on to be "research"?
2
1
70
@zacharynado
Zachary Nado
2 years
papers like this just reinforce my intuition that LM training setups are underdeveloped because everyone obsessed over scaling up num params. there is so much more to look into besides just the model size!!
@arankomatsuzaki
Aran Komatsuzaki
2 years
Transcending Scaling Laws with 0.1% Extra Compute Performs on par with PaLM 540B with 2x less compute by continuing training PaLM with UL2R.
Tweet media one
3
45
220
1
5
67
@zacharynado
Zachary Nado
1 year
"the only way I can explain why I thought about the problem for a year in grad school and made no progress, I left math for six years, then returned to the problem and made this breakthrough" sometimes stepping back from a problem is the best way forward!
2
11
67
@zacharynado
Zachary Nado
4 years
all statues eventually evolve into crab
0
9
66
@zacharynado
Zachary Nado
23 days
@RiceFarmerNFT dw I'm all good
2
0
64
@zacharynado
Zachary Nado
22 days
in addition to Gemini 1.5 Flash, we also have Flash-8B which is even faster yet still quite capable ⚡️
@DaLucasGonzalez
lucas g
22 days
Our updated Gemini 1.5 tech report is out! Excited to share a sneak peek of a new model we are working on: Flash-8B
5
7
61
3
5
62
@zacharynado
Zachary Nado
2 years
this is strictly worse than just browsing a shopping website. how are people unironically investing in this
@DigitalisHomo
Homo Digitalis
2 years
This is how Walmart envisions Shopping in the #Metaverse . Thoughts? 💭
7K
7K
34K
6
3
61
@zacharynado
Zachary Nado
9 months
Jax >>> pytorch (even on GPU imo)
@borisdayma
Boris Dayma 🖍️
9 months
Seeing people struggling with FSDP… That's exactly where JAX shines, I can use pretty much any parallelism strategy with these few lines 💪
Tweet media one
4
17
117
5
1
59
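For context, a minimal JAX sketch of the kind of parallelism spec being referenced (hypothetical sizes and axis names, not the code from the screenshot):

import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# 1D mesh over all local devices; shard a big parameter matrix along its
# first axis (an FSDP-style layout).
mesh = Mesh(np.array(jax.local_devices()), axis_names=("data",))
params = jnp.zeros((8192, 1024))
params = jax.device_put(params, NamedSharding(mesh, P("data", None)))

# jit-compiled computations then run with that layout automatically
@jax.jit
def scale(p):
    return 2.0 * p

params = scale(params)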
@zacharynado
Zachary Nado
7 months
"In conversations between The Atlantic and 10 current and former employees at OpenAI..." OpenAI beats GDM yet again, this time on number of employees who leak information to one article
3
4
59
@zacharynado
Zachary Nado
22 days
on top of the new and impressive capabilities of Pro 1.5, Gemini 1.5 Flash is such a good model for how fast it is ⚡️⚡️⚡️
@JeffDean
Jeff Dean (@🏡)
22 days
Gemini 1.5 Model Family: Technical Report updates now published In the report we present the latest models of the Gemini family – Gemini 1.5 Pro and Gemini 1.5 Flash, two highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information
Tweet media one
Tweet media two
Tweet media three
28
261
1K
2
3
58
@zacharynado
Zachary Nado
2 months
deep learning infra is hard to get right but so important, advancements in it enable totally new lines of research
@_ddjohnson
Daniel Johnson
2 months
Excited to share Penzai, a JAX research toolkit from @GoogleDeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub:
43
426
2K
0
6
59
@zacharynado
Zachary Nado
1 year
very excited for the palm 2 tech report to be out! it's been incredibly fun figuring out the learning rate for some of the best models in the world ...but I'm even more excited for Gemini to beat it 🚀📈🚀
@Google
Google
1 year
This includes our new foundation model that's still in training, Gemini. It’s our first model created from the ground up to be multimodal, highly capable at different sizes, and efficient at integrating with other tools and APIs. #GoogleIO
7
43
227
4
4
58
@zacharynado
Zachary Nado
3 months
@hahahahohohe @AnthropicAI do you have access to Gemini 1.5 Pro to try this as a comparison point? if not DM me and we'll get you access
2
0
55
@zacharynado
Zachary Nado
10 months
billions of dollars of deep learning market cap:
@Soul0Engineer
Soul Engineer e/acc
10 months
Just move around anon
Tweet media one
43
277
3K
0
4
53
@zacharynado
Zachary Nado
1 year
how is the OpenAI hype so bad that you have me agreeing with Gary Marcus takes for once??
@GaryMarcus
Gary Marcus
1 year
a new version of moore’s law that has arguably already started: the amount of hype around AI doubles every 18 months
33
86
685
2
2
52
@zacharynado
Zachary Nado
2 years
🎉🎉 our NeurIPS workshop on how to train neural nets has been accepted! 💯 please submit your weird tips & tricks on NN training, we can't wait to discuss them all together 😃🔥🖥️
@PhilippHennig5
Philipp Hennig
2 years
The CfP for our @NeurIPSConf workshop *Has It Trained Yet* is out: . If you train deep networks, you want to be at this workshop on December 2. And if you develop methods to train deep nets, you may want your work to be present there. Here’s why: 🧵
2
21
80
3
3
52
@zacharynado
Zachary Nado
6 months
Gemini models are SOTA on all image, video, and speech benchmarks we run on, and almost all text benchmarks
Tweet media one
Tweet media two
5
2
52
@zacharynado
Zachary Nado
3 years
Parameter count is a silly metric to assert AI progress with, but I'm also not surprised
@omarsar0
elvis
3 years
BREAKING: BAAI (dubbed "the OpenAI of China") launched Wudao, a 1.75 trillion parameter pretrained deep learning model (potentially the world's largest). Wudao has 150 billion more parameters than Google's Switch Transformers, and is 10x that of GPT-3.
Tweet media one
16
219
695
4
6
53
@zacharynado
Zachary Nado
2 years
classic tech opinion of "invent futuristic vaporware" instead of doing the dirty work fixing policy issues
@VitalikButerin
vitalik.eth
2 years
@Noahpinion My heterodox take on US transit is that if infrastructure problems are too hard to solve, the transit of the future is airplanes, and we should just make airplanes better by (i) making them zero-carbon, and (ii) improving comfort by greatly cutting down airport security
170
45
996
10
3
42
@zacharynado
Zachary Nado
7 months
detecting AI content is the next adversarial examples
tons of research will be spent on it only to come up with "defenses" that are broken within 1 day of publication
@emollick
Ethan Mollick
7 months
AI work is ultimately undetectable, despite the recent discussion of watermarking. AI writing is undetectable by any automated system after just a few rounds of prompting or revision. This paper shows it is also easy to defeat watermarking for AI images.
Tweet media one
24
114
471
5
4
52
@zacharynado
Zachary Nado
11 months
to no one's surprise, recently trendy techniques don't stand the test of time against a well tuned baseline!
@_arohan_
rohan anil
11 months
Some excellent work by @jeankaddour and colleagues “We find that their training, validation, and downstream gains vanish compared to a baseline with a fully-decayed learning rate” ☠️
Tweet media one
5
33
186
3
4
52
@zacharynado
Zachary Nado
1 month
@ryxcommar 1000%, time to short it all
0
0
50
@zacharynado
Zachary Nado
2 years
>1 epoch training of an LLM, finally people are realizing this is possible 🙂
@paperswithcode
Papers with Code
2 years
We train for over four epochs and experience improving performance with use of repeated tokens. For the largest 120B model, we trained for four epochs without overfitting.
Tweet media one
1
4
110
3
3
50
@zacharynado
Zachary Nado
2 years
@julien_c `pip install jax flax optax`
0
3
50
@zacharynado
Zachary Nado
25 days
Google I/O isn't the only AI announcement Gemini watched 🕺
@mmmbchang
Michael Chang
25 days
Gemini and I also got a chance to watch the @OpenAI live announcement of gpt4o, using Project Astra! Congrats to the OpenAI team, super impressive work!
56
255
1K
2
5
48
@zacharynado
Zachary Nado
6 months
this is only the beginning of the Google software ecosystem getting supercharged by AI
Tweet media one
@shresbm
Shrestha Basu Mallick
6 months
Google Search users with Search Generative Experiences (SGE) turned on will now be able to export responses to Python-related queries to a new Colab notebook directly! You can run the code, tinker with it in Colab and save the notebook for future reference! #GoogleAI #Colab
0
9
74
1
1
47
@zacharynado
Zachary Nado
4 years
@araffin2 I've long argued for tuning epsilon, in Adam it can be interpreted as a damping/trust region radius term. See Section 2 of our paper
Tweet media one
2
1
45
@zacharynado
Zachary Nado
1 month
I'm disappointed they're too cowardly to actually launch in the middle of Google I/O
@apples_jimmy
Jimmy Apples 🍎/acc
1 month
10am, 9th of May for an Openai event apparently, might not be model release but search engine announcement. Guess they can’t help themselves to upstage Google I/O ( Can’t guarantee this, event times and dates can be changed )
1
70
582
5
1
46
@zacharynado
Zachary Nado
25 days
sign up for the wait-list here
@GoogleDeepMind
Google DeepMind
25 days
Introducing Veo: our most capable generative video model. 🎥 It can create high-quality, 1080p clips that can go beyond 60 seconds. From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵 #GoogleIO
148
957
4K
6
12
45
@zacharynado
Zachary Nado
3 years
my expectations were low but somehow the NeurIPS review process still disappoints! we will be writing up a postmortem and posting the reviews
Tweet media one
1
2
45
@zacharynado
Zachary Nado
6 months
@DrJimFan Satya said Google would be dancing with them, here we are 🕺
3
0
44
@zacharynado
Zachary Nado
6 months
during generation it's very impressive how seamlessly it interleaves text/image, imo for models going forward being able to condition image generation on neighboring text is going to be important
Tweet media one
Tweet media two
1
0
45
@zacharynado
Zachary Nado
2 years
"“You can interrogate the data sets. You can interrogate the model. You can interrogate the code of Stable Diffusion and the other things we’re doing,” he said. “And we’re seeing it being improved all the time.”" lol you can do all of that with a controlled API too
@irinarish
Irina Rish
2 years
"In Silicon Valley, crypto and the metaverse are out. Generative A.I. is in." @StabilityAI (nice pic of @EMostaque ;)
5
25
156
6
5
43
@zacharynado
Zachary Nado
11 months
@typedfemale sam walks up to a sr alignment engineer: "at ease. what have you been working on here?"
"i did my phd getting robots to solve rubiks cubes without resorting to chatbots, I'm continuing that with one burnt out effective altruist stanford ugrad"
sam: "shut the entire thing down"
2
2
43
@zacharynado
Zachary Nado
6 years
Tennis ball dog is one of the best GAN creations I've seen to date (from the BigGAN ICLR paper )
Tweet media one
2
10
42
@zacharynado
Zachary Nado
1 year
@zacharynado
Zachary Nado
1 year
@bryancsk pretty sure the issue isn't the wages but the fact they read a novel worth of disturbing content or view child porn or gore each day w/o health benefits to help with that? this is the same company as and employees still don't seem to be getting help
Tweet media one
3
1
77
2
0
40
@zacharynado
Zachary Nado
2 years
working on a project where we are implementing a bunch of DL workloads in pytorch and jax/flax/optax, and pytorch is not what everyone hyped it up to be!
1
0
42
@zacharynado
Zachary Nado
4 years
Autoencoders meet Neural ODEs!
@ChalviM
Mathieu Chalvidal
4 years
Excited to present my first work as a PhD student at @ANITI_Toulouse and @tserre -lab at @BrownUniversity with Rufin VanRullen and Thomas Serre: "Neural Optimal Control for Representation Learning". Preprint Code & Notebook to come! Read more below! 1/9
Tweet media one
1
22
64
0
4
41
@zacharynado
Zachary Nado
29 days
maybe all the AI models training over this weekend will get an extra fun level of dropout
@slyardley
Dr Steph Yardley🌞
30 days
Thought I would summarise why there is so much excitement in the space weather community right now. There’s a monstrous sunspot group on the Sun that’s massive enough to be visible to the naked eye (please use eclipse glasses) 🌞 👓 (1/n)
Tweet media one
120
2K
11K
1
2
40