
Andy Brock
@ajmooch
Followers: 4K
Following: 2K
Media: 75
Statuses: 585
Dimensionality Diabolist, Seeker of Optima
Joined October 2016
Our most recent work on training Normalizer-Free nets! We focus on developing performant architectures which train fast, and show that a simple technique (Adaptive Grad Clipping, or AGC) allows us to train with large batches and heavy augmentations and reach state-of-the-art.
Introducing NFNets, a family of image classification models that are: *SOTA on ImageNet (86.5% top-1 w/o extra data) *Up to 8.7x faster to train than EfficientNets to a given accuracy *Normalizer-free (no BatchNorm!) Paper: https://t.co/xvYDkgDCY0 Code: https://t.co/SmKU0gNCy7
6
66
344
Never will be.
5
13
139
Ah yes, the obscure "University of Edinburgh," which famously did not produce this year's physics Nobel laureate
another very impressive thing about the VR-CLI paper is that there are only two authors. a phd student + professor from... Scotland, of all places A+ research can come from anywhere
1
0
16
Interested in helping us make Gemini Pro even better? The Gemini pre-training team is looking for a Research Scientist in London to push the boundaries of LLM scaling: understanding, predicting, and improving. Apply here:
job-boards.greenhouse.io
2.0 Pro Experimental is our best model yet for coding and complex prompts, refined with your feedback. It has a better understanding of world-knowledge and comes with our largest context window yet of 2 million tokens - meaning it can analyze large amounts of information.
0
21
63
Open robotaxi door. Cupholder already has a delicious milkshake. Check the logo. It's a Faemo. Almost got me! Remember to check the True Name of your ride, every time!
0
0
7
Oh no, I have learned [DETAIL] about [CRAFT OR TRADE] and now whenever I see [THING] in [MEDIA] I am hopelessly distracted by [DETAIL]
0
0
5
Counterpoint: JAX is literally the greatest library for neural network research, and it's only the stuff it doesn't do that gets in the way
Someone needs to say it: As someone who started using it in its early days (2019 when I interned at Google) and tries it again occasionally - JAX is terrible! It might be due to my software engineering skills, but it's very hard to debug, and the speed improvement isn't worth it
6
7
113
Hi Matteo, this idea was originally proposed (in a more general form, incorporating an arbitrary number of momentum buffers, albeit with a specific way to pick betas + no alpha) as AggMo, Lucas et al, ICLR 2019: https://t.co/lRp7metDcc; you may wish to consider citing
openreview.net
We introduce a simple variant of momentum optimization which is able to outperform classical momentum, Nesterov, and Adam on deep learning tasks with minimal hyperparameter tuning.
Stop discarding your old gradients! Introducing AdEMAMix, a novel (first-order) optimizer capable of outperforming Adam. Let's have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/🧵
1
3
45
Exciting News from Chatbot Arena! @GoogleDeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive
Today, we are making an experimental version (0801) of Gemini 1.5 Pro available for early testing and feedback in Google AI Studio and the Gemini API. Try it out and let us know what you think! https://t.co/fBrh6UGcJz
83
402
2K
instead of a laser beam, the Death Star should smash through planets like the Kool-Aid man
0
1
7
Super excited for this launch! Had a great time working with a really amazing team getting this out!! Check it out:
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long
2
7
95
So excited to see Gemini 1.5 released to the world! It was a blast working with such an amazing team.
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long
0
2
9
I was leading long context in Gemini for a while now, and today I'm proud to share what the team has achieved: over 1M context in a large-scale foundation model. Big shoutout to @TeplyashinDenis and @machelreid - without you this would not have happened! https://t.co/iOPPjpxNCH
9
19
231
10,000,000 tokens achieved internally!
Gemini 1.5 has arrived. Pro 1.5 with 1M tokens available as an experimental feature via AI Studio and Vertex AI in private preview. Then there's this: In our research, we tested Gemini 1.5 on up to 2M tokens for audio, 2.8M tokens for video, and 🤯10M🤯 tokens for text. From
3
2
67
Things I'm looking forward to this year: -Vending machine you can haggle with -Fridge with twitter (not for you, it's the fridge that tweets) -Doorbell that makes you solve a sphinx riddle to get in -Toaster that screams
0
0
19
In service of keeping my inner child alive, today I was defeated by a childproof cap.
1
0
10
Having so much fun working on this every day! Building the best model with friends and lots of TPUs.
The Gemini era is here. Thrilled to launch Gemini 1.0, our most capable & general AI model. Built to be natively multimodal, it can understand many types of info. Efficient & flexible, it comes in 3 sizes each best-in-class & optimized for different uses https://t.co/VUu1277bC2
1
3
72
Gemini Nano models improve on the efficiency frontier. They are multimodal as well; see results in the paper. Nano series: at 1.8B and 3.25B parameters, they pack in enough to provide high utility on device. First foundation model on the device! https://t.co/8u4PCkM7Mz
Gemini Nano is super efficient for tasks that are on-device. Android developers can sign up for an early access program for Gemini Nano via Android AICore and Pixel 8 Pro users can already see it rolling out in features like Summarize in Recorder and Smart Reply in Gboard + much
4
12
133