Andy Brock

@ajmooch

Followers
4K
Following
2K
Media
75
Statuses
585

Dimensionality Diabolist, Seeker of Optima

Joined October 2016
@ajmooch
Andy Brock
5 years
Our most recent work on training Normalizer-Free nets! We focus on developing performant architectures which train fast, and show that a simple technique (Adaptive Gradient Clipping, or AGC) allows us to train with large batches and heavy augmentations and reach state-of-the-art.
@GoogleDeepMind
Google DeepMind
5 years
Introducing NFNets, a family of image classification models that are:
*SOTA on ImageNet (86.5% top-1 w/o extra data)
*Up to 8.7x faster to train than EfficientNets to a given accuracy
*Normalizer-free (no BatchNorm!)
Paper: https://t.co/xvYDkgDCY0
Code: https://t.co/SmKU0gNCy7
6
66
344
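The AGC technique the thread above describes can be sketched roughly as follows. This is a minimal per-tensor illustration in NumPy, not the authors' code: the paper applies the clip unit-wise (per output channel), and the function name and default values here are illustrative.

```python
import numpy as np

def adaptive_grad_clip(param, grad, clip=0.01, eps=1e-3):
    """Scale `grad` so its norm never exceeds `clip` times the param norm."""
    # Guard very small parameters with eps so the clip never collapses to zero.
    w_norm = max(np.linalg.norm(param), eps)
    g_norm = np.linalg.norm(grad)
    max_norm = clip * w_norm
    if g_norm > max_norm:
        # Rescale the gradient onto the allowed ball; direction is preserved.
        grad = grad * (max_norm / g_norm)
    return grad
```

Because the threshold scales with the parameter norm rather than being a fixed constant, layers with larger weights tolerate proportionally larger gradients, which is what the thread credits for stable large-batch training without BatchNorm.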
@ajmooch
Andy Brock
3 months
Literally resnets dude
@jxmnop
Jack Morris
3 months
very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers. we still don't have models that get smarter when we give them eyes
2
0
54
@joost_v_amersf
Joost van Amersfoort
4 months
Never will be.
@eliebakouch
elie
4 months
Pre-training is not dead
5
13
139
@ajmooch
Andy Brock
5 months
Ah yes, the obscure "University of Edinburgh," which famously did not produce this year's physics Nobel laureate
@kalomaze
kalomaze
5 months
another very impressive thing about the VR-CLI paper is that there are only two authors. a phd student + professor from... Scotland, of all places. A+ research can come from anywhere
1
0
16
@ajmooch
Andy Brock
7 months
💪
@NoamShazeer
Noam Shazeer
7 months
Introducing Gemini 2.5 Pro Experimental. The 2.5 series marks a significant evolution: Gemini models are now fundamentally thinking models. This means the model reasons before responding, to maximize accuracy -- and it’s our best Gemini model yet. Blog -
2
0
48
@joost_v_amersf
Joost van Amersfoort
8 months
Interested in helping us make Gemini Pro even better? The Gemini pre-training team is looking for a Research Scientist in London to push the boundaries of LLM scaling: understanding, predicting, and improving. ♊️🚀 Apply here:
job-boards.greenhouse.io
@GoogleDeepMind
Google DeepMind
8 months
2.0 Pro Experimental is our best model yet for coding and complex prompts, refined with your feedback. 🤝 It has a better understanding of world-knowledge and comes with our largest context window yet of 2 million tokens - meaning it can analyze large amounts of information.
0
21
63
@ajmooch
Andy Brock
1 year
Open robotaxi door
Cupholder already has a delicious milkshake
Check the logo
It's a Faemo
Almost got me! Remember to check the True Name of your ride, every time!
0
0
7
@ajmooch
Andy Brock
1 year
Oh no, I have learned [DETAIL] about [CRAFT OR TRADE] and now whenever I see [THING] in [MEDIA] I am hopelessly distracted by [DETAIL]
0
0
5
@ajmooch
Andy Brock
1 year
Counterpoint: JAX is literally the greatest library for neural network research, and it's only the stuff it doesn't do that gets in the way
@ziv_ravid
Ravid Shwartz Ziv
1 year
Someone needs to say it: As someone who started using it in its early days (2019 when I interned at Google) and tries it again occasionally - JAX is terrible! It might be due to my software engineering skills, but it's very hard to debug, and the speed improvement isn't worth it
6
7
113
@ajmooch
Andy Brock
1 year
Hi Matteo, this idea was originally proposed (in a more general form, incorporating an arbitrary number of momentum buffers, albeit with a specific way to pick betas + no alpha) as AggMo, Lucas et al, ICLR 2019: https://t.co/lRp7metDcc; you may wish to consider citing
openreview.net
We introduce a simple variant of momentum optimization which is able to outperform classical momentum, Nesterov, and Adam on deep learning tasks with minimal hyperparameter tuning.
@MatPagliardini
Matteo Pagliardini
1 year
Stop discarding your old gradients! Introducing AdEMAMix, a novel (first-order) optimizer capable of outperforming Adam. Let's have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/🧵
1
3
45
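The AggMo idea referenced in the exchange above (several momentum buffers with different betas, averaged into a single update) can be sketched as follows. This is an illustrative NumPy version under stated assumptions, not the authors' implementation; the function name, beta choices, and learning rate are placeholders.

```python
import numpy as np

def aggmo_step(param, grad, buffers, betas=(0.0, 0.9, 0.99), lr=0.1):
    """One AggMo-style step: refresh each velocity buffer, then average them."""
    for i, beta in enumerate(betas):
        # Each buffer is a momentum accumulator with its own damping beta.
        buffers[i] = beta * buffers[i] + grad
    # The update direction is the mean over all buffers.
    update = sum(buffers) / len(betas)
    return param - lr * update, buffers
```

Averaging buffers with different betas lets the fast (low-beta) buffers damp the oscillations of the slow (high-beta) ones; mixing EMAs of gradients at different timescales is the overlap with AdEMAMix that the reply points out.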
@arena
lmarena.ai
1 year
Exciting News from Chatbot Arena! @GoogleDeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive
@OfficialLoganK
Logan Kilpatrick
1 year
Today, we are making an experimental version (0801) of Gemini 1.5 Pro available for early testing and feedback in Google AI Studio and the Gemini API. Try it out and let us know what you think! https://t.co/fBrh6UGcJz
83
402
2K
@ajmooch
Andy Brock
1 year
instead of a laser beam, the Death Star should smash through planets like the Kool-Aid man
0
1
7
@machelreid
Machel Reid
2 years
Super excited for this launch! Had a great time working with a really amazing team getting this out!! Check it out:
@JeffDean
Jeff Dean
2 years
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long
2
7
95
@TeplyashinDenis
Denis Teplyashin
2 years
So excited to see Gemini 1.5 released to the world! It was a blast working with such an amazing team.
@JeffDean
Jeff Dean
2 years
Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long
0
2
9
@SavinovNikolay
Nikolay Savinov 🇺🇦
2 years
I have been leading long context in Gemini for a while now, and today I'm proud to share what the team has achieved: over 1M context in a large-scale foundation model. Big shoutout to @TeplyashinDenis and @machelreid - without you this would not have happened! https://t.co/iOPPjpxNCH
9
19
231
@joost_v_amersf
Joost van Amersfoort
2 years
10,000,000 tokens achieved internally! 🚢 🚢 🚢
@OriolVinyalsML
Oriol Vinyals
2 years
Gemini 1.5 has arrived. Pro 1.5 with 1M tokens available as an experimental feature via AI Studio and Vertex AI in private preview. Then there's this: In our research, we tested Gemini 1.5 on up to 2M tokens for audio, 2.8M tokens for video, and 🤯10M🤯 tokens for text. From
3
2
67
@ajmooch
Andy Brock
2 years
Just watched Saltburn, aka Get In
1
0
3
@ajmooch
Andy Brock
2 years
Things I'm looking forward to this year:
-Vending machine you can haggle with
-Fridge with twitter (not for you, it's the fridge that tweets)
-Doorbell that makes you solve a sphinx riddle to get in
-Toaster that screams
0
0
19
@ajmooch
Andy Brock
2 years
In service of keeping my inner child alive, today I was defeated by a childproof cap.
1
0
10
@joost_v_amersf
Joost van Amersfoort
2 years
Having so much fun working on this every day! 😊 Building the best model with friends and lots of TPUs. 🚀
@demishassabis
Demis Hassabis
2 years
The Gemini era is here. Thrilled to launch Gemini 1.0, our most capable & general AI model. Built to be natively multimodal, it can understand many types of info. Efficient & flexible, it comes in 3 sizes each best-in-class & optimized for different uses https://t.co/VUu1277bC2
1
3
72
@_arohan_
rohan anil
2 years
Gemini Nano improves on the efficiency frontier. The models are multimodal as well; see results in the paper. Nano series: at 1.8B and 3.25B parameters, it packs in so much to provide high utility on device. First foundation model on the device! https://t.co/8u4PCkM7Mz
@sundarpichai
Sundar Pichai
2 years
Gemini Nano is super efficient for tasks that are on-device. Android developers can sign up for an early access program for Gemini Nano via Android AICore and Pixel 8 Pro users can already see it rolling out in features like Summarize in Recorder and Smart Reply in Gboard + much
4
12
133