Andy Brock @ajmooch X Profile

Andy Brock

@ajmooch

Followers: 4K

Following: 2K

Media: 75

Statuses: 585

Dimensionality Diabolist, Seeker of Optima

Joined October 2016

Andy Brock

@ajmooch

5 years

Our most recent work on training Normalizer-Free nets! We focus on developing performant architectures which train fast, and show that a simple technique (Adaptive Grad Clipping, or AGC) allows us to train with large batches and heavy augmentations and reach state-of-the-art.

Google DeepMind

@GoogleDeepMind

5 years

Introducing NFNets, a family of image classification models that are:
*SOTA on ImageNet (86.5% top-1 w/o extra data)
*Up to 8.7x faster to train than EfficientNets to a given accuracy
*Normalizer-free (no BatchNorm!)
Paper: https://t.co/xvYDkgDCY0
Code: https://t.co/SmKU0gNCy7

6

66

344

Andy Brock

@ajmooch

3 months

Literally resnets dude

Jack Morris

@jxmnop

3 months

very surprising that fifteen years of hardcore computer vision research contributed ~nothing toward AGI except better optimizers. we still don't have models that get smarter when we give them eyes

2

0

54

Joost van Amersfoort

@joost_v_amersf

4 months

Never will be.

elie

@eliebakouch

4 months

Pre-training is not dead

5

13

139

Andy Brock

@ajmooch

5 months

Ah yes, the obscure "University of Edinburgh," which famously did not produce this year's physics Nobel laureate

kalomaze

@kalomaze

5 months

another very impressive thing about the VR-CLI paper is that there are only two authors. a phd student + professor from... Scotland, of all places. A+ research can come from anywhere

1

0

16

Andy Brock

@ajmooch

7 months

💪

Noam Shazeer

@NoamShazeer

7 months

Introducing Gemini 2.5 Pro Experimental. The 2.5 series marks a significant evolution: Gemini models are now fundamentally thinking models. This means the model reasons before responding, to maximize accuracy -- and it’s our best Gemini model yet. Blog -

2

0

48

Joost van Amersfoort

@joost_v_amersf

8 months

Interested in helping us make Gemini Pro even better? The Gemini pre-training team is looking for a Research Scientist in London to push the boundaries of LLM scaling: understanding, predicting, and improving. ♊️🚀 Apply here:

job-boards.greenhouse.io

Google DeepMind

@GoogleDeepMind

8 months

2.0 Pro Experimental is our best model yet for coding and complex prompts, refined with your feedback. 🤝 It has a better understanding of world-knowledge and comes with our largest context window yet of 2 million tokens - meaning it can analyze large amounts of information.

0

21

63

Andy Brock

@ajmooch

1 year

Open robotaxi door. Cupholder already has a delicious milkshake. Check the logo. It's a Faemo. Almost got me! Remember to check the True Name of your ride, every time!

0

7

Andy Brock

@ajmooch

1 year

Oh no, I have learned [DETAIL] about [CRAFT OR TRADE] and now whenever I see [THING] in [MEDIA] I am hopelessly distracted by [DETAIL]

0

5

Andy Brock

@ajmooch

1 year

Counterpoint: JAX is literally the greatest library for neural network research, and it's only the stuff it doesn't do that gets in the way

Ravid Shwartz Ziv

@ziv_ravid

1 year

Someone needs to say it: As someone who started using it in its early days (2019 when I interned at Google) and tries it again occasionally - JAX is terrible! It might be due to my software engineering skills, but it's very hard to debug, and the speed improvement isn't worth it

6

7

113

Andy Brock

@ajmooch

1 year

Hi Matteo, this idea was originally proposed (in a more general form, incorporating an arbitrary number of momentum buffers, albeit with a specific way to pick betas + no alpha) as AggMo, Lucas et al, ICLR 2019: https://t.co/lRp7metDcc; you may wish to consider citing

openreview.net

We introduce a simple variant of momentum optimization which is able to outperform classical momentum, Nesterov, and Adam on deep learning tasks with minimal hyperparameter tuning.

Matteo Pagliardini

@MatPagliardini

1 year

Stop discarding your old gradients! Introducing AdEMAMix, a novel (first-order) optimizer capable of outperforming Adam. Let’s have a thread on momentum and the surprising relevance of very old gradients. A joint work with @GrangierDavid and @PierreAblin #ml #optimization 1/🧵

1

3

45

lmarena.ai

@arena

1 year

Exciting News from Chatbot Arena! @GoogleDeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive

Logan Kilpatrick

@OfficialLoganK

1 year

Today, we are making an experimental version (0801) of Gemini 1.5 Pro available for early testing and feedback in Google AI Studio and the Gemini API. Try it out and let us know what you think! https://t.co/fBrh6UGcJz

83

402

2K

Andy Brock

@ajmooch

1 year

instead of a laser beam, the Death Star should smash through planets like the Kool-Aid man

0

1

7

Machel Reid

@machelreid

2 years

Super excited for this launch! Had a great time working with a really amazing team getting this out!! Check it out:

Jeff Dean

@JeffDean

2 years

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long

2

7

95

Denis Teplyashin

@TeplyashinDenis

2 years

So excited to see Gemini 1.5 released to the world! It was a blast working with such an amazing team.

Jeff Dean

@JeffDean

2 years

Gemini 1.5 Pro - A highly capable multimodal model with a 10M token context length Today we are releasing the first demonstrations of the capabilities of the Gemini 1.5 series, with the Gemini 1.5 Pro model. One of the key differentiators of this model is its incredibly long

0

2

9

Nikolay Savinov 🇺🇦

@SavinovNikolay

2 years

I have been leading long context in Gemini for a while now, and today I’m proud to share what the team has achieved: over 1M context in a large-scale foundation model. Big shoutout to @TeplyashinDenis and @machelreid - without you this would not have happened! https://t.co/iOPPjpxNCH

9

19

231

Joost van Amersfoort

@joost_v_amersf

2 years

10,000,000 tokens achieved internally! 🚢 🚢 🚢

Oriol Vinyals

@OriolVinyalsML

2 years

Gemini 1.5 has arrived. Pro 1.5 with 1M tokens available as an experimental feature via AI Studio and Vertex AI in private preview. Then there’s this: In our research, we tested Gemini 1.5 on up to 2M tokens for audio, 2.8M tokens for video, and 🤯10M 🤯 tokens for text. From

3

2

67

Andy Brock

@ajmooch

2 years

Just watched Saltburn, aka Get In

1

0

3

Andy Brock

@ajmooch

2 years

Things I'm looking forward to this year:
-Vending machine you can haggle with
-Fridge with twitter (not for you, it's the fridge that tweets)
-Doorbell that makes you solve a sphinx riddle to get in
-Toaster that screams

0

19

Andy Brock

@ajmooch

2 years

In service of keeping my inner child alive, today I was defeated by a childproof cap.

1

0

10

Joost van Amersfoort

@joost_v_amersf

2 years

Having so much fun working on this every day! 😊 Building the best model with friends and lots of TPUs. 🚀

Demis Hassabis

@demishassabis

2 years

The Gemini era is here. Thrilled to launch Gemini 1.0, our most capable & general AI model. Built to be natively multimodal, it can understand many types of info. Efficient & flexible, it comes in 3 sizes each best-in-class & optimized for different uses https://t.co/VUu1277bC2

1

3

72

rohan anil

@_arohan_

2 years

Gemini Nano models improve on the efficiency frontier. They are multimodal as well; see results in the paper. Nano series: at 1.8B and 3.25B parameters, they pack in enough to provide high utility on device. First foundation model on the device! https://t.co/8u4PCkM7Mz

Sundar Pichai

@sundarpichai

2 years

Gemini Nano is super efficient for tasks that are on-device. Android developers can sign up for an early access program for Gemini Nano via Android AICore and Pixel 8 Pro users can already see it rolling out in features like Summarize in Recorder and Smart Reply in Gboard + much

4

12

133