Brian Lester
@blester125
453 Followers · 93 Following · 12 Media · 93 Statuses
Senior Research Engineer at Google DeepMind working on parameter-efficient adaptation and few-shot generalization, mostly within NLP. Views are my own. he/him
Joined July 2013
Is Kevin onto something? We found that LLMs can struggle to understand compressed text, unless you do some specific tricks. Check out https://t.co/DRO2IbTFCg and help @hoonkp, @alemi, Jeffrey Pennington, @ada_rob, @jaschasd, @noahconst, and me make Kevin’s dream a reality.
0 replies · 6 retweets · 15 likes
We just pushed a new update adding support for the (very impressive) safetensors library from our friends at @huggingface! Git-Theta's plug-in system meant that we spent more time waiting on CI/CD than actually adding support (I'll get off my soapbox now 🧼📦).
Introducing Git-Theta, a Git extension that enables collaborative and continual development of ML models with merges, diffs, and parameter-efficient updates—all using the standard Git workflow! 📄 https://t.co/UejQ1WWg85 💽 https://t.co/ED5K2ZvYA6 🗣️ https://t.co/ehMFk2E5sw 🧵⬇️
0 replies · 3 retweets · 20 likes
This was joint work with wonderful collaborators: @kandpal_nikhil @Muqeeth10 @anisham197 @montymevans Vishal Baskaran @TenghaoHuang45 @liu_haokun and @colinraffel
3 replies · 0 retweets · 10 likes
Git-Theta is designed around plug-ins—this means that if we don’t support your favorite framework, merging strategy, or parameter-efficient update yet, you can add it! Join us on GitHub https://t.co/ED5K2ZvYA6 or Zulip https://t.co/ehMFk2E5sw to start contributing!
1 reply · 1 retweet · 13 likes
In our ICML paper https://t.co/UejQ1WWg85, we describe the design and implementation of Git-Theta and show that it supports a collaborative workflow involving continually adapting and modifying a pre-trained model, all while saving significant communication and space.
1 reply · 0 retweets · 14 likes
All of this functionality is integrated with the standard Git workflow—after running git theta track on your model checkpoint, you can git add, branch, merge, and commit as usual! Git-Theta is compatible with any Git remote that supports Git LFS (GitHub, Hugging Face Hub, etc.)
1 reply · 0 retweets · 12 likes
Git-Theta leverages model checkpoint structure to provide meaningful diffs between model versions. During a `git merge`, Git-Theta offers a suite of interactive merge resolution strategies, such as parameter averaging, that can be applied to individual weights.
1 reply · 0 retweets · 15 likes
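To make the merge strategy concrete, here is a toy sketch of parameter averaging over two checkpoint versions treated as plain name-to-tensor dicts; it is not Git-Theta's actual merge implementation.

```python
# Toy parameter-averaging merge for two versions of the same checkpoint.
# Not Git-Theta's implementation; just illustrates the merge strategy.
import torch

def average_merge(ours: dict, theirs: dict) -> dict:
    """Average each parameter that appears in both checkpoints."""
    assert ours.keys() == theirs.keys(), "Checkpoints must share parameter names."
    return {name: (ours[name] + theirs[name]) / 2 for name in ours}

ours = {"linear.weight": torch.ones(4, 4)}
theirs = {"linear.weight": torch.zeros(4, 4)}
merged = average_merge(ours, theirs)  # every entry becomes 0.5
```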
When using Git/Git LFS to track a model checkpoint, any change to any parameter re-saves the whole checkpoint. Git-Theta supports incremental updates to ML models, either by changing a subset of the parameters or via parameter-efficient updates like LoRA.
1 reply · 0 retweets · 17 likes
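A rough illustration of why a parameter-efficient update like LoRA is cheap to version: only a small low-rank delta changes while the dense weight stays fixed. The shapes and names below are illustrative, not Git-Theta internals.

```python
# LoRA-style low-rank update: store only the small factors A and B,
# then apply W' = W + B @ A when the full weight is needed.
# Shapes and names are illustrative, not Git-Theta specifics.
import torch

d, r = 768, 8                      # hidden size and low rank
W = torch.randn(d, d)              # frozen pre-trained weight
A = torch.randn(r, d) * 0.01       # small trainable factor
B = torch.zeros(d, r)              # zero-initialized so the delta starts at 0

W_updated = W + B @ A              # dense weight reconstructed on demand
# Versioning A and B (2*d*r values) is far cheaper than re-saving W (d*d values).
```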
Introducing Git-Theta, a Git extension that enables collaborative and continual development of ML models with merges, diffs, and parameter-efficient updates—all using the standard Git workflow! 📄 https://t.co/UejQ1WWg85 💽 https://t.co/ED5K2ZvYA6 🗣️ https://t.co/ehMFk2E5sw 🧵⬇️
5 replies · 83 retweets · 409 likes
.@MotiveStudio, I saw @PyTorch in the licenses for @deadspace. Are you using it as a GPU-accelerated linear algebra library or are there actually neural nets running during the game? #deadspace #deadspaceremake
0 replies · 0 retweets · 0 likes
While parameter-efficient tuning methods were originally proposed to reduce computation & storage costs, it turns out they can help overcome catastrophic forgetting and thus improve performance on zero-shot cross-lingual generation. Check out our work @GoogleAI @emnlpmeeting👇1/10
1 reply · 30 retweets · 107 likes
Am I missing something wrt the name "gradient checkpointing"? Clearing cached activations and recomputing them in the backward pass seems like the opposite of checkpointing. The name makes it sound like we are storing the activations on disk. https://t.co/C1nKvpno0B
2 replies · 0 retweets · 1 like
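For reference, this is the behavior being discussed, shown with PyTorch's torch.utils.checkpoint: the wrapped block skips caching its intermediate activations and recomputes them during the backward pass. The toy module and shapes are made up.

```python
# Gradient/activation "checkpointing": the wrapped block does not cache its
# intermediate activations; they are recomputed during the backward pass.
import torch
from torch.utils.checkpoint import checkpoint

block = torch.nn.Sequential(
    torch.nn.Linear(128, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 128),
)

x = torch.randn(32, 128, requires_grad=True)
y = checkpoint(block, x, use_reentrant=False)  # activations recomputed on backward
y.sum().backward()
```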
We are presenting SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer @aclmeeting today during the 2pm in-person ML for NLP poster session and tomorrow at the 7:30am virtual poster session (virtual session w/ @tuvuumass). #acl2022 #NLProc #ACLinDublin #acl2022nlp
1 reply · 1 retweet · 8 likes
Happy to share our soft prompt transfer (SPoT) paper made it to #ACL2022 🎉. On the SuperGLUE leaderboard, SPoT is the first parameter-efficient approach that is competitive with methods that tune billions of parameters. w/ @blester125, @noahconst, @aboSamoor, @daniel_m_cer
Sharing my internship work @GoogleAI: 1) w/ Soft Prompt Transfer, Prompt Tuning matches or significantly outperforms Model Tuning across model sizes, 2) tasks can help each other via their prompts & task prompts can be used as task embeddings to formalize task similarity. 🧵 1/8
2 replies · 9 retweets · 55 likes
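A hand-wavy sketch of the two ideas above, with made-up shapes: initializing a target task's soft prompt from a prompt learned on a source task, and treating prompts as task embeddings compared via cosine similarity. This is not the SPoT codebase.

```python
# Soft prompt transfer, sketched: reuse a learned source-task prompt to
# initialize the target task, and compare prompts as task embeddings.
# Shapes and tensors are made up; this is not the SPoT implementation.
import torch
import torch.nn.functional as F

prompt_len, d_model = 100, 768
source_prompt = torch.randn(prompt_len, d_model)      # learned on a source task

# 1) Transfer: start the target task's prompt from the source prompt.
target_prompt = torch.nn.Parameter(source_prompt.clone())

# 2) Task similarity: flatten prompts and compare with cosine similarity.
def task_similarity(p1: torch.Tensor, p2: torch.Tensor) -> float:
    return F.cosine_similarity(p1.flatten(), p2.flatten(), dim=0).item()

other_task_prompt = torch.randn(prompt_len, d_model)
print(task_similarity(source_prompt, other_task_prompt))
```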
The blog post for my EMNLP 2021 paper on Prompt Tuning is out! Writing for a blog is pretty different than writing for a conference, so if anything was confusing in the paper maybe this will help it click (or you could have just asked me lol)
Fine-tuning pre-trained models is common in NLP, but forking the model for each task can be a burden. Prompt tuning adds a small set of learnable vectors to the input and can match fine-tuning quality while sharing the same frozen model across all tasks. https://t.co/NKHhMzk056
0 replies · 2 retweets · 14 likes
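A minimal sketch of the mechanism the post describes (not the released T5X implementation): a small matrix of learnable prompt vectors is prepended to the frozen model's input embeddings, and only those vectors are trained.

```python
# Prompt tuning, sketched: prepend k learnable vectors to the (frozen) model's
# input embeddings; only the prompt is trained. Names/shapes are illustrative.
import torch

class SoftPrompt(torch.nn.Module):
    def __init__(self, prompt_len: int = 20, d_model: int = 768):
        super().__init__()
        self.prompt = torch.nn.Parameter(torch.randn(prompt_len, d_model) * 0.5)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: [batch, seq_len, d_model]
        batch = input_embeds.shape[0]
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# Only soft_prompt.parameters() go to the optimizer; the LM stays frozen.
soft_prompt = SoftPrompt()
embeds = torch.randn(4, 32, 768)
extended = soft_prompt(embeds)     # shape [4, 52, 768]
```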
Huge thanks to my collaborators, the people who have put this library through its paces, and the T5X and Flaxformer authors! @noahconst @aboSamoor @tuvuumass @daniel_m_cer @GreenBeanDou @ada_rob @hwchung27 @anselmlevskaya and so many more.
0 replies · 0 retweets · 7 likes
It took a bit, but, like the best desserts, it needed to cool before we could bite in. Our code for Prompt Tuning has been open sourced! It enables training with all T5 sizes on TPU, reproducing our results, and is a great starting point for YOUR work. https://t.co/hxqaIOe0oG
3 replies · 10 retweets · 43 likes
In addition to the impressive performance gains, I'm incredibly excited about how this work opens new exploration of targeted transfer learning via prompt similarity. I can't wait to see what gets built on this!
Sharing my internship work @GoogleAI: 1) w/ Soft Prompt Transfer, Prompt Tuning matches or significantly outperforms Model Tuning across model sizes, 2) tasks can help each other via their prompts & task prompts can be used as task embeddings to formalize task similarity. 🧵 1/8
0 replies · 1 retweet · 5 likes
A huge shout out to my amazing mentors, @noahconst and @aboSamoor, who were a big part of making this project possible. (7/7)
0 replies · 0 retweets · 5 likes