
Jaehoon Lee
@hoonkp
Followers: 1K · Following: 222 · Media: 6 · Statuses: 244
Researcher in machine learning with a background in physics; Member of Technical Staff @AnthropicAI; prev. Research Scientist @GoogleDeepMind/@GoogleBrain.
San Francisco Bay Area, CA
Joined November 2009
Claude 4 models are here 🎉 From research to engineering, safety to product - this launch showcases what's possible when the entire Anthropic team comes together. Honored to be part of this journey! Claude has been transforming my daily workflow; I hope it does the same for you!
Introducing the next generation: Claude Opus 4 and Claude Sonnet 4. Claude Opus 4 is our most powerful model yet, and the world’s best coding model. Claude Sonnet 4 is a significant upgrade from its predecessor, delivering superior coding and reasoning.
RT @bneyshabur: @ethansdyer and I have started a new team at @AnthropicAI, and we’re hiring! Our team is organized around the north star…
Tour de force led by @_katieeverett investigating the interplay between neural network parameterization and optimizers; the thread/paper includes a lot of gems (theory insights, extensive empirics, and cool new tricks)!
RT @peterjliu: It was a pleasure working on Gemma 2. The team is relatively small but very capable. Glad to see it get released. On the or…
RT @peterjliu: We recently open-sourced a relatively minimal implementation example of Transformer language model training in JAX, called N…
RT @noahconst: Ever wonder why we don’t train LLMs over highly compressed text? Turns out it’s hard to make it work. Check out our paper fo…
arxiv.org
In this paper, we explore the idea of training large language models (LLMs) over highly compressed text. While standard subword tokenizers compress text by a small factor, neural text compressors...
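For a sense of the compression gap discussed in the paper: subword tokenizers shrink raw text only modestly, while stronger compressors do much better. Below is a self-contained toy illustration of measuring a compression ratio, using gzip as a stand-in for a neural text compressor (the sample text and the resulting ratio are illustrative only, not numbers from the paper):

```python
import gzip

# Toy comparison: raw UTF-8 bytes vs. gzip output. The paper studies neural
# text compressors; gzip here only illustrates how a compression ratio is
# measured. Note that repeating the sample text inflates the ratio well
# beyond what typical prose achieves.
text = (
    "Language models are usually trained over subword tokens, which compress "
    "raw text only by a small factor compared with stronger compressors. "
) * 20

raw = text.encode("utf-8")
packed = gzip.compress(raw)

print(f"raw bytes:  {len(raw)}")
print(f"gzip bytes: {len(packed)}")
print(f"ratio:      {len(raw) / len(packed):.1f}x")
```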
RT @blester125: Is Kevin onto something? We found that LLMs can struggle to understand compressed text, unless you do some specific tricks…
Analyzing training instabilities in Transformers is made more accessible by awesome work from @Mitchnw during his internship at @GoogleDeepMind! We encourage you to think more about the fundamental causes and effects of training instabilities as models scale up!
Sharing some highlights from our work on small-scale proxies for large-scale Transformer training instabilities, with fantastic collaborators @peterjliu, @Locchiu, @_katieeverett, many others (see final tweet!), @hoonkp, @jmgilmer, @skornblith! (1/15)
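One concrete mitigation discussed in this line of work is an auxiliary z-loss that penalizes growth of the output log-partition function, countering the slow divergence of output logits over long training runs. Here is a minimal sketch in JAX, assuming a standard softmax cross-entropy setup; the function name and coefficient value are illustrative, not the paper's exact recipe:

```python
import jax.numpy as jnp
from jax.scipy.special import logsumexp

def xent_with_z_loss(logits, labels, z_coeff=1e-4):
    """Softmax cross-entropy plus an auxiliary z-loss term.

    The z-loss penalizes the squared log-partition function log Z,
    discouraging output logits from drifting upward during training.
    The coefficient value here is illustrative.
    """
    log_z = logsumexp(logits, axis=-1)                 # log-partition per example
    log_probs = logits - log_z[..., None]              # log-softmax
    xent = -jnp.take_along_axis(log_probs, labels[..., None], axis=-1)[..., 0]
    z_loss = z_coeff * jnp.square(log_z)               # auxiliary regularizer
    return jnp.mean(xent + z_loss)
```

In practice the z-loss term is simply added to the training objective alongside the usual cross-entropy, as above.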
This is an amazing opportunity to work on impactful problems in large language models with cool people! Highly recommended!
Interested in Reasoning with Large Language Models? We are hiring!
Internship:
Full-Time Research Scientist:
Full-Time Research Engineer:
Learn more about the Blueshift Team:
RT @ziwphd: Jasper @latentjasper talking about the ongoing journey towards BIG Gaussian processes! A team effort with @hoonkp, Ben Adlam, @…
Today at 11am CT, Hall J #806, we are presenting our paper on infinite-width neural network kernels! We have methods to compute NTK/NNGP kernels for an extended set of activations, plus sketched embeddings for efficient approximation (100x) of compute-intensive conv kernels! See you there!
Most infinitely wide NTK and NNGP kernels are based on the ReLU activation. We propose a method of computing neural kernels with *general* activations. For homogeneous activations, we approximate the kernel matrices by linear-time sketching algorithms.
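For context on how such infinite-width kernels are computed in practice, closed-form NNGP/NTK kernels for standard architectures are available in the open-source neural-tangents library. The sketch below is a generic illustration (the depth, widths, and Relu/Erf activations are arbitrary choices), not the paper's general-activation or sketching method:

```python
import jax.random as random
from neural_tangents import stax

# Closed-form NNGP/NTK kernels for an infinitely wide 3-layer MLP.
# The widths passed to stax.Dense only matter for the finite-width apply_fn;
# kernel_fn computes the infinite-width limit.
_, _, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Erf(),
    stax.Dense(1),
)

key1, key2 = random.split(random.PRNGKey(0))
x1 = random.normal(key1, (8, 32))    # 8 inputs of dimension 32
x2 = random.normal(key2, (16, 32))   # 16 inputs of dimension 32

kernels = kernel_fn(x1, x2, ('nngp', 'ntk'))
print(kernels.nngp.shape, kernels.ntk.shape)  # (8, 16) (8, 16)
```

The paper extends this kind of computation to general activations and approximates the resulting kernel matrices with linear-time sketching.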
RT @jmes_harrison: Tired of tuning your neural network optimizer? Wish there was an optimizer that just worked? We’re excited to release Ve…
Very interesting paper by @jamiesully2, @danintheory, and Alex Maloney investigating the theoretical origin of neural scaling laws! Happy to read the 97-page paper, learn about new tools in random matrix theory (RMT), and gain insight into how the statistics of natural datasets translate into power-law scaling.
New work on the origin of @OpenAI's neural scaling laws w/ Alex Maloney and @jamiesully2: we solve a simplified model of scaling laws to gain insight into how scaling behavior arises and to probe its behavior in regimes where scaling laws break down. 1/
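As a reminder of the functional form at stake, empirical neural scaling laws fit test loss against data or model size with a power law plus an offset, roughly L(N) ≈ a·N^(-α) + c. Below is a minimal illustrative fit on synthetic numbers (none of these values come from the paper):

```python
import numpy as np
from scipy.optimize import curve_fit

# Synthetic "loss vs. dataset size" points roughly following a power law.
n = np.array([1e4, 3e4, 1e5, 3e5, 1e6, 3e6])
loss = 2.0 + 5.0 * n ** -0.35 + np.random.default_rng(0).normal(0, 0.01, n.size)

def power_law(n, a, alpha, c):
    # L(n) = a * n^(-alpha) + c : the form scaling-law curves are fit to
    return a * n ** -alpha + c

(a, alpha, c), _ = curve_fit(power_law, n, loss, p0=(1.0, 0.5, 1.0))
print(f"fitted exponent alpha ≈ {alpha:.2f}, irreducible loss c ≈ {c:.2f}")
```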
RT @lilianweng: 🧮 I finally spent some time learning what exactly Neural Tangent Kernel (NTK) is and went through some mathematical proof…
lilianweng.github.io
Neural networks are well known to be over-parameterized and can often easily fit data with near-zero training loss with decent generalization performance on test dataset. Although all these paramet...
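For reference, the NTK discussed in the post is the kernel formed from the network's parameter gradients; in the infinite-width limit it remains (nearly) constant during training, so gradient-flow training reduces to kernel regression. The standard definition and the induced function-space dynamics (generic textbook form, not taken from the linked post):

```latex
% Neural Tangent Kernel and the gradient-flow dynamics it induces
\[
  \Theta(x, x') \;=\; \nabla_\theta f_\theta(x)^{\top} \, \nabla_\theta f_\theta(x')
\]
\[
  \frac{\mathrm{d} f_\theta(x)}{\mathrm{d} t}
    \;=\; -\,\eta \sum_{i=1}^{n} \Theta(x, x_i)\,
          \frac{\partial \mathcal{L}}{\partial f_\theta(x_i)}
\]
```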
RT @ethansdyer: 1/ Super excited to introduce #Minerva 🦉. Minerva was trained on math and science found on the web…