
Bruce W. Lee
@BruceWLee2
Followers: 104 · Following: 146 · Media: 17 · Statuses: 85
RT @_jake_ward: Do reasoning models like DeepSeek R1 learn their behavior from scratch? No! In our new paper, we extract steering vectors f….
RT @jcyhc_ai: New SAGE-Eval results: Both o3 and Claude-sonnet-4 underperformed(!) their previous generations (o3 vs. o1, Claude-4 vs. Clau….
RT @OwainEvans_UK: New paper & surprising result. LLMs transmit traits to other models via hidden signals in data. Datasets consisting only….
RT @milesaturpin: New @Scale_AI paper! 🌟 LLMs trained with RL can exploit reward hacks but not mention this in their CoT. We introduce ver….
RT @balesni: A simple AGI safety technique: AI’s thoughts are in plain English, just read them. We know it works, with OK (not perfect) tra….
RT @justjoshinyou13: Grok 4 being trained on as much RL compute as pretraining compute is big if true. This seemed pretty inevitable but….
RT @Jeffaresalan: Our new ICML 2025 oral paper proposes a new unified theory of both Double Descent and Grokking, revealing that both of th….
RT @keyonV: Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formal….
RT @jiaxinwen22: New Anthropic research: We elicit capabilities from pretrained models using no external supervision, often competitive or….
RT @emmons_scott: Is CoT monitoring a lost cause due to unfaithfulness? 🤔 We say no. The key is the complexity of the bad behavior. When w….
RT @jcyhc_ai: Do LLMs show systematic generalization of safety facts to novel scenarios? Introducing our work SAGE-Eval, a benchmark consi….
RT @DanHendrycks: Many fields seem useful for thinking about frontier AI strategically, but most have little to contribute. Surprisingly u….
RT @MiTerekhov: AI Control is a promising approach for mitigating misalignment risks, but will it be widely adopted? The answer depends on….
RT @Turn_Trout: Thought real machine unlearning was impossible? We show that distilling a conventionally “unlearned” model creates a model….
RT @MariusHobbhahn: LLMs Often Know When They Are Being Evaluated! We investigate frontier LLMs across 1000 datapoints from 61 distinct da….
RT @hoyeon_chang: New preprint 📄 (with @jinho___park). Can neural nets really reason compositionally, or just match patterns? We present….
RT @DanHendrycks: Can AI meaningfully help with bioweapons creation? On our new Virology Capabilities Test (VCT), frontier LLMs display the….
RT @_tom_bush: Attending #ICLR2025 to present this research - please reach out if you want to chat about interp, reasoning, or anything els….