Yizhou Liu

@YizhouLiu0

Followers: 401 · Following: 162 · Media: 18 · Statuses: 72

PhD student at @MITMechE | Physics of living systems, Complex systems, Statistical physics

Cambridge, MA
Joined October 2022
@YizhouLiu0
Yizhou Liu
3 months
Superposition means that models represent more features than they have dimensions, which is true for LLMs since there are too many things in language to represent. We find that superposition leads to a power-law decay of loss with width, which explains the observed neural scaling law. (1/n)
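A minimal numerical sketch (not the paper's actual derivation) of why superposition can produce power-law behavior: random unit "feature" vectors in R^d have mean squared pairwise overlap of roughly 1/d, so the interference incurred by packing many more features than dimensions decays as a power law in width. The function name and feature counts below are illustrative choices:

```python
import numpy as np

def mean_sq_overlap(n_features, dim, rng):
    """Mean squared overlap between random unit 'feature' vectors in R^dim."""
    v = rng.normal(size=(n_features, dim))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    g = v @ v.T                               # Gram matrix of pairwise overlaps
    off_diag = g[~np.eye(n_features, dtype=bool)]
    return np.mean(off_diag ** 2)

# Superposition regime: 4x more features than dimensions.
# The mean squared interference between features scales like 1/dim,
# so interference-driven loss falls as a power law in width.
rng = np.random.default_rng(0)
for dim in (64, 256, 1024):
    print(dim, mean_sq_overlap(4 * dim, dim, rng))   # ≈ 1/dim
```

The 1/d overlap is a generic high-dimensional geometry fact; the thread's claim is that this kind of interference term is what shows up as the width scaling of the loss.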
@YizhouLiu0
Yizhou Liu
6 days
RT @PRX_Life: Despite classical sign rules saying that noise correlations hurt coding, they can help when correlations are strong and fine…
@YizhouLiu0
Yizhou Liu
19 days
RT @AnthropicAI: New Anthropic research: Persona vectors. Language models sometimes go haywire and slip into weird and unsettling personas…
@YizhouLiu0
Yizhou Liu
27 days
RT @PNASNews: A trending PNAS article in the last week is “Optimistic people are all alike: Shared neural representations supporting episod…
@YizhouLiu0
Yizhou Liu
1 month
RT @fchollet: Today we're releasing a developer preview of our next-gen benchmark, ARC-AGI-3. The goal of this preview, leading up to the…
@YizhouLiu0
Yizhou Liu
3 months
RT @iScienceLuvr: How much do language models memorize? "We formally separate memorization into two components: unintended memorization, t…
@YizhouLiu0
Yizhou Liu
3 months
RT @AnthropicAI: Our interpretability team recently released research that traced the thoughts of a large language model. Now we’re open-s…
@YizhouLiu0
Yizhou Liu
3 months
RT @AllysonSgro: Are you a student or postdoc working on theory for biological problems? Just over two weeks left to apply for our fall wor…
janelia.org
Attendees will have the opportunity to present as well as learn from one another. They will give 20-minute talks on their own research questions, as well as in-depth 45-minute whiteboard tutorials on
@YizhouLiu0
Yizhou Liu
3 months
RT @catherineliangq: Curious why disentangled representation is insufficient for compositional generalization?🧐 Our new ICML study reveals…
@YizhouLiu0
Yizhou Liu
3 months
RT @weijie444: I just wrote a position paper on the relation between statistics and large language models: Do Large Language Models (Reall…
arxiv.org
Large language models (LLMs) represent a new paradigm for processing unstructured data, with applications across an unprecedented range of domains. In this paper, we address, through two...
@YizhouLiu0
Yizhou Liu
3 months
RT @_AndrewZhao: LLMs are Headless Chickens
@YizhouLiu0
Yizhou Liu
3 months
RT @robert_csordas: Your language model is wasting half of its layers to just refine probability distributions rather than doing interestin…
@YizhouLiu0
Yizhou Liu
3 months
RT @xuandongzhao: 🚀 Excited to share the most inspiring work I’ve been part of this year: "Learning to Reason without External Rewards"…
@YizhouLiu0
Yizhou Liu
3 months
RT @yafuly: 🎉 Excited to share our recent work: Scaling Reasoning, Losing Control. 🧠 LLMs get better at math… but worse at following instru…
@YizhouLiu0
Yizhou Liu
3 months
It would be more impressive if LLMs could do what they do today without any of these capabilities. And even if they lack them now, it may not be hard to develop such abilities in the future…
@leecronin
Prof. Lee Cronin
3 months
It is trivial to explain why a LLM can never ever be conscious or intelligent. Utterly trivial. It goes like this - LLMs have zero causal power. Zero agency. Zero internal monologue. Zero abstracting ability. Zero understanding of the world. They are tools for conscious beings.
@YizhouLiu0
Yizhou Liu
3 months
Elegant mapping! We should believe in the existence of something universal behind large complex systems, large language models included.
@ZimingLiu11
Ziming Liu
3 months
Interested in the science of language models but tired of neural scaling laws? Here's a new perspective: our new paper presents neural thermodynamic laws -- thermodynamic concepts and laws naturally emerge in language model training! AI is naturAl, not Artificial, after all.
@YizhouLiu0
Yizhou Liu
3 months
The study of neural scaling laws can be refined by distinguishing between width-limited and depth-limited regimes. In each regime, the loss should decay differently with model size, dataset size, and training steps, highlighting the need for further investigation. (12/n)
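Scaling exponents like the ones discussed in this thread are typically estimated by linear regression in log-log space. A self-contained sketch on synthetic data (the exponent 0.3, the prefactor, and the widths are hypothetical illustrations, not values from the thread):

```python
import numpy as np

def fit_power_law(x, loss):
    """Fit loss ≈ a * x**(-alpha) by linear regression in log-log space."""
    slope, intercept = np.polyfit(np.log(x), np.log(loss), 1)
    return np.exp(intercept), -slope   # (prefactor a, exponent alpha)

# Synthetic losses following a hypothetical power law plus small noise.
widths = np.array([64, 128, 256, 512, 1024, 2048])
rng = np.random.default_rng(1)
loss = 2.0 * widths**-0.3 * np.exp(rng.normal(0, 0.01, widths.size))

a, alpha = fit_power_law(widths, loss)
print(f"a ≈ {a:.2f}, alpha ≈ {alpha:.2f}")
```

Distinguishing width-limited from depth-limited regimes would amount to fitting such a law separately within each regime rather than pooling all model sizes into one fit.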
@YizhouLiu0
Yizhou Liu
3 months
Pre-training loss is a key indicator of model performance, but not the only metric of interest. At the same loss level, LLMs with different degrees of superposition may differ in emergent abilities such as reasoning or trainability via reinforcement learning. (11/n)
@YizhouLiu0
Yizhou Liu
3 months
Recognizing that superposition benefits LLMs, we propose that encouraging superposition could enable smaller models to match the performance of larger ones and make training more efficient. (10/n).
@YizhouLiu0
Yizhou Liu
3 months
If our framework accounts for the observed neural scaling laws, then this kind of scaling is reaching its limits: not because increasing model dimension is impossible, but because it is inefficient. (9/n)
@YizhouLiu0
Yizhou Liu
3 months
LLMs agree with the toy-model results in the strong-superposition regime, from the underlying overlaps between representations to the loss scaling with model dimension. (8/n)