
Christina Baek (@_christinabaek)
PhD student @mldcmu | intern @datologyai @GoogleAI | Robust ML
Joined June 2021
Followers: 2K · Following: 526 · Media: 26 · Statuses: 121
Are current reasoning models optimal for test-time scaling? 🌠 No! Models make the same incorrect guess over and over again. We show that you can fix this problem w/o any crazy tricks 💫 – just do weight ensembling (WiSE-FT) for big gains on math! 1/N
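The WiSE-FT weight ensembling mentioned in the thread above is, at its core, a linear interpolation between two checkpoints of the same architecture. Below is a minimal sketch of that interpolation in PyTorch; the mixing coefficient, checkpoint names, and file paths are illustrative assumptions, not details from the thread.

```python
import torch

def wise_ft_interpolate(base_state, finetuned_state, alpha=0.5):
    """WiSE-FT-style weight ensembling: theta = (1 - alpha) * theta_base + alpha * theta_ft.

    Both inputs are state_dicts of models with identical architectures.
    alpha = 0 recovers the base model, alpha = 1 the fine-tuned one.
    """
    merged = {}
    for name, base_param in base_state.items():
        ft_param = finetuned_state[name]
        if torch.is_floating_point(base_param):
            merged[name] = (1.0 - alpha) * base_param + alpha * ft_param
        else:
            # Integer buffers (e.g. step counters) cannot be interpolated; keep the fine-tuned copy.
            merged[name] = ft_param.clone()
    return merged

# Hypothetical usage: blend a base LLM with its reasoning-tuned counterpart.
# base = torch.load("base_model.pt")
# tuned = torch.load("reasoning_model.pt")
# model.load_state_dict(wise_ft_interpolate(base, tuned, alpha=0.7))
```

Sweeping alpha at evaluation time then trades off between the two checkpoints' behaviors without any retraining.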
RT @gaurav_ghosal: 1/So much of privacy research is designing post-hoc methods to make models mem. free. It’s time we turn that around with….
RT @jen_hsia: 1/6 Retrieval is supposed to improve generation in RAG systems. But in practice, adding more documents can hurt performance,….
RT @AdtRaghunathan: I will be at #ICML2025 🇨🇦 from Wednesday through Saturday. My students have a lot of exciting papers - check them out….
RT @sukjun_hwang: Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical netw….
RT @pratyushmaini: One of the dreams when joining @datologyai was to bring the fruits of data research from labs🔬 to the real world 🌎. Soo g….
RT @anag004: PPO is often frustrating to tune for many continuous control tasks since it keeps getting stuck in local minima. In our SAPG….
RT @AdtRaghunathan: Excited to speak at the CVPR workshop on domain generalization! Estimating model performance in the wild is hard but cr….
RT @_vaishnavh: 📢 New paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in cre….
RT @ZhengyangGeng: Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a….
RT @yidingjiang: Data selection and curriculum learning can be formally viewed as a compression protocol via prequential coding. New blog….
yidingjiang.github.io
We describe a unified framework for data selection and curriculum learning via compression.
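The blog post linked above treats data selection and curriculum learning as choosing data (and an ordering) that shortens a prequential code: each example is encoded with the model fit only on the examples seen before it. The sketch below computes that code length for a toy symbol stream; the Laplace-smoothed count model and the example strings are stand-ins of mine, not anything from the post.

```python
import math
from collections import Counter

def prequential_code_length(stream, alphabet_size):
    """Total bits to encode `stream` prequentially.

    Symbol t is encoded under the model fit on symbols 0..t-1, then the model
    is updated, so data that the online learner predicts well compresses well.
    Here the "model" is just Laplace-smoothed unigram counts.
    """
    counts = Counter()
    bits = 0.0
    for t, x in enumerate(stream):
        p = (counts[x] + 1) / (t + alphabet_size)  # predictive probability of x
        bits += -math.log2(p)
        counts[x] += 1  # update the model after encoding the symbol
    return bits

# Predictable data yields a shorter code than high-entropy data.
print(prequential_code_length("aaaaaaaa", alphabet_size=4))  # ~7.4 bits
print(prequential_code_length("abcdabcd", alphabet_size=4))  # ~18.7 bits
```

Note that with an exchangeable count model like this one the total is order-invariant; ordering only starts to matter once the predictor is a capacity-limited learner trained online (e.g. a network updated by SGD), which is where the curriculum-learning view comes in.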
RT @BingbinL: Excited to announce MOSS, our ICML workshop focused on discoveries at small scale! We believe there's tremendous potential &….
RT @aleks_madry: Building AI systems is now a fragmented process spanning multiple organizations & entities. In new work (w/ @aspenkhopkin….
RT @RuntianZhai: Why can foundation models transfer to so many downstream tasks? Will the scaling law end? Will pretraining end like Ilya S….
arxiv.org
This dissertation establishes the contexture theory to mathematically characterize the mechanism of representation learning, or pretraining. Despite the remarkable empirical success of foundation...
RT @pratyushmaini: Join me & @hbxnov at #ICLR2025 for our very purple poster on risks of LLM evals by private companies! 🕒 Today, 10am | 🪧….
RT @rowankwang: SDF has limitations: models might recall their prior knowledge through reasoning or from their environment. Ex: we taught….
RT @m_finzi: Why do larger language models generalize better? In our new ICLR paper, we derive an interpretable generalization bound show….
arxiv.org
Why do larger language models generalize better? To investigate this question, we develop generalization bounds on the pretraining objective of large language models (LLMs) in the compute-optimal...
RT @ashertrockman: Are you a frontier lab investing untold sums in training? Are you trying to stay competitive? Are you finding that your….
RT @james_y_zou: Does RAG solve hallucination? Even w/ RAG, we found that >30% of LLMs’ medical statements are not fully supported by (som….