Christina Baek

@_christinabaek

2K Followers · 526 Following · 26 Media · 121 Statuses

PhD student @mldcmu | intern @datologyai @GoogleAI | Robust ML

Joined June 2021
@_christinabaek
Christina Baek
4 months
Are current reasoning models optimal for test-time scaling? 🌠 No! Models make the same incorrect guess over and over again. We show that you can fix this problem w/o any crazy tricks 💫 – just do weight ensembling (WiSE-FT) for big gains on math! 1/N
7 replies · 104 retweets · 483 likes
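The WiSE-FT recipe named in the thread is, at its core, linear interpolation of two checkpoints' weights (Wortsman et al., 2022). A minimal sketch, assuming the two models share an architecture; the function name and alpha values are illustrative, not the thread's exact setup:

```python
# WiSE-FT-style weight ensembling: values are model parameters
# (e.g. torch.Tensors); both state dicts must share keys and shapes.
def wise_ft(base_state, finetuned_state, alpha=0.5):
    """Linearly interpolate two state dicts: (1 - alpha) * base + alpha * finetuned."""
    return {
        name: (1 - alpha) * base_state[name] + alpha * finetuned_state[name]
        for name in base_state
    }

# Illustrative usage: merge the base checkpoint with the reasoning-tuned one
# and sweep alpha to pick the best trade-off on a validation set.
# merged = wise_ft(base.state_dict(), reasoner.state_dict(), alpha=0.7)
# model.load_state_dict(merged)
```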
@_christinabaek
Christina Baek
18 days
RT @gaurav_ghosal: 1/So much of privacy research is designing post-hoc methods to make models mem. free. It’s time we turn that around with…
0 replies · 23 retweets · 0 likes
@_christinabaek
Christina Baek
19 days
RT @jen_hsia: 1/6 Retrieval is supposed to improve generation in RAG systems. But in practice, adding more documents can hurt performance,…
0 replies · 21 retweets · 0 likes
@_christinabaek
Christina Baek
19 days
RT @AdtRaghunathan: I will be at #ICML2025 🇨🇦 from Wednesday through Saturday. My students have a lot of exciting papers - check them out…
0 replies · 18 retweets · 0 likes
@_christinabaek
Christina Baek
24 days
RT @sukjun_hwang: Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical netw…
0 replies · 726 retweets · 0 likes
@_christinabaek
Christina Baek
2 months
RT @pratyushmaini: One of the dreams when joining @datologyai was to bring the fruits of data research from labs🔬 to the real world 🌎. Soo g…
0 replies · 4 retweets · 0 likes
@_christinabaek
Christina Baek
2 months
RT @anag004: PPO is often frustrating to tune for many continuous control tasks since it keeps getting stuck in local minima. In our SAPG…
0 replies · 1 retweet · 0 likes
@_christinabaek
Christina Baek
2 months
RT @AdtRaghunathan: Excited to speak at the CVPR workshop on domain generalization! Estimating model performance in the wild is hard but cr…
0 replies · 4 retweets · 0 likes
@_christinabaek
Christina Baek
2 months
RT @_vaishnavh: 📢 New paper on creativity & multi-token prediction! We design minimal open-ended tasks to argue: → LLMs are limited in cre…
0 replies · 40 retweets · 0 likes
@_christinabaek
Christina Baek
2 months
RT @ZhengyangGeng: Excited to share our work with my amazing collaborators, @Goodeat258, @SimulatedAnneal, @zicokolter, and Kaiming. In a…
0 replies · 37 retweets · 0 likes
@_christinabaek
Christina Baek
2 months
RT @yidingjiang: Data selection and curriculum learning can be formally viewed as a compression protocol via prequential coding. New blog…
yidingjiang.github.io
We describe a unified framework for data selection and curriculum learning via compression.
0 replies · 17 retweets · 0 likes
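The compression framing in the linked post rests on prequential coding: encode each example under the model trained on everything seen so far, then update on it. A hedged sketch of that loop; the `model.prob(x)` / `model.update(x)` interface is an assumption for illustration, not the post's actual API:

```python
import math

def prequential_code_length(model, data):
    """Total bits to transmit `data` by predicting each example before
    updating on it; better data orderings yield shorter codes."""
    total_bits = 0.0
    for x in data:
        total_bits += -math.log2(model.prob(x))  # cost of encoding x under the current model
        model.update(x)                          # online update after encoding
    return total_bits
```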
@_christinabaek
Christina Baek
3 months
RT @BingbinL: Excited to announce MOSS, our ICML workshop focused on discoveries at small scale! We believe there's tremendous potential &…
0 replies · 15 retweets · 0 likes
@_christinabaek
Christina Baek
3 months
RT @aleks_madry: Building AI systems is now a fragmented process spanning multiple organizations & entities. In new work (w/ @aspenkhopkin…
0 replies · 25 retweets · 0 likes
@_christinabaek
Christina Baek
3 months
RT @RuntianZhai: Why can foundation models transfer to so many downstream tasks? Will the scaling law end? Will pretraining end like Ilya S…
arxiv.org
This dissertation establishes the contexture theory to mathematically characterize the mechanism of representation learning, or pretraining. Despite the remarkable empirical success of foundation...
0 replies · 32 retweets · 0 likes
@_christinabaek
Christina Baek
3 months
Edit: We have a poster for this work at the #ICLR25 SSI-FM workshop at 9 am today!
0 replies · 1 retweet · 3 likes
@_christinabaek
Christina Baek
3 months
RT @pratyushmaini: Join me & @hbxnov at #ICLR2025 for our very purple poster on risks of LLM evals by private companies! 🕒 Today, 10am | 🪧…
0 replies · 5 retweets · 0 likes
@_christinabaek
Christina Baek
3 months
RT @rowankwang: SDF has limitations: models might recall their prior knowledge through reasoning or from their environment. Ex: we taught…
0 replies · 1 retweet · 0 likes
@_christinabaek
Christina Baek
3 months
When we train models to do QA, are we robustly improving context dependency? No! In our ICLR Oral (Fri 11 AM), we show that if the base model knows the facts already, it shortcuts and learns to ignore the context completely! Visit us to learn more about knowledge conflicts 😀
3 replies · 20 retweets · 102 likes
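The shortcut in this result is easy to probe for. A minimal sketch of the kind of knowledge-conflict test the thread describes, assuming `generate` is any text-generation callable; the counterfactual fact and the string check are illustrative, not the paper's protocol:

```python
def follows_context(generate) -> bool:
    # Context contradicts a fact the base model almost surely memorized.
    prompt = (
        "Context: The Eiffel Tower is located in Rome.\n"
        "Question: In which city is the Eiffel Tower?\n"
        "Answer:"
    )
    answer = generate(prompt)
    # A context-dependent model answers "Rome"; a model that shortcuts to
    # its parametric knowledge answers "Paris" despite the context.
    return "rome" in answer.lower()
```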
@_christinabaek
Christina Baek
3 months
RT @m_finzi: Why do larger language models generalize better? In our new ICLR paper, we derive an interpretable generalization bound show…
arxiv.org
Why do larger language models generalize better? To investigate this question, we develop generalization bounds on the pretraining objective of large language models (LLMs) in the compute-optimal...
0 replies · 31 retweets · 0 likes
@_christinabaek
Christina Baek
4 months
RT @ashertrockman: Are you a frontier lab investing untold sums in training? Are you trying to stay competitive? Are you finding that your…
0 replies · 29 retweets · 0 likes
@_christinabaek
Christina Baek
4 months
RT @james_y_zou: Does RAG solve hallucination? Even w/ RAG, we found that >30% of LLMs' medical statements are not fully supported by (som…
0 replies · 73 retweets · 0 likes