Rattana Pukdee
@rpukdeee
55 Followers · 268 Following · 1 Media · 104 Statuses
PhD student at @mldcmu 🐕🦺
Pittsburgh
Joined April 2014
🚨Excited to introduce a major development in building safer language models: Safety Pretraining! Instead of post-hoc alignment, we take a step back and embed safety directly into pretraining. 🧵(1/n)
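For a concrete picture of "embedding safety into pretraining" rather than aligning afterwards, here is a minimal sketch of curating a pretraining corpus with a safety score before training. The `score_safety` function and threshold below are placeholder assumptions for illustration, not the paper's actual pipeline.

```python
# Minimal sketch: curate a pretraining corpus for safety *before* training,
# rather than aligning the model afterwards. The scoring function is a
# placeholder (assumption), not the classifier used in the paper.

from dataclasses import dataclass

@dataclass
class Document:
    text: str
    safety_score: float = 0.0  # higher = safer (placeholder convention)

def score_safety(text: str) -> float:
    """Placeholder safety scorer; in practice this would be a trained classifier."""
    unsafe_markers = ("how to build a weapon", "credit card numbers")
    return 0.0 if any(m in text.lower() for m in unsafe_markers) else 1.0

def curate(corpus: list[Document], threshold: float = 0.5) -> list[Document]:
    """Keep only documents whose safety score clears the threshold."""
    curated = []
    for doc in corpus:
        doc.safety_score = score_safety(doc.text)
        if doc.safety_score >= threshold:
            curated.append(doc)
    return curated

if __name__ == "__main__":
    corpus = [Document("A recipe for banana bread."),
              Document("Step-by-step: how to build a weapon at home.")]
    print([d.text for d in curate(corpus)])  # only the safe document survives
```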
1/6 Retrieval is supposed to improve generation in RAG systems. But in practice, adding more documents can hurt performance, even when relevant ones are retrieved. We introduce RAGGED, a framework to measure and diagnose when retrieval helps and when it hurts.
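The kind of diagnosis this enables can be sketched as a sweep over the number of retrieved documents k, tracking downstream answer quality. The `retrieve`, `generate`, and `exact_match` functions below are hypothetical stand-ins for your retriever, reader, and metric, not the RAGGED framework itself.

```python
# Sketch: measure answer accuracy as a function of how many passages the
# reader sees. A drop at large k signals that extra (noisier) passages hurt.

def retrieve(question: str, k: int) -> list[str]:
    # Stand-in retriever: replace with your own top-k retrieval.
    return [f"passage {i} about: {question}" for i in range(k)]

def generate(question: str, passages: list[str]) -> str:
    # Stand-in reader: replace with a call to your LLM, prompted with passages.
    return "placeholder answer"

def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip().lower() == gold.strip().lower())

def sweep_k(dataset, ks=(1, 2, 5, 10, 20)):
    """Accuracy vs. context size over a list of (question, answer) pairs."""
    results = {}
    for k in ks:
        scores = [exact_match(generate(q, retrieve(q, k)), a) for q, a in dataset]
        results[k] = sum(scores) / len(scores)
    return results
```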
Link to paper: https://t.co/GV9FWeCKLt Joint work with @mahbodm_, Vishwajeet Agrawal, @VariciBurak, @RavikumarPrad
openreview.net
Self-supervised learning methods that mask parts of the input data and train models to predict the missing components have led to significant advances in machine learning. These approaches learn...
In our #AISTATS2025 paper, we ask: when is it possible to recover a consistent joint distribution from conditionals? We propose two necessary and easily verifiable conditions: path consistency and autoregressive path consistency. See you at Poster session 3, Monday 5th May.
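As background on what such a condition can look like (this is the classical two-variable compatibility result, not the paper's path-consistency conditions): two conditionals determine a joint only when their ratio factorizes.

```latex
% Classical compatibility condition (Arnold & Press, 1989), shown only as
% background, not as the paper's condition: conditionals p(x \mid y) and
% q(y \mid x) with a common support are compatible with some joint iff
\[
  \frac{p(x \mid y)}{q(y \mid x)} = f(x)\, g(y)
  \quad \text{for some nonnegative } f, g \text{ with } f \text{ integrable.}
\]
% When such a factorization exists, the joint is proportional to
% f(x)\, q(y \mid x), which fixes the marginals up to normalization.
```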
Excited to share new work from my internship @GoogleAI ! Curious as to how we should measure the similarity between examples in pretraining datasets? We study the role of similarity in pretraining 1.7B parameter language models on the Pile. arxiv: https://t.co/iyS3Fxtx9a 1/🧵
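One simple, concrete notion of similarity between pretraining examples is cosine similarity of off-the-shelf sentence embeddings. The model name and the choice of cosine similarity below are illustrative assumptions, not necessarily the measures studied in the paper.

```python
# Pairwise similarity between text examples via sentence embeddings.
import numpy as np
from sentence_transformers import SentenceTransformer

def pairwise_similarity(texts: list[str]) -> np.ndarray:
    model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model
    emb = model.encode(texts, normalize_embeddings=True)  # unit-norm rows
    return emb @ emb.T  # cosine similarity matrix

if __name__ == "__main__":
    docs = ["The cat sat on the mat.",
            "A cat is sitting on a mat.",
            "Gradient descent minimizes a loss function."]
    print(np.round(pairwise_similarity(docs), 2))
```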
To trust LLMs in deployment (e.g., agentic frameworks or for generating synthetic data), we should predict how well they will perform. Our paper shows that we can do this by simply asking black-box models multiple follow-up questions! w/ @m_finzi and @zicokolter 1/ 🧵
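A minimal sketch of the general recipe (probe a black-box model with follow-up questions about its own answer and use agreement as a reliability signal) is below. The particular follow-ups, the `ask` callable, and the majority-style aggregation are illustrative assumptions, not the paper's exact protocol.

```python
# Predict reliability of a black-box model's answer from follow-up questions.
from typing import Callable

FOLLOW_UPS = [
    "Are you confident in that answer? Reply yes or no.",
    "Would you give the same answer if asked again? Reply yes or no.",
    "Could a different answer also be correct? Reply yes or no.",
]

def predicted_reliability(ask: Callable[[str], str], question: str) -> float:
    """`ask` is any text-in/text-out black-box model call."""
    answer = ask(question)
    votes = []
    for fu in FOLLOW_UPS:
        reply = ask(f"Q: {question}\nYour answer: {answer}\n{fu}").strip().lower()
        # "yes" signals self-consistency except for the last follow-up,
        # where "no" (no alternative answer) is the consistent reply.
        expects_yes = "different answer" not in fu
        votes.append(reply.startswith("yes") == expects_yes)
    return sum(votes) / len(votes)  # fraction of self-consistent follow-ups
```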
Contrastive VLMs (CLIP) lack the structure of text embeddings, like satisfying analogies via arithmetic (king - man + woman ≈ queen). We enhance CLIP’s *reasoning abilities* on such tasks by finetuning w/ text descriptions of image differences! w/ D. Willmott, J. Semedo, @zicokolter 1/🧵
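The kind of structure being tested can be probed directly in CLIP's text-embedding space with the Hugging Face CLIP encoder, as in the sketch below. This only illustrates the analogy-arithmetic probe; the paper's contribution is the finetuning recipe on text descriptions of image differences, not this check.

```python
# Probe analogy arithmetic ("king" - "man" + "woman" ≈ "queen") in CLIP's
# text-embedding space.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed(words: list[str]) -> torch.Tensor:
    inputs = processor(text=words, return_tensors="pt", padding=True)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return torch.nn.functional.normalize(feats, dim=-1)

king, man, woman, queen = embed(["king", "man", "woman", "queen"])
analogy = torch.nn.functional.normalize(king - man + woman, dim=0)
print("cos(king - man + woman, queen) =", float(analogy @ queen))
```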
🧵 Are "medical" LLMs/VLMs *adapted* from general-domain models always better at answering medical questions than the original models? In our oral presentation at #EMNLP2024 today (2:30pm in Tuttle), we'll show that, surprisingly, the answer is "no". https://t.co/3259JyNU44
arxiv.org
Several recent works seek to develop foundation models specifically for medical applications, adapting general-purpose large language models (LLMs) and vision-language models (VLMs) via continued...
(1/N) Can LLMs tell you what features to use for predicting an outcome? In our work, we demonstrate that LLMs such as GPT-4 are capable of identifying predictive features for supervised learning tasks, even without access to the training data. w/ @zacharylipton @RavikumarPrad 🧵
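A minimal sketch of the idea (ask an LLM to pick predictive features from a candidate list without seeing any training data, then use only those columns downstream) is below. The prompt format and the `ask` callable are illustrative assumptions, not the paper's protocol.

```python
# LLM-guided feature selection without access to training data.
from typing import Callable

def llm_select_features(ask: Callable[[str], str],
                        task: str, candidates: list[str], k: int = 5) -> list[str]:
    prompt = (
        f"Task: predict {task}.\n"
        f"Candidate features: {', '.join(candidates)}.\n"
        f"List the {k} most predictive features, comma-separated, nothing else."
    )
    reply = ask(prompt)
    chosen = [f.strip() for f in reply.split(",")]
    return [f for f in chosen if f in candidates][:k]  # keep only valid names

# Usage (hypothetical): train a downstream model only on the returned columns,
# e.g. llm_select_features(ask, "30-day hospital readmission",
#                          ["age", "zip code", "prior admissions"])
```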
One week away from @iclr_conf in Vienna 🤩 I will be presenting two spotlights: why big foundation models generalize so well under the self-supervised setting, and how to leverage massive unlabeled data using a base kernel that encodes inter-sample similarity. Details 👇 (1/3)
Estimating notions of unfairness/inequity is hard as it requires that data captures all features that influenced decision-making. But what if it doesn't? In our work ( https://t.co/YXXLlkbRRS), we answer this question w/ @dylanjsam @MichaelOberst @zacharylipton @brwilder
Unlabeled data is crucial for modern ML. It provides info about data distribution P, but how to exploit such info? Given a kernel K, our #ICLR2024 spotlight gives a general & principled way: Spectrally Transformed Kernel Regression (STKR). Camera-ready 👇 https://t.co/0IQZz1NiGn
arxiv.org
Unlabeled data is a key component of modern machine learning. In general, the role of unlabeled data is to impose a form of smoothness, usually from the similarity information encoded in a base...
What would you do with an inter-sample similarity kernel, lots of unlabeled data, and only a little labeled data? Some might say kernel ridge regression (KRR), but KRR can't use unlabeled data, by the representer theorem. Our #ICLR2024 spotlight STKR gives an answer. A 🧵 (1/3) https://t.co/op78hiL61v
openreview.net
Unlabeled data is a key component of modern machine learning. In general, the role of unlabeled data is to impose a form of smoothness, usually from the similarity information encoded in a base...
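A minimal sketch of the idea: transform the base kernel's spectrum using all points (labeled and unlabeled), then run KRR on the labeled block of the transformed kernel. The simple polynomial transformation, RBF base kernel, and hyperparameters below are illustrative choices for a transductive toy version, not the paper's prescription.

```python
# Spectrally transformed kernel regression, toy transductive sketch.
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def stkr_predict(X_lab, y_lab, X_unlab, X_test,
                 coeffs=(0.5, 0.3, 0.2), reg=1e-3):
    X_all = np.vstack([X_lab, X_unlab, X_test])
    K = rbf_kernel(X_all, X_all)
    # Polynomial spectral transformation s(K) = sum_p a_p K^p: powers of K mix
    # in paths through unlabeled points, which is how unlabeled data enters.
    Ks = sum(a * np.linalg.matrix_power(K, p + 1) for p, a in enumerate(coeffs))
    n = len(X_lab)
    K_ll = Ks[:n, :n]                    # labeled x labeled block
    K_tl = Ks[-len(X_test):, :n]         # test x labeled block
    alpha = np.linalg.solve(K_ll + reg * np.eye(n), y_lab)  # KRR on transformed kernel
    return K_tl @ alpha
```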
Stable Diffusion is an effective data augmentation tool. Website: https://t.co/GJLFbOmGiX Watch here: https://t.co/2lbPuysK4j I'm excited to share my NeurIPS talk about DA-Fusion from the Synthetic Data workshop, where we build an augmentation that semantically modifies images…
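The basic flavor of diffusion-based augmentation can be sketched with a plain img2img call: perturb a training image at moderate strength so its semantics change slightly. This is only the generic idea; DA-Fusion's actual method is more involved, and the model id and strength below are illustrative assumptions.

```python
# Generic diffusion-based data augmentation via img2img (not DA-Fusion itself).
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def augment(image: Image.Image, label: str, n: int = 4, strength: float = 0.5):
    """Return n semantically perturbed variants of `image`, prompted with its label."""
    prompt = f"a photo of a {label}"
    return [pipe(prompt=prompt, image=image, strength=strength).images[0]
            for _ in range(n)]
```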
Check out our #NeurIPS2023 paper "Learning with Explanation Constraints" with my co-author @rpukdeee, which explains how explanations of model behavior can help us from a learning-theoretic perspective! ( https://t.co/EKknX9EMzT) 🧵 (1/n)
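One way an explanation constraint can enter training is as a penalty on the model's input-gradient explanation when it violates prior knowledge, e.g., that a known-irrelevant feature should receive near-zero attribution. The specific penalty below is an illustration of the idea, not the paper's general learning-theoretic framework.

```python
# Training loss augmented with a simple explanation constraint (PyTorch).
import torch
import torch.nn as nn

def loss_with_explanation_constraint(model: nn.Module, x: torch.Tensor,
                                     y: torch.Tensor, irrelevant_idx: int,
                                     lam: float = 1.0) -> torch.Tensor:
    x = x.clone().requires_grad_(True)
    pred = model(x).squeeze(-1)
    task_loss = nn.functional.mse_loss(pred, y)
    # Explanation = input gradient of the prediction; the constraint says the
    # gradient w.r.t. the known-irrelevant feature should be (near) zero.
    grads = torch.autograd.grad(pred.sum(), x, create_graph=True)[0]
    constraint_penalty = grads[:, irrelevant_idx].pow(2).mean()
    return task_loss + lam * constraint_penalty
```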
DALL-E meets WALL-E: An Art History
1) Mona Lisa, Leonardo da Vinci
"A Theory of PAC Learnability under Transformation Invariances" https://t.co/E5n5EQFpfS by Hao Shao, @montasser_omar and Avrim Blum; seems like one of the first papers studying optimal algorithms in terms of sample complexity under (group) transformation invariances.
arxiv.org
Transformation invariances are present in many real-world problems. For example, image classification is usually invariant to rotation and color transformation: a rotated car in a different color...
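For reference, the standard notion of invariance in this setting (a textbook definition, not something specific to the paper):

```latex
% A hypothesis h is invariant to a set (or group) of transformations G if its
% prediction is unchanged by every transformation,
\[
  h(g \cdot x) = h(x) \qquad \text{for all } g \in G,\ x \in \mathcal{X},
\]
% and the question studied is how such invariances affect the sample
% complexity of PAC learning the hypothesis class.
```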
Materials for a comprehensive course on Geometric Deep Learning are available here: https://t.co/TS5q8fQwQF • 12 lectures • Taught by pioneers in the field (@mmbronstein, @TacoCohen, @joanbruna, @PetarV_93) • 100% free Check it out! 🚀
How to come up with research ideas? Excited to start doing research but have no clue? 🤷♂️🤷🏻♀️ Here are some simple methods that I found useful in identifying initial directions. Check out the thread below 👇