Ouail Kitouni

@WKitouni

Followers: 65
Following: 56
Media: 16
Statuses: 110

Member of technical staff @Anthropic prev @MIT @Meta @MSFTResearch

San Francisco, CA
Joined August 2019
@WKitouni
Ouail Kitouni
1 year
RT @alexalbert__: Friday feature drop: Highlight text or code within an Artifact and quickly have Claude improve or explain the selection…
0
67
0
@WKitouni
Ouail Kitouni
1 year
We eating good tonight
Tweet media one
0
0
1
@WKitouni
Ouail Kitouni
1 year
RT @teortaxesTex: Thesis from @ilyasut: "to predict the next word, you have to predict the world". Antithesis from @ylecun: "AR-LLMs suck!…
0
21
0
@WKitouni
Ouail Kitouni
1 year
You can just use a different model to prioritize higher-signal tokens and generalize quicker. RHO-LOSS literally just works. (The fraction is the ratio of top-k tokens kept to total tokens; 1 is equivalent to no RHO-LOSS.)
Tweet media one
2
1
12
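A minimal sketch of the selection rule described above, assuming the standard RHO-LOSS recipe (score each token by training loss minus a frozen reference model's loss, keep the top fraction); all names and shapes here are illustrative, not taken from the tweet:

```python
# Hedged sketch of RHO-LOSS-style token selection: keep only the tokens whose
# loss is most "reducible", i.e. high under the training model but low under a
# separately trained reference model.
import torch
import torch.nn.functional as F

def rho_loss(logits, ref_logits, targets, keep_fraction=0.5):
    """Cross-entropy over the top-`keep_fraction` most reducible tokens.

    logits:     (batch, seq, vocab) from the model being trained
    ref_logits: (batch, seq, vocab) from a frozen reference model
    targets:    (batch, seq) token ids
    """
    vocab = logits.size(-1)
    train_loss = F.cross_entropy(
        logits.reshape(-1, vocab), targets.reshape(-1), reduction="none")
    with torch.no_grad():
        ref_loss = F.cross_entropy(
            ref_logits.reshape(-1, vocab), targets.reshape(-1), reduction="none")
    # Reducible loss: hard for the current model, easy for the reference
    # model => likely learnable, high-signal tokens rather than noise.
    reducible = train_loss - ref_loss
    k = max(1, int(keep_fraction * reducible.numel()))
    top_idx = reducible.topk(k).indices
    return train_loss[top_idx].mean()
```

With keep_fraction=1.0 this reduces to ordinary cross-entropy, matching the "1 is equivalent to no RHO-LOSS" note in the tweet.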
@WKitouni
Ouail Kitouni
1 year
Empire State dragon??
Tweet media one
0
0
0
@WKitouni
Ouail Kitouni
1 year
An interesting future direction could be dynamic horizon selection (it's much more difficult to predict the far future than the next token): how do we interpolate effectively from next-token to full any-to-any prediction?
0
0
2
@WKitouni
Ouail Kitouni
1 year
This simple change on top of BERT’s MLM makes the model a masked diffusion model on discrete states which has a nice correspondence with Permutation Language Modeling. PLM was notoriously difficult to train because permuted sequences are much harder to predict.
Tweet media one
1
0
1
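If the "simple change" is the usual one (sample the mask rate uniformly per sequence instead of BERT's fixed 15%, and reweight the loss accordingly), a hedged sketch of the resulting absorbing-state masked-diffusion objective looks like this; the exact ELBO weighting and every name below are assumptions, not the paper's code:

```python
# Hedged sketch: BERT-style MLM turned into a masked (absorbing-state)
# discrete diffusion objective by sampling the mask rate t ~ U(0,1) per
# sequence and weighting the masked-token loss by 1/t.
import torch
import torch.nn.functional as F

def masked_diffusion_loss(model, tokens, mask_id):
    """tokens: (batch, seq) ids; model maps ids -> (batch, seq, vocab) logits."""
    b, s = tokens.shape
    t = torch.rand(b, 1)                       # one noise level per sequence
    mask = torch.rand(b, s) < t                # mask each position w.p. t
    noised = torch.where(mask, torch.full_like(tokens, mask_id), tokens)
    logits = model(noised)
    ce = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens.reshape(-1),
        reduction="none").view(b, s)
    # Only masked positions contribute; the 1/t weight gives (one common
    # convention for) the diffusion ELBO under a linear noise schedule.
    per_seq = (ce * mask).sum(dim=1) / t.squeeze(1).clamp_min(1e-3)
    return per_seq.mean() / s
```

Fixing t at a constant 0.15 and dropping the reweighting recovers something close to vanilla BERT MLM, which is why this reads as a small change on top of it.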
@WKitouni
Ouail Kitouni
1 year
A simple change to what the model sees as input/target (the specific factorization the objective aims to optimize) resolves the reversal curse and allows a model to learn star-graph navigation (a task difficult to learn without changing the data)!
1
0
0
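A toy illustration of the factorization change, purely as an assumption about what input/target means here: under next-token prediction the context is always a prefix, while any-to-any lets an arbitrary subset of positions condition the prediction of the rest:

```python
# Illustrative only: contrasting the autoregressive factorization with an
# any-to-any factorization over (context, target) pairs. Not the paper's code.
import random

def next_token_pairs(seq):
    # Standard autoregressive factorization: prefix -> next token.
    return [(seq[:i], seq[i]) for i in range(1, len(seq))]

def any_to_any_pairs(seq, n_samples=4):
    # Any subset of positions can serve as context; the rest are targets.
    pairs = []
    idx = list(range(len(seq)))
    for _ in range(n_samples):
        observed = set(random.sample(idx, k=random.randint(1, len(seq) - 1)))
        context = {i: seq[i] for i in sorted(observed)}
        targets = {i: seq[i] for i in idx if i not in observed}
        pairs.append((context, targets))
    return pairs

# For "A is B", some samples condition on "B" and predict "A": exactly the
# direction that the reversal curse breaks under prefix-only training.
print(any_to_any_pairs(["A", "is", "B"]))
```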
@WKitouni
Ouail Kitouni
1 year
What if I told you they could store more if you tweak good ol' MLM into something more modern like masked diffusion? What if I also told you it could help the model **plan**? E.g., when you ask models to make predictions over longer horizons, they learn pathfinding on graphs.
Tweet media one
1
0
0
@WKitouni
Ouail Kitouni
1 year
[🚨Masked Diffusion vs. GPT🚨]
Don't predict next-token only, predict any-to-any. You'll get:
- Better knowledge storage
- No reversal curse
- Better planning
📄 LLMs are good at storing information, but not quite perfectly (hallucinations, reversal, etc.) 🧵
Tweet media one
2
3
11
@WKitouni
Ouail Kitouni
1 year
RT @summeryue0: 🚀 Introducing the SEAL Leaderboards! We rank LLMs using private datasets that can't be gamed. Vetted experts handle the rat…
0
34
0
@WKitouni
Ouail Kitouni
1 year
0
0
2
@WKitouni
Ouail Kitouni
1 year
5/ We also observe a correspondence between Principal Components and known terms in nuclear theory:
Tweet media one
1
0
1
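A sketch of the analysis style this thread describes, under the assumption that it amounts to PCA on learned per-nucleus embeddings correlated against semi-empirical mass-formula terms; the function names and the particular terms chosen are illustrative, not the paper's code:

```python
# Hedged sketch: extract principal components from learned (Z, N) embeddings
# and check each against classic liquid-drop-style terms from nuclear theory.
import numpy as np

def pca_components(embeddings, k=4):
    """embeddings: (num_nuclei, d) learned representations."""
    X = embeddings - embeddings.mean(axis=0)
    # Right singular vectors are the principal directions; project onto them.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:k].T  # (num_nuclei, k) PC scores

def correlate_with_theory(pcs, Z, N):
    A = Z + N
    # A few semi-empirical mass formula terms as candidate explanations.
    terms = {"volume": A,
             "surface": A ** (2 / 3),
             "coulomb": Z * (Z - 1) / A ** (1 / 3),
             "asymmetry": (N - Z) ** 2 / A}
    return {name: [abs(np.corrcoef(pcs[:, i], t)[0, 1])
                   for i in range(pcs.shape[1])]
            for name, t in terms.items()}
```

A high correlation between one PC and, say, the asymmetry term would be the kind of correspondence the figure shows.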
@WKitouni
Ouail Kitouni
1 year
4/ We observe similarly structured representations when training models on nuclear physics data. It turns out the model uses spirals as a geometric interpretation of the nuclear "liquid drop model".
Tweet media one
1
0
1
@WKitouni
Ouail Kitouni
1 year
3/ In previous work, we found transformers learn interpretable algorithms for modular addition. In some cases we even see these extremely human-readable, highly structured representations:
Tweet media one
1
0
1
@WKitouni
Ouail Kitouni
1 year
2/ E.g., can we study NN representations to (re)discover nuclear theory? We trained models on nuclear physics data and found that they learn representations strikingly similar to "human-derived" theory.
1
0
1
@WKitouni
Ouail Kitouni
1 year
1/ A lot of mech interp work lately focuses on understanding how language models work. A slightly different but fun question we wanted to explore in this paper: can interp say anything about models trained on scientific (specifically physics) data?
1
0
1
@WKitouni
Ouail Kitouni
2 years
Repo to reproduce Grokking in a few lines of code (Full batch GD, small MLP, modular addition):
Tweet media one
0
0
1
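Since the code itself is in the screenshot, here is a hedged reconstruction of the described setup (small MLP, modular addition, full-batch updates with weight decay); the optimizer choice and every hyperparameter below are assumptions, and the actual repo may use plain GD:

```python
# Hedged sketch of a minimal grokking reproduction: embed both operands,
# train a small MLP on (a + b) mod P with full-batch updates and strong
# weight decay, and watch test accuracy jump long after train loss is low.
import torch
import torch.nn as nn

P = 97  # modulus
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
train, test = perm[: len(pairs) // 2], perm[len(pairs) // 2:]

embed = nn.Embedding(P, 64)
mlp = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, P))
params = list(embed.parameters()) + list(mlp.parameters())
# Weight decay is what makes the delayed "grokking" generalization appear.
opt = torch.optim.AdamW(params, lr=1e-3, weight_decay=1.0)

def forward(idx):
    x = embed(pairs[idx])        # (n, 2, 64)
    return mlp(x.flatten(1))     # concatenate the two operand embeddings

for step in range(50_000):       # full-batch updates
    loss = nn.functional.cross_entropy(forward(train), labels[train])
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            acc = (forward(test).argmax(-1) == labels[test]).float().mean()
        print(step, loss.item(), acc.item())
```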
@WKitouni
Ouail Kitouni
2 years
Understanding the Pareto frontier will be key here. Also see:
0
0
0
@WKitouni
Ouail Kitouni
2 years
I think we'll see more such results as we confront a fundamental alignment issue: There's an irreducible tradeoff btwn helpfulness & harmlessness. A good model provides some harmful content for the greater good, while a terrible model is constrained, upholding unnecessary rules.
@AnthropicAI
Anthropic
2 years
New Anthropic Paper: Sleeper Agents. We trained LLMs to act secretly malicious. We found that, despite our best efforts at alignment training, deception still slipped through.
Tweet media one
1
0
2