Victor Lecomte Profile
Victor Lecomte

@vclecomte

Followers: 661 · Following: 5K · Media: 31 · Statuses: 244

CS PhD student at Stanford / Researcher at the Alignment Research Center

Joined March 2017
@vclecomte
Victor Lecomte
5 months
A cute question about inner product sketching came up in our research; any leads would be appreciated! 🙂
0
1
1
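The tweet doesn't spell out the question, but for readers unfamiliar with the term, the standard random-sign (AMS-style) inner product sketch can be illustrated with a small toy. Everything below is illustrative background, not from the thread: project both vectors through the same random ±1 matrix, and the scaled inner product of the sketches is an unbiased estimate of the true inner product.

```python
import random

def sketch(vec, signs):
    """Project a vector onto k random +/-1 directions (one AMS-style sketch)."""
    return [sum(s * v for s, v in zip(row, vec)) for row in signs]

def estimate_inner_product(x, y, k=256, seed=0):
    """Estimate <x, y> as (1/k) * <Sx, Sy> for a shared random sign matrix S."""
    rng = random.Random(seed)
    signs = [[rng.choice((-1, 1)) for _ in x] for _ in range(k)]
    sx, sy = sketch(x, signs), sketch(y, signs)
    return sum(a * b for a, b in zip(sx, sy)) / k

x = [1.0, 2.0, 0.0, -1.0]
y = [0.5, 1.0, 3.0, 2.0]
# True inner product: 0.5 + 2.0 + 0.0 - 2.0 = 0.5
est = estimate_inner_product(x, y, k=2000)
```

The estimator's variance shrinks like 1/k, so larger sketches trade space for accuracy.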
@robertwiblin
Rob Wiblin
6 months
A new legal letter aimed at OpenAI lays out in stark terms the money and power grab OpenAI is trying to trick its board members into accepting — what one analyst calls "the theft of the millennium." The simple facts of the case are both devastating and darkly hilarious. I'll
438
1K
5K
@RyanPGreenblatt
Ryan Greenblatt
10 months
New Redwood Research (@redwood_ai) paper in collaboration with @AnthropicAI: We demonstrate cases where Claude fakes alignment when it strongly dislikes what it is being trained to do. (Thread)
@AnthropicAI
Anthropic
10 months
New Anthropic research: Alignment faking in large language models. In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training, while actually maintaining its original preferences.
11
45
362
@METR_Evals
METR
11 months
How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.
15
173
841
@GabrielDWu1
Gabriel Wu
1 year
The Alignment Research Center (ARC) just released our first empirical paper: Estimating the Probabilities of Rare Outputs in Language Models. In this thread, I'll motivate the problem of low probability estimation and describe our setting/methods. 🧵
1
8
135
@ericneyman
Eric Neyman
1 year
Last week, ARC put out a new paper! The paper is a research update on the "heuristic estimation" direction of our research into explaining neural network behavior. The paper starts by explaining what we mean by "heuristic estimation", through an example and three analogies 🧵
1
6
27
@vclecomte
Victor Lecomte
1 year
(In fact, I set up my custom domain in October 2021, less than one month before GitHub Pages added the ability to verify custom domains: https://t.co/krZcV37dq1)
0
0
0
@vclecomte
Victor Lecomte
1 year
And GitHub never informed me that my account had been temporarily demoted, or that the custom domain had been taken over by another user. I've verified the domain name now, but it's kind of wild to me that the system was built on such shaky ground to start with. [3/3]
1
0
0
@vclecomte
Victor Lecomte
1 year
status for <24h because GitHub took too long to reverify my educational discount, which temporarily disabled the private repository that my website is built on. This allowed another GitHub user to swoop in and claim the domain. [2/?]
1
0
0
@vclecomte
Victor Lecomte
1 year
Website is back up at https://t.co/1uKeL3u4WW! The issue was that back when I set up the domain, GitHub Pages had no way to verify that the user who publishes to a custom domain name is the owner of that domain name. And apparently, earlier this month, I lost my GitHub Pro [1/?]
1
0
0
@vclecomte
Victor Lecomte
1 year
Service update: Looks like my custom domain name is currently serving a scam site, despite everything looking fine on Hover. I'll try to fix this as soon as possible, but in the meantime, if you want to access any of my notes, please use https://t.co/G3ApOxdZ9p. Sorry!
victorlecomte.com
1
0
0
@vclecomte
Victor Lecomte
1 year
This is the first paper I've worked on at ARC, and I think it's pretty cool! 🙂
@JacobHHilton
Jacob Hilton
1 year
ARC's latest ML theory paper is on a formal backdoor detection game between an attacker, who must insert a backdoor into a function at a randomly-chosen input, and a defender, who must detect when the backdoor is active at runtime. (Thread)
0
0
6
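The thread describes the game's interface but not its mechanics. A toy version of that interface (purely illustrative; ARC's paper does not give the defender a trusted reference function, and all names here are invented) looks like this: the attacker plants a deviation at a secret trigger input, and the defender must flag exactly the inputs where the deviation is active.

```python
import random

def base_fn(x):
    """The 'intended' function the attacker must mostly imitate."""
    return x % 7

def make_backdoored(trigger):
    """Attacker: agree with base_fn everywhere except the secret trigger."""
    def f(x):
        return base_fn(x) + 1 if x == trigger else base_fn(x)
    return f

def defender(f, x, reference=base_fn):
    """Toy defender with oracle access to a trusted reference:
    flag an input iff the deployed function disagrees with it."""
    return f(x) != reference(x)

rng = random.Random(0)
trigger = rng.randrange(1000)
f = make_backdoored(trigger)
flagged_on_trigger = defender(f, trigger)          # True: backdoor is active
flagged_on_clean = defender(f, (trigger + 1) % 1000)  # False: clean input
```

The interesting regime in the paper is when no such trusted reference exists and the defender must work from the function's structure alone; this sketch only fixes intuitions about the win condition.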
@Michael05156007
Michael Cohen
1 year
We sent this letter to @GavinNewsom this morning. He should sign SB 1047! 🧵
11
32
112
@kushal1t
Kushal Thaman
2 years
Excited to share the first paper of my undergrad: "Incidental Polysemanticity" https://t.co/0UdguIhGHK! We present a second, "incidental" origin story of polysemanticity in task-optimized DNNs. Done in collaboration with @vclecomte @tmychow @RylanSchaeffer @sanmikoyejo (1/n)
@RylanSchaeffer
Rylan Schaeffer
2 years
Interested in mech interp of representations that deep networks learn? If so, check out a new type of polysemanticity we call: 💥💥Incidental Polysemanticity 💥💥 Led by @vclecomte @kushal1t @tmychow @sanmikoyejo at @stai_research @StanfordAILab https://t.co/6uTzEj15Ev 1/N
9
7
43
@vclecomte
Victor Lecomte
2 years
My first foray into studying learning dynamics (and into AI safety-related work)! It was a lot of fun figuring out the exact speed at which encodings get sparser under L1 regularization; I didn't expect the math to end up being so nice. 🙂
@RylanSchaeffer
Rylan Schaeffer
2 years
Interested in mech interp of representations that deep networks learn? If so, check out a new type of polysemanticity we call: 💥💥Incidental Polysemanticity 💥💥 Led by @vclecomte @kushal1t @tmychow @sanmikoyejo at @stai_research @StanfordAILab https://t.co/6uTzEj15Ev 1/N
1
1
11
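The "exact speed" remark can be illustrated with a one-weight toy (illustrative only, not the paper's setup): under a pure L1 penalty, subgradient descent shrinks a weight by a constant lr·λ per step, so it decays linearly and hits exactly zero in finitely many steps.

```python
def l1_step(w, lam, lr):
    """One subgradient step on the pure L1 penalty lam * |w|.
    Clamping at zero mimics the soft-threshold: the weight never overshoots."""
    if w > 0:
        return max(0.0, w - lr * lam)
    if w < 0:
        return min(0.0, w + lr * lam)
    return 0.0

w, lam, lr = 1.0, 0.1, 0.5
traj = [w]
for _ in range(25):
    w = l1_step(w, lam, lr)
    traj.append(w)
# The weight shrinks by a constant lr * lam = 0.05 per step,
# reaching zero after ~20 steps and staying there.
```

Contrast with L2 regularization, which shrinks weights multiplicatively and so never reaches exact zero; the constant-rate decay is what makes L1 produce genuinely sparse encodings.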
@vclecomte
Victor Lecomte
2 years
I've been getting into ML theory recently, and it was a pleasure to learn from @RylanSchaeffer while working on this project!
@RylanSchaeffer
Rylan Schaeffer
2 years
Excited to begin announcing our #NeurIPS2023 workshop & conference papers (1/10)! 🔥🚀An Information-Theoretic Understanding of Maximum Manifold Capacity Representations🚀🔥 w/ amazing cast @vclecomte @BerivanISIK @sanmikoyejo @ziv_ravid @Andr3yGR @KhonaMikail @ylecun 1/7
0
0
12
@vclecomte
Victor Lecomte
2 years
Aside from that, I also wrote a note about Shearer's lemma (https://t.co/gnFQOJ8ETM) and one about Gilmer's breakthrough lower bound on the union-closed sets conjecture (https://t.co/oU79uHGqF0), which uses an entropic argument.
0
0
1
@vclecomte
Victor Lecomte
2 years
It ends up being the right quantity to look at for proving a bunch of things, including the edge-isoperimetric inequality (https://t.co/iICn7H1bus), the level-1 inequality (https://t.co/gSmArDZWxp), and through that, even the Hoeffding bound (https://t.co/qG2ckzqYxz)!
1
0
1
@vclecomte
Victor Lecomte
2 years
In particular I got really into "entropy deficit", which is the KL divergence between a variable X and the uniform distribution (or equivalently, "how much less entropy X has compared to uniform").
1
0
0
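The definition in the tweet translates directly into a few lines of code. A minimal sketch (function names are mine, not from the notes), using the identity KL(p ‖ uniform) = log2(n) − H(p) for a distribution p over n outcomes:

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(q * log2(q) for q in p if q > 0)

def entropy_deficit(p):
    """Entropy deficit = KL(p || uniform over the same n outcomes)
                       = log2(n) - H(p)."""
    return log2(len(p)) - entropy(p)

uniform = [0.25] * 4
point_mass = [1.0, 0.0, 0.0, 0.0]
# uniform has zero deficit; a point mass over 4 outcomes has the
# maximum deficit, log2(4) = 2 bits.
```

The deficit is always nonnegative and measures exactly "how far from fully random" X is, which is what makes it a convenient potential function in the proofs the thread mentions.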
@vclecomte
Victor Lecomte
2 years
... starting with a bunch of notes about information theory (https://t.co/tDL2uMKxUp). I wrote notes on basic notions like entropy, mutual information and KL divergence.
1
0
1