Victor Lecomte Profile
Victor Lecomte

@vclecomte

Followers: 661 · Following: 5K · Media: 31 · Statuses: 244

CS PhD student at Stanford / Researcher at the Alignment Research Center

Joined March 2017
@vclecomte
Victor Lecomte
5 months
A cute question about inner product sketching came up in our research; any leads would be appreciated! 🙂
0
1
1
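The tweet doesn't spell out the question, but for readers unfamiliar with the term, the standard random-sign (AMS-style) inner product sketch can be illustrated with a small toy. Everything below is illustrative background, not from the thread: project both vectors through the same random ±1 matrix, and the scaled inner product of the sketches is an unbiased estimate of the true inner product.

```python
import random

def sketch(vec, signs):
    """Project a vector onto k random +/-1 directions (one AMS-style sketch)."""
    return [sum(s * v for s, v in zip(row, vec)) for row in signs]

def estimate_inner_product(x, y, k=256, seed=0):
    """Estimate <x, y> as (1/k) * <Sx, Sy> for a shared random sign matrix S."""
    rng = random.Random(seed)
    signs = [[rng.choice((-1, 1)) for _ in x] for _ in range(k)]
    sx, sy = sketch(x, signs), sketch(y, signs)
    return sum(a * b for a, b in zip(sx, sy)) / k

x = [1.0, 2.0, 0.0, -1.0]
y = [0.5, 1.0, 3.0, 2.0]
# True inner product: 0.5 + 2.0 + 0.0 - 2.0 = 0.5
est = estimate_inner_product(x, y, k=2000)
```

The estimator's variance shrinks like 1/k, so larger sketches trade space for accuracy.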
@robertwiblin
Rob Wiblin
6 months
A new legal letter aimed at OpenAI lays out in stark terms the money and power grab OpenAI is trying to trick its board members into accepting — what one analyst calls "the theft of the millennium." The simple facts of the case are both devastating and darkly hilarious. I'll
438
1K
5K
@RyanPGreenblatt
Ryan Greenblatt
10 months
New Redwood Research (@redwood_ai) paper in collaboration with @AnthropicAI: We demonstrate cases where Claude fakes alignment when it strongly dislikes what it is being trained to do. (Thread)
@AnthropicAI
Anthropic
10 months
New Anthropic research: Alignment faking in large language models. In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training, while actually maintaining its original preferences.
11
45
362
@METR_Evals
METR
11 months
How close are current AI agents to automating AI R&D? Our new ML research engineering benchmark (RE-Bench) addresses this question by directly comparing frontier models such as Claude 3.5 Sonnet and o1-preview with 50+ human experts on 7 challenging research engineering tasks.
15
173
841
@GabrielDWu1
Gabriel Wu
1 year
The Alignment Research Center (ARC) just released our first empirical paper: Estimating the Probabilities of Rare Outputs in Language Models. In this thread, I'll motivate the problem of low probability estimation and describe our setting/methods. 🧵
1
8
135
@ericneyman
Eric Neyman
1 year
Last week, ARC put out a new paper! The paper is a research update on the "heuristic estimation" direction of our research into explaining neural network behavior. The paper starts by explaining what we mean by "heuristic estimation", through an example and three analogies 🧵
1
6
27
@vclecomte
Victor Lecomte
1 year
(In fact, I set up my custom domain in October 2021, less than one month before GitHub Pages added the ability to verify custom domains: https://t.co/krZcV37dq1)
0
0
0
@vclecomte
Victor Lecomte
1 year
And GitHub never informed me that my account had been temporarily demoted, or that the custom domain had been taken over by another user. I've verified the domain name now, but it's kind of wild to me that the system was built on such shaky ground to start with. [3/3]
1
0
0
@vclecomte
Victor Lecomte
1 year
status for <24h because GitHub took too long to reverify my educational discount, which temporarily disabled the private repository that my website is built on. This allowed another GitHub user to swoop in and claim the domain. [2/?]
1
0
0
@vclecomte
Victor Lecomte
1 year
Website is back up at https://t.co/1uKeL3u4WW! The issue was that back when I set up the domain, GitHub Pages had no way to verify that the user who publishes to a custom domain name is the owner of that domain name. And apparently, earlier this month, I lost my GitHub Pro [1/?]
1
0
0
@vclecomte
Victor Lecomte
1 year
Service update: Looks like my custom domain name is currently serving a scam site, despite everything looking fine on Hover. I'll try to fix this as soon as possible, but in the meantime, if you want to access any of my notes, please use https://t.co/G3ApOxdZ9p. Sorry!
victorlecomte.com
1
0
0
@vclecomte
Victor Lecomte
1 year
This is the first paper I've worked on at ARC, and I think it's pretty cool! 🙂
@JacobHHilton
Jacob Hilton
1 year
ARC's latest ML theory paper is on a formal backdoor detection game between an attacker, who must insert a backdoor into a function at a randomly-chosen input, and a defender, who must detect when the backdoor is active at runtime. (Thread)
0
0
6
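The thread describes the game's interface but not its mechanics. A toy version of that interface (purely illustrative; ARC's paper does not give the defender a trusted reference function, and all names here are invented) looks like this: the attacker plants a deviation at a secret trigger input, and the defender must flag exactly the inputs where the deviation is active.

```python
import random

def base_fn(x):
    """The 'intended' function the attacker must mostly imitate."""
    return x % 7

def make_backdoored(trigger):
    """Attacker: agree with base_fn everywhere except the secret trigger."""
    def f(x):
        return base_fn(x) + 1 if x == trigger else base_fn(x)
    return f

def defender(f, x, reference=base_fn):
    """Toy defender with oracle access to a trusted reference:
    flag an input iff the deployed function disagrees with it."""
    return f(x) != reference(x)

rng = random.Random(0)
trigger = rng.randrange(1000)
f = make_backdoored(trigger)
flagged_on_trigger = defender(f, trigger)          # True: backdoor is active
flagged_on_clean = defender(f, (trigger + 1) % 1000)  # False: clean input
```

The interesting regime in the paper is when no such trusted reference exists and the defender must work from the function's structure alone; this sketch only fixes intuitions about the win condition.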
@Michael05156007
Michael Cohen
1 year
We sent this letter to @GavinNewsom this morning. He should sign SB 1047! 🧵
11
32
112
@kushal1t
Kushal Thaman
2 years
Excited to share the first paper of my undergrad: "Incidental Polysemanticity" https://t.co/0UdguIhGHK! We present a second, "incidental" origin story of polysemanticity in task-optimized DNNs. Done in collaboration with @vclecomte @tmychow @RylanSchaeffer @sanmikoyejo (1/n)
@RylanSchaeffer
Rylan Schaeffer
2 years
Interested in mech interp of representations that deep networks learn? If so, check out a new type of polysemanticity we call: 💥💥Incidental Polysemanticity 💥💥 Led by @vclecomte @kushal1t @tmychow @sanmikoyejo at @stai_research @StanfordAILab https://t.co/6uTzEj15Ev 1/N
9
7
43
@vclecomte
Victor Lecomte
2 years
My first foray into studying learning dynamics (and into AI safety-related work)! It was a lot of fun figuring out the exact speed at which encodings get sparser under L1 regularization; I didn't expect the math to end up being so nice. 🙂
@RylanSchaeffer
Rylan Schaeffer
2 years
Interested in mech interp of representations that deep networks learn? If so, check out a new type of polysemanticity we call: 💥💥Incidental Polysemanticity 💥💥 Led by @vclecomte @kushal1t @tmychow @sanmikoyejo at @stai_research @StanfordAILab https://t.co/6uTzEj15Ev 1/N
1
1
11
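The "exact speed" remark can be illustrated with a one-weight toy (illustrative only, not the paper's setup): under a pure L1 penalty, subgradient descent shrinks a weight by a constant lr·λ per step, so it decays linearly and hits exactly zero in finitely many steps.

```python
def l1_step(w, lam, lr):
    """One subgradient step on the pure L1 penalty lam * |w|.
    Clamping at zero mimics the soft-threshold: the weight never overshoots."""
    if w > 0:
        return max(0.0, w - lr * lam)
    if w < 0:
        return min(0.0, w + lr * lam)
    return 0.0

w, lam, lr = 1.0, 0.1, 0.5
traj = [w]
for _ in range(25):
    w = l1_step(w, lam, lr)
    traj.append(w)
# The weight shrinks by a constant lr * lam = 0.05 per step,
# reaching zero after ~20 steps and staying there.
```

Contrast with L2 regularization, which shrinks weights multiplicatively and so never reaches exact zero; the constant-rate decay is what makes L1 produce genuinely sparse encodings.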
@vclecomte
Victor Lecomte
2 years
I've been getting into ML theory recently, and it was a pleasure to learn from @RylanSchaeffer while working on this project!
@RylanSchaeffer
Rylan Schaeffer
2 years
Excited to begin announcing our #NeurIPS2023 workshop & conference papers (1/10)! 🔥🚀An Information-Theoretic Understanding of Maximum Manifold Capacity Representations🚀🔥 w/ amazing cast @vclecomte @BerivanISIK @sanmikoyejo @ziv_ravid @Andr3yGR @KhonaMikail @ylecun 1/7
0
0
12
@vclecomte
Victor Lecomte
2 years
Aside from that, I also wrote a note about Shearer's lemma (https://t.co/gnFQOJ8ETM) and one about Gilmer's breakthrough lower bound on the union-closed sets conjecture (https://t.co/oU79uHGqF0), which uses an entropic argument.
0
0
1
@vclecomte
Victor Lecomte
2 years
It ends up being the right quantity to look at for proving a bunch of things, including the edge-isoperimetric inequality (https://t.co/iICn7H1bus), the level-1 inequality (https://t.co/gSmArDZWxp), and through that, even the Hoeffding bound (https://t.co/qG2ckzqYxz)!
1
0
1
@vclecomte
Victor Lecomte
2 years
In particular I got really into "entropy deficit", which is the KL divergence between a variable X and the uniform distribution (or equivalently, "how much less entropy X has compared to uniform").
1
0
0
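The definition in the tweet translates directly into a few lines of code. A minimal sketch (function names are mine, not from the notes), using the identity KL(p ‖ uniform) = log2(n) − H(p) for a distribution p over n outcomes:

```python
from math import log2

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(q * log2(q) for q in p if q > 0)

def entropy_deficit(p):
    """Entropy deficit = KL(p || uniform over the same n outcomes)
                       = log2(n) - H(p)."""
    return log2(len(p)) - entropy(p)

uniform = [0.25] * 4
point_mass = [1.0, 0.0, 0.0, 0.0]
# uniform has zero deficit; a point mass over 4 outcomes has the
# maximum deficit, log2(4) = 2 bits.
```

The deficit is always nonnegative and measures exactly "how far from fully random" X is, which is what makes it a convenient potential function in the proofs the thread mentions.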
@vclecomte
Victor Lecomte
2 years
... starting with a bunch of notes about information theory (https://t.co/tDL2uMKxUp). I wrote notes on basic notions like entropy, mutual information and KL divergence.
1
0
1