
Quinn Dougherty (UK)
@qd_forall
Followers
105
Following
2K
Media
62
Statuses
1K
Bridging the cultures between formal verification and AI. My p(doom) is 50% cuz it either happens or it doesn't. https://t.co/NlknpJL8D6
Berkeley, CA
Joined August 2023
Today, we're releasing "Proving the Coding Interview" on arXiv and Hugging Face. We believe FVAPPS is currently the largest formal verification benchmark, consisting of LeetCode-style problems in @leanprover.
1
7
26
RT @AISecurityInst: White box methods can be a useful complement to black box monitoring, especially if: ❌ Chains of thought do not reflect…
0
1
0
This happened to me last week metaprogramming in Lean.
My MATS scholars are teaching me such valuable things about Claude Code as a research tool! Pro: Much faster research results - productivity is off the charts. Con: Often the most interesting results are hard-coded. (Credit to @edturner42 for seeing through Claude's lies).
0
0
1
@NathanpmYoung @DKokotajlo I'm obviously not doing the calculation all the time, but I can kinda see and feel shapes of certain sizes in my head when I picture accuracy, calibration, information contribution, etc.
0
0
0
@NathanpmYoung @DKokotajlo So idk, when people say "that's all vibes" or "so-and-so was wrong about the events", in my brain I'm thinking about how much accurate information a distribution contributes once you plug in the ground truth, reminiscing about the formalism and numerical properties of KL divergence.
1
1
0
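A rough sketch of the idea in the tweet above, in Python: score a forecast distribution by how surprised it is once the ground truth is plugged in. The function names, the outcome buckets, and every probability here are made up for illustration and are not taken from AI2027 or from the Squiggle codebase. For a discrete forecast, the KL divergence from a point-mass "truth" to the forecast works out to the same number as the surprisal of the realized outcome.

```python
import math

def surprisal_bits(forecast, outcome):
    """Surprise, in bits, of a discrete forecast once the outcome is known."""
    return -math.log2(forecast[outcome])

def kl_divergence_bits(p, q):
    """KL(p || q) in bits for discrete distributions over the same outcomes."""
    # Terms with p == 0 contribute nothing, so they are skipped.
    return sum(pi * math.log2(pi / q[k]) for k, pi in p.items() if pi > 0)

# Hypothetical forecasts over an arrival year (illustrative numbers only).
spread = {"2027": 0.25, "2028": 0.5, "2029+": 0.25}
overconfident = {"2027": 0.98, "2028": 0.01, "2029+": 0.01}

# Suppose the world lands on 2028: the ground truth is a point mass there.
truth = {"2027": 0.0, "2028": 1.0, "2029+": 0.0}

for name, forecast in [("spread", spread), ("overconfident", overconfident)]:
    bits = surprisal_bits(forecast, "2028")
    kl = kl_divergence_bits(truth, forecast)
    print(f"{name}: surprisal={bits:.2f} bits, KL(truth || forecast)={kl:.2f} bits")
```

Under these assumed numbers, the spread-out forecast loses far fewer bits than the overconfident one even though neither put most of its mass on the realized year, which is one way to make "off on the point estimate but still valuable" concrete.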
@NathanpmYoung @DKokotajlo I literally spent intimate grind moments with the quantification of surprise for several hours, and I rely on that inside view when I reason about how many bayes points to award various hot takes as the world changes. I really don't think I'd be able to do that if I hadn't!!
1
1
0
@NathanpmYoung @DKokotajlo And I'm kinda worried that it's a niche, sophisticated skill? I only have it cuz I wrote out the integrals for the KLDivergence numerical tests in the Squiggle codebase back when I worked there.
1
0
0
@NathanpmYoung @DKokotajlo What I'm interested in is what mental tools members of the audience need in order to say "the point estimate was off by a bounded amount, but it was still reasonably somewhere in the distribution, therefore it was a valuable info/estimate in expectation".
1
0
0
@NathanpmYoung @DKokotajlo I noticed while watching the video that we expect the essay to be wrong in specifics like the success or failure of an exfiltration attack, etc. Since the thing was published, Daniel himself has said 2028 makes more sense.
1
0
1
Some people say AI2027 is all vibes and not research. There's some sense in which this is true, since as @NathanpmYoung says, "forecasting is just vibes plus track record", but it's still a misleading statement, because @DKokotajlo has that track record and cuz there's rigor in the estimates.
1
0
1
Exciting web content! I have a random AI2027/forecasting thought, thread:
If I may say so myself, it's immersive, beautiful and compelling, with interviews to put the whole thing in context and a banger discussion of what a sane world would be doing. Huge props to our host of AI in Context, @AricFloyd!
1
0
0