Ben Harack Profile
Ben Harack

@benharack

Followers 277 · Following 4K · Media 13 · Statuses 1K

International Relations & AI. @aigioxford Within DPhil @Politics_Oxford. Former @hdx. Not here often. Find me at https://t.co/XOTH3Zrv4C

Oxford, UK
Joined March 2009
@FazlBarez
Fazl Barez 🔜 @NeurIPS
1 month
🚨New AI Safety Course @aims_oxford! I’m thrilled to launch a new course called AI Safety & Alignment (AISAA) on the foundations & frontier research of making advanced AI systems safe and aligned at @UniofOxford. What to expect 👇 https://t.co/r9YHS3XJhR
7
21
110
@jn_ahuja
Janvi Ahuja
4 months
I started this work as a verification skeptic. But being able to signal benignness (as @Miles_Brundage puts it) will likely be important in both national and foreign policy contexts. Happy to have been a small part of this massive undertaking by @BenHarack.
@Yoshua_Bengio
Yoshua Bengio
4 months
The future of AI governance may hinge on our ability to develop trusted and effective ways to make credible claims about AI systems. This new report expands our understanding of the verification challenge and maps out compelling areas for further work. ⬇️
1
1
21
@benharack
Ben Harack
4 months
16/ @janet_e_egan @ben_s_bucknall @rosen_br @araujonrenan @BoulaninSIPRI Ranjit Lall @FazlBarez Sanaa Alvira @Corin_Katzke Ahmad Atamli Amro Awad /end🧵
0
0
9
@benharack
Ben Harack
4 months
13/ Those who lived through or studied the Cold War may remember President Reagan reiterating the Russian proverb “Trust, but verify.” Just as it was with 1980s nuclear arms control, our ability to build new verification systems may be crucial for preserving peace today.
1
0
7
@benharack
Ben Harack
4 months
12/ If we build these more serious verification systems, we would be laying the foundation for international agreements over AI—which might end up being the most important international deals in the history of humanity.
1
0
6
@benharack
Ben Harack
4 months
11/ It seems possible to create similar verification exchanges that preserve security to an extreme degree, but we’ll need political action to get there. Our report goes into this in some detail. These setups might take about 1-3 years of intense effort to research and build.
1
0
7
@benharack
Ben Harack
4 months
10/ However, even if we scale this up, the most important secrets (think national security info, military AI models, or the Coca-Cola formula) are probably too sensitive to govern via just confidential computing. Further work is needed to safeguard these.
1
0
6
@benharack
Ben Harack
4 months
9/ Groups that use AI (including corporations and countries) will likewise place more trust in AI services that they can be sure are secure and appropriately governed. They may also request—or demand—this kind of thing in the future.
1
0
6
@benharack
Ben Harack
4 months
8/ This setup allows 1) users to feel safe and confident about services they pay for, 2) companies to expand their offerings to more sensitive domains, and 3) governments to check that rules are followed.
1
0
6
@benharack
Ben Harack
4 months
7/ An AI provider can prove that they abide by rules by having a set of third parties (e.g., AI testing companies and AI Safety / Security Institutes) securely test their models and systems. A user can trust a group of third parties a *lot* more than they trust the AI provider.
1
0
6
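To make the "trust a group of third parties" idea in 7/ concrete, here is a minimal k-of-n sketch in Python: a claim about a model is accepted only if enough independent testers have signed it. Everything here (the claim text, the keys, the threshold K) is illustrative, not something specified in the report.

```python
# Illustrative k-of-n check: accept a claim only if at least K of N
# independent third-party testers signed it. Keys, claim text, and the
# threshold are all hypothetical.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

claim = b"model-v3 passed the agreed evaluation suite"
testers = [Ed25519PrivateKey.generate() for _ in range(5)]  # N = 5 testers
signatures = [t.sign(claim) for t in testers[:4]]           # 4 of them sign

def count_valid(claim: bytes, signatures, public_keys) -> int:
    """Count signatures that verify under at least one tester key."""
    valid = 0
    for sig in signatures:
        for pk in public_keys:
            try:
                pk.verify(sig, claim)
                valid += 1
                break
            except InvalidSignature:
                continue
    return valid

K = 3  # trust threshold chosen by the verifying party
public_keys = [t.public_key() for t in testers]
print("accept claim:", count_valid(claim, signatures, public_keys) >= K)
```

A real scheme would also need to guard against replayed or duplicated signatures and tie each signature to a specific tester identity; the sketch only shows the threshold logic.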
@benharack
Ben Harack
4 months
6/ Confidential computing might be reliable enough for a company to make pretty strong claims about what they are *doing* (e.g., serving you inference with a given model and compute budget) and what they are *not doing* (e.g., copying your data).
1
0
6
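One way to picture the "strong claims about what they are doing and not doing" in 6/ is a policy manifest whose hash the verification machinery can check against the running service. A minimal sketch, with entirely hypothetical field names:

```python
# Minimal sketch: hash a canonical policy manifest so that a verifier can
# check the served configuration against it. Field names are hypothetical.
import hashlib
import json

manifest = {
    "model_digest": "sha256:ab12...",  # which model is actually served
    "compute_budget_flops": 1e15,      # claimed per-request compute budget
    "user_data_retention": "none",     # the "not copying your data" claim
}

# Canonical serialization so both sides hash identical bytes.
canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode()
measurement = hashlib.sha256(canonical).hexdigest()
print("manifest measurement:", measurement)
```

In a confidential-computing deployment, a hardware-signed attestation report would cover a measurement like this one; see the sketch after the next tweet.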
@benharack
Ben Harack
4 months
5/ Some of these technologies can be deployed *today*, such as confidential computing, which is available in recent hardware such as NVIDIA’s Hopper or Blackwell chips. These are good enough to get us started.
1
0
7
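The core mechanism behind the confidential computing mentioned in 5/ is remote attestation: hardware signs a measurement of the loaded workload, and a remote verifier checks that signature and compares the measurement against an expected value. The sketch below is schematic only; it is not NVIDIA's actual attestation API, and a locally generated key stands in for the hardware root of trust.

```python
# Schematic remote attestation check. In real deployments the device key
# sits in hardware and chains to a vendor certificate; here a locally
# generated key stands in for it.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

device_key = Ed25519PrivateKey.generate()  # stand-in for the hardware root

# The enclave reports a measurement of the loaded workload, signed by hardware.
measurement = b"sha256:measurement-of-loaded-model-and-code"
report_signature = device_key.sign(measurement)

# Verifier side: check the signature, then compare the measurement to the
# value the verifier expects (e.g., the hash of an audited configuration).
expected_measurement = b"sha256:measurement-of-loaded-model-and-code"
try:
    device_key.public_key().verify(report_signature, measurement)
    attested = measurement == expected_measurement
except InvalidSignature:
    attested = False
print("attestation accepted:", attested)
```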
@benharack
Ben Harack
4 months
4/ Luckily, decades of work have gone into privacy-preserving computational methods. Basically, these are hardware and cryptography techniques that allow one actor (the Prover) to prove something to another actor (the Verifier) without revealing all the underlying data.
1
0
9
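The Prover/Verifier framing in 4/ can be illustrated with the simplest privacy-preserving building block, a hiding commitment: the Prover publishes a hash that reveals nothing by itself and can later open it to a chosen party. A toy sketch (not a real zero-knowledge proof; the data and names are made up):

```python
# Toy commitment scheme: the Prover binds itself to secret data without
# revealing it, and can later open the commitment to a trusted auditor.
import hashlib
import secrets

def commit(data: bytes, nonce: bytes) -> str:
    """SHA-256 commitment; hiding as long as the nonce stays secret."""
    return hashlib.sha256(nonce + data).hexdigest()

# Prover side: commit to (secret) model weights and publish the commitment.
weights = b"...weights the Prover will not hand over..."
nonce = secrets.token_bytes(32)
commitment = commit(weights, nonce)  # safe to publish

# Later, the Prover opens the commitment only to an auditor, who confirms
# the published commitment matches what was inspected.
def auditor_check(data: bytes, nonce: bytes, commitment: str) -> bool:
    return commit(data, nonce) == commitment

assert auditor_check(weights, nonce, commitment)
```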
@benharack
Ben Harack
4 months
3/ But countries care about their security, so we can’t expect them to simply hand over all the information needed to prove that they’re following governance rules.
1
0
6
@benharack
Ben Harack
4 months
2/ International AI governance is desirable (for peace, security, and good lives), but it faces verification challenges because there’s no easy way to understand what someone else is doing on their computer without violating their security.
1
0
6
@benharack
Ben Harack
4 months
Governing AI requires international agreements, but cooperation can be risky if there’s no basis for trust. Our new report looks at how to verify compliance with AI agreements without sacrificing national security. This is neither impossible nor trivial. 🧵 1/
3
33
94
@FazlBarez
Fazl Barez 🔜 @NeurIPS
4 months
Excited to share our paper: "Chain-of-Thought Is Not Explainability"! We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren't necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵
28
137
658