Ben Harack
@benharack
Followers: 277 · Following: 4K · Media: 13 · Statuses: 1K
International Relations & AI. @aigioxford Within DPhil @Politics_Oxford. Former @hdx. Not here often. Find me at https://t.co/XOTH3Zrv4C
Oxford, UK
Joined March 2009
🚨New AI Safety Course @aims_oxford! I’m thrilled to launch a new course called AI Safety & Alignment (AISAA) on the foundations & frontier research of making advanced AI systems safe and aligned at @UniofOxford. What to expect 👇 https://t.co/r9YHS3XJhR
7
21
110
I started this work as a verification skeptic. But being able to signal benignness (as @Miles_Brundage puts it) will likely be important in both national and foreign policy contexts. Happy to have been a small part of this massive undertaking by @BenHarack.
The future of AI governance may hinge on our ability to develop trusted and effective ways to make credible claims about AI systems. This new report expands our understanding of the verification challenge and maps out compelling areas for further work. ⬇️
1
1
21
The future of AI governance may hinge on our ability to develop trusted and effective ways to make credible claims about AI systems. This new report expands our understanding of the verification challenge and maps out compelling areas for further work. ⬇️
Governing AI requires international agreements, but cooperation can be risky if there’s no basis for trust. Our new report looks at how to verify compliance with AI agreements without sacrificing national security. This is neither impossible nor trivial.🧵 1/
12
21
118
16/ @janet_e_egan @ben_s_bucknall @rosen_br @araujonrenan @BoulaninSIPRI Ranjit Lall @FazlBarez Sanaa Alvira @Corin_Katzke Ahmad Atamli Amro Awad /end🧵
0
0
9
15/ Thanks to @aigioxford for backing this project and all my coauthors: @RobertTrager @AnkaReuel @davidmanheim @Miles_Brundage @onni_aarne @aaronscher Yanliang Pan @jennywxiao @kristy_loke @SumayaNur_ @gbasg_ @nickacaputo @JuliaCMorse @jn_ahuja @IsabellaDuan
1
0
9
13/ Those who lived through or studied the Cold War may remember President Reagan reiterating the Russian proverb “Trust, but verify.” Just as with 1980s nuclear arms control, our ability to build new verification systems may be crucial for preserving peace today.
1
0
7
12/ If we build these more serious verification systems, we will be laying the foundation for international agreements over AI, which might end up being the most important international deals in the history of humanity.
1
0
6
11/ It seems possible to create similar verification exchanges that preserve security to an extreme degree, but we’ll need political action to get there. Our report goes into this in some detail. These setups might take about 1-3 years of intense effort to research and build.
1
0
7
10/ However, even if we scale this up, the most important secrets (think national security info, military AI models, or the Coca-Cola formula) are probably too sensitive to govern via confidential computing alone. Further work is needed to safeguard these.
1
0
6
9/ Groups that use AI (including corporations and countries) will likewise place more trust in AI services that they can be sure are secure and appropriately governed. They may also request—or demand—this kind of thing in the future.
1
0
6
8/ This setup allows 1) users to feel safe and confident about services they pay for, 2) companies to expand their offerings to more sensitive domains, and 3) governments to check that rules are followed.
1
0
6
7/ An AI provider can prove that they abide by rules by having a set of third parties (e.g., AI testing companies and AI Safety / Security Institutes) securely test their models and systems. A user can trust a group of third parties a *lot* more than they trust the AI provider.
1
0
6
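(A minimal sketch of the trust model in 7/, under assumptions not taken from the report: the user’s software accepts a claim about a model only if at least K of N independent third-party evaluators have signed the same claim digest. The evaluator names, the Ed25519 keys, the claim contents, and the threshold K are all illustrative; requires the `cryptography` package.)

```python
# Illustrative k-of-n endorsement check (hypothetical policy, not the report's design):
# trust a provider's claim only if enough independent third parties signed it.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

K = 2  # endorsements required before the user trusts the claim (assumed policy)

# Hypothetical third-party evaluators, each holding their own signing key.
evaluators = {name: Ed25519PrivateKey.generate()
              for name in ["lab_A", "institute_B", "auditor_C"]}

# The claim being endorsed: a digest binding a model artifact to an evaluation result.
claim = hashlib.sha256(b"model-weights-v1.2" + b"|passed-safety-eval-suite").digest()

# Each evaluator that actually ran the tests signs the claim.
signatures = {name: key.sign(claim) for name, key in evaluators.items()}
public_keys = {name: key.public_key() for name, key in evaluators.items()}

def count_valid_endorsements(claim: bytes, signatures: dict, public_keys: dict) -> int:
    """Count how many known third parties produced a valid signature over the claim."""
    valid = 0
    for name, sig in signatures.items():
        try:
            public_keys[name].verify(sig, claim)
            valid += 1
        except (KeyError, InvalidSignature):
            pass  # unknown evaluator or bad signature: does not count
    return valid

endorsements = count_valid_endorsements(claim, signatures, public_keys)
print(f"{endorsements} valid endorsements; trusted: {endorsements >= K}")
```

The point of the threshold is the one made in 7/: no single party, including the provider, has to be trusted on its own.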
6/ Confidential computing might be reliable enough for a company to make pretty strong claims about what they are *doing* (e.g., serving you inference with a given model and compute budget) and what they are *not doing* (e.g., copying your data).
1
0
6
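(To make 6/ concrete: a rough sketch in plain Python, not any real TEE or confidential-computing API, of the checks a user’s client could run on a provider’s attestation-style report before sending data. The field names, the “measurement” allow-list, and the HMAC stand-in for the hardware vendor’s signing key are all assumptions for illustration.)

```python
# Illustrative sketch only: the rough shape of a client-side attestation check.
import hashlib
import hmac
import os

# Measurements (hashes) of inference-server builds that auditors have reviewed and
# that are believed not to log or copy user data. Hypothetical values.
AUDITED_BUILDS = {hashlib.sha256(b"inference-server-build-2025.06").hexdigest()}

VENDOR_KEY = os.urandom(32)  # stand-in for the hardware vendor's attestation key

def sign_report(report: dict) -> bytes:
    """Stand-in for the signature a confidential-computing chip puts on its report."""
    blob = repr(sorted(report.items())).encode()
    return hmac.new(VENDOR_KEY, blob, hashlib.sha256).digest()

def verify_report(report: dict, signature: bytes, expected_nonce: str) -> bool:
    """Checks a user's client could run before sending sensitive data."""
    fresh = report.get("nonce") == expected_nonce              # not a replayed report
    audited = report.get("measurement") in AUDITED_BUILDS      # running a reviewed build
    genuine = hmac.compare_digest(sign_report(report), signature)  # vendor-signed
    return fresh and audited and genuine

# Example round trip: the provider's hardware produces a report; the client checks it.
nonce = os.urandom(8).hex()
report = {
    "measurement": hashlib.sha256(b"inference-server-build-2025.06").hexdigest(),
    "nonce": nonce,
}
sig = sign_report(report)
print("safe to send data:", verify_report(report, sig, nonce))
```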
5/ Some of these technologies can be deployed *today*, such as confidential computing, which is available in recent hardware such as NVIDIA’s Hopper or Blackwell chips. These are good enough to get us started.
1
0
7
4/ Luckily, decades of work have gone into privacy-preserving computational methods. Basically, they are tricks with hardware and cryptography that allow one actor (the Prover) to prove something to another actor (the Verifier) without revealing all the underlying data.
1
0
9
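(A toy instance of the Prover/Verifier idea in 4/, not the report’s actual proposal: a Merkle-tree commitment lets the Prover publish one hash over many records and later prove that a single record is included while keeping every other record hidden. The record contents below are made up.)

```python
# Toy privacy-preserving proof: prove one record belongs to a committed dataset by
# revealing only O(log n) sibling hashes; the other records stay hidden.
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Hash leaves pairwise until one root remains (duplicating a lone last node)."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def inclusion_proof(leaves, index):
    """Sibling hashes on the path from leaf `index` to the root."""
    level = [h(leaf) for leaf in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], index % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path

def verify_inclusion(root, leaf, path):
    """Verifier recomputes the root from one revealed record plus the sibling hashes."""
    node = h(leaf)
    for sibling, leaf_is_left in path:
        node = h(node + sibling) if leaf_is_left else h(sibling + node)
    return node == root

records = [b"training run 001", b"training run 002",
           b"training run 003", b"training run 004"]
root = merkle_root(records)                       # Prover publishes only this commitment
proof = inclusion_proof(records, 1)               # ...and later proves record 1 is in it
print(verify_inclusion(root, records[1], proof))  # Verifier checks, sees nothing else
```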
3/ But countries care about their security, so we can’t expect them to simply hand over all the information needed to prove that they’re following governance rules.
1
0
6
2/ International AI governance is desirable (for peace, security, and good lives), but it faces verification challenges because there’s no easy way to understand what someone else is doing on their computer without violating their security.
1
0
6
Governing AI requires international agreements, but cooperation can be risky if there’s no basis for trust. Our new report looks at how to verify compliance with AI agreements without sacrificing national security. This is neither impossible nor trivial.🧵 1/
3
33
94
Excited to share our paper: “Chain-of-Thought Is Not Explainability”! We unpack a critical misconception in AI: models explaining their Chain-of-Thought (CoT) steps aren’t necessarily revealing their true reasoning. Spoiler: transparency of CoT can be an illusion. (1/9) 🧵
28
137
658