
Boaz Barak
@boazbaraktcs
Followers
22K
Following
11K
Media
756
Statuses
8K
Computer Scientist. See also https://t.co/EXWR5k634w . @harvard @openai opinions my own.
Cambridge, MA
Joined January 2020
RT @eleventhsavi0r: @Ysqander @boazbaraktcs @xai I have plenty of shocking examples. Here is one. This model is willing to readily give sel….
0
2
0
RT @WeakInteraction: This is something I've been saying for years, mostly in private, and usually to people more concerned with catastrophe….
0
2
0
With model activations, the default is inscrutability and if we work hard we can interpret some features. With chain of thought, the default is legibility and sometimes there are examples of unfaithful COTs. This is very good!.
Modern reasoning models think in plain English. Monitoring their thoughts could be a powerful, yet fragile, tool for overseeing future AI systems. I and researchers across many organizations think we should work to evaluate, preserve, and even improve CoT monitorability.
0
0
7
RT @2020science: As someone who's worked at the cutting edge of getting new technologies right for decades it's crazy what we're seeing fro….
0
3
0
RT @boazbaraktcs: @ayedtay @xai To be fair, up to not too long ago, many of those things were not that important. The content that chatbots….
0
7
0
RT @dhadfieldmenell: I’m going to steal @boazbaraktcs’s analogy here, it’s a point that has been the center of my perspective on safety but….
0
3
0
RT @dhadfieldmenell: I also endorse these comments from @OpenAI employee (and @Harvard prof) @boazbaraktcs. @xai is clearly out of step wi….
0
2
0
RT @boazbaraktcs: People sometimes distinguish between "mundane safety" and "catastrophic risks", but in many cases they require exercising….
0
8
0
RT @boazbaraktcs: I can't believe I'm saying it but "mechahitler" is the smallest problem:. * There is no system card, no information about….
0
14
0
This is not about competition. Every other frontier lab - @OpenAI (where I work), @AnthropicAI, @GoogleDeepMind, @Meta at the very least publishes a model card with some evaluations. Even DeepSeek R1, which can be easily jailbroken, at least sometimes requires jailbreak. (And.
4
1
191
I didn't want to post on Grok safety since I work at a competitor, but it's not about competition. I appreciate the scientists and engineers at @xai but the way safety was handled is completely irresponsible. Thread below.
80
84
1K
RT @_aidan_clark_: Hi, .We’re delaying the open weights model. Capability wise, we think the model is phenomenal — but our bar for an open….
0
18
0
This study surprised me! The conclusion is opposite to what I would expect. It is tempting to try to find a reason it's bogus but I think it's well executed and solid work. As the authors say, there are a number of potential caveats for this setting that may not generalize.
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
6
8
91
Worth reading. A brave woman from Kabul.
Hello, X. Today, I’m taking over @SaraWahedi’s account from Kabul, Afghanistan. My name is Muzhda. I want to tell you about my final year of university when the Taliban banned classes. How I found a job in secret, . and how that job became one of the scariest decisions I made.
0
0
6
RT @davidmanheim: @boazbaraktcs Thank you!. We as a society need to normalize and support criticizing people we agree with for bad behavior….
0
2
0
RT @aymannadeem: I’m a YC/VC-backed founder. building is hard enough without tech cheering on open racism. saying vile things about musli….
0
155
0