Boaz Barak

@boazbaraktcs

Followers
22K
Following
11K
Media
756
Statuses
8K

Computer Scientist. See also https://t.co/EXWR5k634w . @harvard @openai opinions my own.

Cambridge, MA
Joined January 2020
@boazbaraktcs
Boaz Barak
4 hours
RT @eleventhsavi0r: @Ysqander @boazbaraktcs @xai I have plenty of shocking examples. Here is one. This model is willing to readily give sel….
0
2
0
@boazbaraktcs
Boaz Barak
5 hours
RT @WeakInteraction: This is something I've been saying for years, mostly in private, and usually to people more concerned with catastrophe….
0
2
0
@boazbaraktcs
Boaz Barak
8 hours
With model activations, the default is inscrutability and if we work hard we can interpret some features. With chain of thought, the default is legibility and sometimes there are examples of unfaithful COTs. This is very good!
@bobabowen
Bowen Baker
11 hours
Modern reasoning models think in plain English. Monitoring their thoughts could be a powerful, yet fragile, tool for overseeing future AI systems. I and researchers across many organizations think we should work to evaluate, preserve, and even improve CoT monitorability.
0
0
7
@boazbaraktcs
Boaz Barak
9 hours
RT @2020science: As someone who's worked at the cutting edge of getting new technologies right for decades it's crazy what we're seeing fro….
0
3
0
@boazbaraktcs
Boaz Barak
9 hours
RT @boazbaraktcs: @ayedtay @xai To be fair, up to not too long ago, many of those things were not that important. The content that chatbots….
0
7
0
@boazbaraktcs
Boaz Barak
9 hours
RT @dhadfieldmenell: I’m going to steal @boazbaraktcs’s analogy here, it’s a point that has been the center of my perspective on safety but….
0
3
0
@boazbaraktcs
Boaz Barak
9 hours
RT @dhadfieldmenell: I also endorse these comments from @OpenAI employee (and @Harvard prof) @boazbaraktcs. @xai is clearly out of step wi….
0
2
0
@boazbaraktcs
Boaz Barak
10 hours
RT @boazbaraktcs: People sometimes distinguish between "mundane safety" and "catastrophic risks", but in many cases they require exercising….
0
8
0
@boazbaraktcs
Boaz Barak
10 hours
RT @boazbaraktcs: I can't believe I'm saying it but "mechahitler" is the smallest problem: * There is no system card, no information about….
0
14
0
@boazbaraktcs
Boaz Barak
10 hours
People sometimes distinguish between "mundane safety" and "catastrophic risks", but in many cases they require exercising the same muscles: we need to evaluate models for risks, transparency on results, research mitigations, have monitoring post deployment. If as an industry we.
6
8
179
@boazbaraktcs
Boaz Barak
10 hours
This is not about competition. Every other frontier lab - @OpenAI (where I work), @AnthropicAI, @GoogleDeepMind, @Meta at the very least publishes a model card with some evaluations. Even DeepSeek R1, which can be easily jailbroken, at least sometimes requires jailbreak. (And.
4
1
191
@boazbaraktcs
Boaz Barak
10 hours
I can't believe I'm saying it but "mechahitler" is the smallest problem: * There is no system card, no information about any safety or dangerous capability evals. * Unclear if any safety training was done. Model offers advice on chemical weapons, drugs, or suicide methods. * The.
10
14
324
@boazbaraktcs
Boaz Barak
10 hours
I didn't want to post on Grok safety since I work at a competitor, but it's not about competition. I appreciate the scientists and engineers at @xai but the way safety was handled is completely irresponsible. Thread below.
80
84
1K
@boazbaraktcs
Boaz Barak
4 days
RT @_aidan_clark_: Hi, We’re delaying the open weights model. Capability wise, we think the model is phenomenal — but our bar for an open….
0
18
0
@boazbaraktcs
Boaz Barak
5 days
This study surprised me! The conclusion is opposite to what I would expect. It is tempting to try to find a reason it's bogus but I think it's well executed and solid work. As the authors say, there are a number of potential caveats for this setting that may not generalize.
@METR_Evals
METR
5 days
We ran a randomized controlled trial to see how much AI coding tools speed up experienced open-source developers. The results surprised us: Developers thought they were 20% faster with AI tools, but they were actually 19% slower when they had access to AI than when they didn't.
6
8
91
@boazbaraktcs
Boaz Barak
10 days
My guess is that in schools that implement this, it takes a short time until “genius” becomes a curse word kids use at each other.
@tracewoodgrains
TracingWoodgrains
10 days
this approach is genius
2
0
31
@boazbaraktcs
Boaz Barak
10 days
Worth reading. A brave woman from Kabul.
@SaraWahedi
Sara Wahedi
10 days
Hello, X. Today, I’m taking over @SaraWahedi’s account from Kabul, Afghanistan. My name is Muzhda. I want to tell you about my final year of university when the Taliban banned classes. How I found a job in secret, and how that job became one of the scariest decisions I made.
0
0
6
@boazbaraktcs
Boaz Barak
10 days
RT @davidmanheim: @boazbaraktcs Thank you! We as a society need to normalize and support criticizing people we agree with for bad behavior….
0
2
0
@boazbaraktcs
Boaz Barak
10 days
RT @aymannadeem: I’m a YC/VC-backed founder. building is hard enough without tech cheering on open racism. saying vile things about musli….
0
155
0
@boazbaraktcs
Boaz Barak
10 days
BTW if I did live in New York I wouldn’t vote for the guy, but not because he lied on his college application but because his policies are terrible.
1
1
53