Berkeley AI Research
@berkeley_ai
Followers: 234K · Following: 383 · Media: 41 · Statuses: 1K
We're graduate students, postdocs, faculty and scientists at the cutting edge of artificial intelligence research.
Berkeley, CA
Joined July 2017
New on @berkeley_ai blog by @seohong_park: divide-and-conquer value learning for off-policy RL. No TD bootstrapping. Scales to long horizons. https://t.co/VLspSfmRmW 🐻📄
bair.berkeley.edu
The BAIR Blog
0 replies · 8 reposts · 28 likes
We don't fully understand the preferences human feedback encodes, so training on it can be risky. We propose a method to automatically discover these preferences! We identify unsafe, contradictory, and subjective preferences, and improve model safety, eval, and personalization.
📣NEW PAPER! What's In My Human Feedback? (WIMHF) 🔦 Human feedback can induce unexpected/harmful changes to LLMs, like overconfidence or sycophancy. How can we forecast these behaviors ahead of time? Using SAEs, WIMHF automatically extracts these signals from preference data.
2 replies · 10 reposts · 86 likes
📣NEW PAPER! What's In My Human Feedback? (WIMHF) 🔦 Human feedback can induce unexpected/harmful changes to LLMs, like overconfidence or sycophancy. How can we forecast these behaviors ahead of time? Using SAEs, WIMHF automatically extracts these signals from preference data.
8 replies · 35 reposts · 209 likes
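The recipe in the two WIMHF posts above can be made concrete with a small sketch: train a sparse autoencoder over embeddings of preference-pair responses, then look for features whose activations separate chosen from rejected answers. Everything below (architecture, sizes, loss weights) is an illustrative assumption, not the WIMHF implementation:

```python
# Minimal sparse-autoencoder sketch for preference data (illustrative only;
# not the WIMHF code). Assumes you already have response embeddings.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=768, d_features=8192):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x):
        feats = torch.relu(self.encoder(x))  # sparse, non-negative codes
        return self.decoder(feats), feats

sae = SparseAutoencoder()
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
l1_weight = 1e-3  # sparsity pressure; arbitrary choice for this sketch

def train_step(x):  # x: (batch, d_model) embeddings of responses
    recon, feats = sae(x)
    loss = ((recon - x) ** 2).mean() + l1_weight * feats.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

After training, scoring each feature by how well its activation difference between chosen and rejected responses predicts the preference label surfaces candidate preferences, the kind of unsafe, contradictory, or subjective signals the posts describe.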
Honored to be selected as an #AI2050 Senior Fellow by @SchmidtSciences, alongside inspiring researchers shaping AI for the benefit of humanity. Excited to continue advancing the work on AI, AI Safety & Security — building AI-powered systems that are safer, more trustworthy…
We're excited to welcome 28 new AI2050 Fellows! This 4th cohort of researchers is pursuing projects that include building AI scientists, designing trustworthy models, and improving biological and medical research, among other areas. https://t.co/8oY7xdhxvF
5 replies · 13 reposts · 101 likes
LLMs have dominated recent work on simulating human behaviors. But do you really need them? In discrete‑choice settings, our answer is: not necessarily. A lightweight graph neural network (GNN) can match or beat strong LLM-based methods. Paper: https://t.co/WvMRy4DdjR 🧵👇
3 replies · 14 reposts · 53 likes
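For a sense of how lightweight a discrete-choice GNN can be: one round of message passing produces a utility per alternative, trained with the classic multinomial-logit likelihood over each choice set. The architecture and graph construction below are assumptions for illustration, not the paper's design:

```python
# Toy discrete-choice model: one message-passing round scores each item,
# and a softmax over the choice set gives choice probabilities.
import torch
import torch.nn as nn

class ChoiceGNN(nn.Module):
    def __init__(self, d=32):
        super().__init__()
        self.msg = nn.Linear(d, d)
        self.update = nn.Linear(2 * d, d)
        self.score = nn.Linear(d, 1)

    def forward(self, item_feats, adj):
        # item_feats: (n, d) alternative features; adj: (n, n) 0/1 graph
        m = torch.relu(self.msg(item_feats))
        agg = adj @ m / adj.sum(1, keepdim=True).clamp(min=1)  # neighbor mean
        h = torch.relu(self.update(torch.cat([item_feats, agg], dim=-1)))
        return self.score(h).squeeze(-1)  # one utility per alternative

def choice_nll(utilities, chosen):
    # Multinomial-logit likelihood: softmax over the choice set.
    return -torch.log_softmax(utilities, dim=0)[chosen]

n, d = 5, 32
model = ChoiceGNN(d)
utilities = model(torch.randn(n, d), (torch.rand(n, n) < 0.4).float())
choice_nll(utilities, chosen=2).backward()  # chooser picked alternative 2
```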
Our new work on controlling recsys with natural language, led by @MicahCarroll, with great collaborators Addie Foote, @kjfeng_, Marcus Williams, @ancadianadragan, @wbradknox
https://t.co/5mZdCDNZTP
arxiv.org
When users are dissatisfied with recommendations from a recommender system, they often lack fine-grained controls for changing them. Large language models (LLMs) offer a solution by allowing users...
5 replies · 7 reposts · 41 likes
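The abstract's pitch, in code shape: hand the user's natural-language critique plus the current slate to an LLM and ask for a re-ranking. A toy sketch of that pattern; the prompt, the JSON contract, and `call_llm` are all placeholders, not the paper's system:

```python
# Sketch of language-based control over a recommendation slate.
import json

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your chat-completion client here")

def rerank(candidates: list[str], user_feedback: str) -> list[str]:
    prompt = (
        "A user gave this feedback about their recommendations:\n"
        f"{user_feedback!r}\n\n"
        "Reorder the items below to satisfy the feedback. "
        "Reply with a JSON list of the item strings, best first.\n"
        f"{json.dumps(candidates)}"
    )
    ranked = json.loads(call_llm(prompt))
    # Keep only known items, in case the model hallucinates new ones.
    return [c for c in ranked if c in candidates]
```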
What really matters in matrix-whitening optimizers (Shampoo/SOAP/PSGD/Muon)? We ran a careful comparison, dissecting each algorithm. Interestingly, we find that proper matrix-whitening can be seen as *two* transformations, and not all optimizers implement both. Blog:
5 replies · 48 reposts · 326 likes
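The "two transformations" framing has a compact linear-algebra reading. Whitening a gradient matrix G means replacing it with UVᵀ from its SVD G = USVᵀ (the orthogonalization Muon targets), and that same UVᵀ factors as a left preconditioner (GGᵀ)^(−1/4) and a right preconditioner (GᵀG)^(−1/4), the two Shampoo-style transformations. Whether this matches the blog's exact framing is my assumption; the identity itself is easy to check numerically:

```python
# Check: SVD orthogonalization == two-sided (-1/4)-power preconditioning.
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(64, 32))

# View 1: whiten by discarding singular-value magnitudes (G -> U V^T).
U, s, Vt = np.linalg.svd(G, full_matrices=False)
whitened_svd = U @ Vt

# View 2: left factor (G G^T)^(-1/4) and right factor (G^T G)^(-1/4).
def inv_quarter(M, eps=1e-8):
    w, Q = np.linalg.eigh(M)        # M is symmetric PSD
    w = np.clip(w, 0.0, None)       # drop tiny negative roundoff eigenvalues
    return Q @ np.diag((w + eps) ** -0.25) @ Q.T

whitened_two_sided = inv_quarter(G @ G.T) @ G @ inv_quarter(G.T @ G)

print(np.allclose(whitened_svd, whitened_two_sided, atol=1e-4))  # True
```

An optimizer that applies only one of the two factors is not fully whitening in this sense, which is one way to read the post's point that not all of these methods implement both.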
Autoregressive language models learn to compress data by mapping sequences to high-dimensional representations and decoding one token at a time. The quality of compression, as defined by the ability to predict the next token given a prompt, progressively improves (as measured by…
4 replies · 22 reposts · 79 likes
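The cut-off sentence above is the standard compression framing: a model that assigns probability p to the next token can drive an arithmetic coder that spends −log₂ p bits on that token, so mean next-token cross-entropy in bits is the achievable compression rate, and better prediction means shorter codes. A toy calculation (probabilities invented for illustration):

```python
# Bits-per-token from next-token probabilities (toy numbers).
import math

token_probs = [0.5, 0.25, 0.9, 0.05]         # p(x_t | x_<t) for four tokens
bits = [-math.log2(p) for p in token_probs]  # ideal code length per token
print([round(b, 3) for b in bits])           # [1.0, 2.0, 0.152, 4.322]
print(sum(bits) / len(bits))                 # avg bits/token ~= 1.868
```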
LLMs have shown a remarkable ability to “self-refine” and learn from their mistakes via in-context learning. But in robotics, most methods are single-shot. How can we bring inference-time adaptation to robot learning? A 🧵:
10 replies · 18 reposts · 129 likes
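The thread's question has a generic answer shape: a propose-execute-critique loop whose failure history is fed back in context on the next attempt. A sketch of that loop with placeholder callables (the shape of the idea, not the paper's method):

```python
# Generic inference-time self-refinement loop for a robot policy.
def self_refine(task, policy, execute, critique, max_attempts=3):
    history = []  # in-context memory of (plan, feedback) pairs
    plan = result = None
    for _ in range(max_attempts):
        plan = policy(task, history)       # condition on past mistakes
        result = execute(plan)             # run on the robot or a simulator
        feedback = critique(task, result)  # e.g., success detector or VLM
        if feedback.success:
            break
        history.append((plan, feedback))   # next attempt sees this failure
    return plan, result
```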
New work from @aditya_oberai & @seohong_park: instead of 1-step or n-step TD backups, can we "divide and conquer" over the trajectory, backing up finer and finer increments? Improves on the bias of TD(0) and the variance of MC. The principle is old, but getting it to work takes some care!
TD Learning can suffer on long tasks: ↑ deep Bellman recursions → ↓ poor scalability (despite big data) We introduce a new method (TRL) with a "divide-and-conquer" value update, which scales well with long horizons!
6 replies · 27 reposts · 244 likes
TD Learning can suffer on long tasks: ↑ deep Bellman recursions → ↓ poor scalability (despite big data) We introduce a new method (TRL) with a "divide-and-conquer" value update, which scales well with long horizons!
2 replies · 29 reposts · 231 likes
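To make "finer and finer increments" concrete: a discounted return over a segment merges from two half-segment returns, so the recursion depth is log₂T rather than the T chained one-step backups behind deep Bellman recursions. Here is just that arithmetic as a toy (the TRL update itself learns value estimates for the sub-segments, which is where the bias/variance trade-off mentioned above comes in):

```python
# Divide-and-conquer discounted return: depth log2(T) instead of T.
def dc_return(rewards, gamma, lo=0, hi=None):
    if hi is None:
        hi = len(rewards)
    if hi - lo == 1:
        return rewards[lo]
    mid = (lo + hi) // 2
    left = dc_return(rewards, gamma, lo, mid)   # return of first half
    right = dc_return(rewards, gamma, mid, hi)  # return of second half
    return left + gamma ** (mid - lo) * right   # merge the two halves

rewards = [1.0, 0.0, 0.0, 2.0]
print(dc_return(rewards, gamma=0.99))  # 2.94059... == 1 + 0.99**3 * 2
```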
🌍 LLMs can use long chain-of-thought (CoT) to reason in English, but what about other languages? New paper w/ @BerkeleyNLP: We study how scaling, pretraining, post-training & inference affect long CoT across 9 languages. Spoiler: English long CoT ≠ multilingual long CoT 🧵
3 replies · 8 reposts · 22 likes
Our new paper with Sonali Sharma and @RoxanaDaneshjou is out in @npjDigitalMed! We examine how medical safety and disclaimer messages in public LLMs have changed over time when answering patient questions.
Generative AI models are giving fewer medical disclaimers over time. 📉 In 2022, ~26% of AI health answers had a disclaimer. By 2025? <1%. As models get smarter, they’re getting less safe. Patients may take outputs as medical advice. https://t.co/2OYQvKdezT
3 replies · 8 reposts · 17 likes
AI can now see, reason, and segment the Earth. 🌍 Meet LISAt, our #NeurIPS2025 Datasets & Benchmarks paper - the first foundation model that turns language queries into pixel-level satellite segmentations. 🛰️ (1/n) 🔗 https://t.co/ApVZgGF0cU
@NeurIPSConf @berkeley_ai
4 replies · 3 reposts · 29 likes
Can a robot inspect all views of an object? Today @IROS, we present Omni-Scan from @berkeley_ai, a novel method for bimanual robot 360° object scanning & reconstruction using 3D Gaussian Splats. (1/8) 🔗 https://t.co/8emyJfUNk4
5 replies · 12 reposts · 122 likes
🧠 New preprint: How Do LLMs Use Their Depth? We uncover a "Guess-then-Refine" mechanism across layers: early layers predict high-frequency tokens as guesses; later layers refine them as context builds. Paper: https://t.co/5PitHjmJJZ
@neuranna @GopalaSpeech @berkeley_ai
15 replies · 73 reposts · 521 likes
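A standard way to look at depth usage like this is a logit-lens-style probe: decode each layer's residual stream with the model's own unembedding and watch the top-1 guess evolve. A runnable sketch on GPT-2 (a generic probing recipe; the paper's exact analysis may differ):

```python
# Logit-lens probe: read out a next-token guess from every layer of GPT-2.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    out = model(ids, output_hidden_states=True)
    for layer, h in enumerate(out.hidden_states):
        h_last = model.transformer.ln_f(h[:, -1])  # final layer norm
        logits = model.lm_head(h_last)             # shared unembedding
        print(f"layer {layer:2d}: {tok.decode(logits.argmax(-1))!r}")
```

On prompts like this, early layers tend to emit frequent filler tokens while later layers settle on the contextual answer, consistent with the "Guess-then-Refine" pattern described above.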
Catch up on our most recent Community Lecture: “Transmission Versus Truth: What Will It Take to Make an AI as Smart as a 4-Year-Old?” with Alison Gopnik. This was the last of six community lectures for 2025, and all are available to watch on SFI’s YouTube channel. Watch here:
1 reply · 12 reposts · 27 likes
New evaluation results from @AnthropicAI's Claude Sonnet 4.5 system card on our CyberGym benchmark reveal a striking trend: AI cybersecurity capabilities are advancing at unprecedented speed, from ~10% (Claude Sonnet 3.7) to ~30% success rates (Claude Sonnet 4.5) (with single…
1/ 🔥 AI agents are reaching a breakthrough moment in cybersecurity. In our latest work: 🔓 CyberGym: AI agents discovered 15 zero-days in major open-source projects 💰 BountyBench: AI agents solved real-world bug bounty tasks worth tens of thousands of dollars 🤖
7 replies · 15 reposts · 56 likes
Amazing! 10 @BerkeleyEECS @SkyCompLab grad students are Amazon AI PhD Fellows! Congrats! Learn more about our fellows here: https://t.co/zuCGKlmSNe
#AmazonAIFellowship
@BerkeleySky
eecs.berkeley.edu
Today, Amazon announced its new AI PhD Fellowship program, offering two years of funding to over 100 PhD students across nine universities. Ten of these inaugural fellowships have been awarded to...
🎓 Amazon launches AI PhD Fellowship program, providing $68 million over two years to fund PhD students at 9 universities pursuing research in machine learning, computer vision, and natural-language processing. #AmazonAIFellowship
0 replies · 14 reposts · 59 likes
Humans handle dynamic situations easily; what about models? Turns out, they break in three distinct ways: ⛔ Force Stop → Reasoning leakage (won't stop) ⚡️ Speedup → Panic (rushed answers) ❓ Info Updates → Self-doubt (reject updates) 👉 Check out https://t.co/wKrnsMkiFY
5 replies · 20 reposts · 67 likes