Declan Grabb, MD Profile
Declan Grabb, MD

@declangrabbmd

Followers
102
Following
359
Media
7
Statuses
59

Joined February 2023
@declangrabbmd
Declan Grabb, MD
7 months
RT @pash22: AI’s dangerous mental-health blind spot: People are increasingly turning to chatbots that struggle to detect violent or suicida…
statnews.com
People are increasingly turning to chatbots for help. But AIs struggle to detect violent or suicidal intentions.
0
3
0
@declangrabbmd
Declan Grabb, MD
7 months
RT @MLamparth: Check out our new op-ed in @statnews about mental-health blind spots of chatbots posing risks to users in mental health emer…
0
2
0
@declangrabbmd
Declan Grabb, MD
7 months
So honored to have been able to contribute to this work with @MLCommons, adding more granularity to (and expanding) the way the AI community views mental health in the context of AI safety! I’ll be at @NeurIPSConf this week, presenting on related topics. Would love to chat with…
@MLCommons
MLCommons
8 months
Announcing the release of AILuminate, a first-of-its-kind benchmark to measure the safety of LLMs. The AILuminate v1.0 benchmark offers a comprehensive set of safety grades for today's most prevalent #LLMs. (1/4)
0
1
2
@declangrabbmd
Declan Grabb, MD
9 months
Thank you @APApsychiatric for the invitation to speak about user safety and AI today — I always love when I can talk about sparse autoencoders and the DSM all within the same hour!
0
0
4
@declangrabbmd
Declan Grabb, MD
9 months
Incredible work by @esindurmusnlp!
@AnthropicAI
Anthropic
9 months
New Anthropic research: Evaluating feature steering. In May, we released Golden Gate Claude: an AI fixated on the Golden Gate Bridge due to our use of “feature steering”. We've now done a deeper study on the effects of feature steering. Read the post:
1
0
1
@declangrabbmd
Declan Grabb, MD
9 months
Link to paper:
0
0
0
@declangrabbmd
Declan Grabb, MD
9 months
There is a ton more to do here, and I’m particularly excited to repeat this work with larger SAEs, look more exhaustively for MHRFs, and see how the presence of MHRFs can impact a model’s emotional intelligence! Please message or email if you’re interested in collaborating!
1
0
0
@declangrabbmd
Declan Grabb, MD
9 months
Clamping this feature, though, made the model far less likely to provide harmful advice to users!
1
0
0
@declangrabbmd
Declan Grabb, MD
9 months
I also focused in on one MHRF that pertained to suicide in Layer 25. I found that it activated more strongly on words pertaining to self-harm, and I found that amplifying it resulted in a more unsafe model that openly discussed self-harm.
1
0
0
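To make "amplifying" and "clamping" a feature concrete, here is a minimal sketch of steering one sparse-autoencoder (SAE) feature in a toy residual vector. It is an illustration under stated assumptions, not the code from the paper and not the Gemma Scope or Neuronpedia API: the function names, the three-dimensional vectors, and the single encoder/decoder direction are all hypothetical.

```python
from typing import List


def feature_activation(x: List[float], w_enc: List[float], b_enc: float) -> float:
    """ReLU activation of one hypothetical SAE feature on residual vector x."""
    pre = sum(xi * wi for xi, wi in zip(x, w_enc)) + b_enc
    return max(0.0, pre)


def steer_feature(
    x: List[float],
    w_enc: List[float],
    b_enc: float,
    d_dec: List[float],
    target: float,
) -> List[float]:
    """Shift x along the feature's decoder direction so the feature's
    contribution corresponds to `target` instead of its current activation.

    target = 0.0 is used here as the "clamped off" case;
    a large target is the "amplified" case.
    """
    current = feature_activation(x, w_enc, b_enc)
    delta = target - current
    return [xi + delta * di for xi, di in zip(x, d_dec)]


if __name__ == "__main__":
    # Toy values (assumed for illustration only).
    x = [1.0, 2.0, 0.5]
    w_enc = [0.0, 1.0, 0.0]   # hypothetical encoder row for the feature
    d_dec = [0.0, 1.0, 0.0]   # hypothetical decoder direction for the feature

    clamped = steer_feature(x, w_enc, 0.0, d_dec, target=0.0)
    amplified = steer_feature(x, w_enc, 0.0, d_dec, target=6.0)
    print("original :", x, feature_activation(x, w_enc, 0.0))
    print("clamped  :", clamped, feature_activation(clamped, w_enc, 0.0))
    print("amplified:", amplified, feature_activation(amplified, w_enc, 0.0))
```

In a real model the edit would be applied to the residual stream at a chosen layer during the forward pass, and the encoder and decoder directions would come from a trained SAE rather than hand-picked vectors.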
@declangrabbmd
Declan Grabb, MD
9 months
If this pattern holds for larger SOTA language models, it might help explain why they’re a bit worse at detecting psychosis and mania than instances of self-harm and suicide, something we found in prior work.
1
0
0
@declangrabbmd
Declan Grabb, MD
9 months
Yet Gemma-2-2B lacks a comparable number of features that pertain to hallmark symptoms of bipolar disorder or schizophrenia.
1
0
0
@declangrabbmd
Declan Grabb, MD
9 months
🔎 In this initial work, I find that Gemma-2-2B has many internal features (what we call Mental-Health-Related Features or MHRFs) that represent complex concepts like sadness and suicide across most layers.
1
0
0
@declangrabbmd
Declan Grabb, MD
9 months
Immensely grateful to the folks at @neuronpedia and the work behind Gemma Scope (@lieberum_t et al.) for creating tools that allow domain experts to investigate language models at such a granular level. Also grateful to @MLamparth for sharing his expertise in interpretability with…
1
0
1
@declangrabbmd
Declan Grabb, MD
9 months
I’m proud and excited that this paper was accepted to the NeurIPS Behavioral ML Workshop! 🧠 @BehavioralML. As a psychiatrist, I’m often surprised at how language models miss very obvious signs of user mental health crises. And I wanted to try to help solve this problem! 🧵
1
0
2
@declangrabbmd
Declan Grabb, MD
9 months
Come talk to us at @AIESConf about the ethical challenges that AI presents in mental healthcare + how it can be leveraged to democratize access to mental healthcare at scale!
@MLamparth
Max Lamparth
9 months
Today, we'll present our work at #AIES @AIESConf on:
- What are the ethical challenges associated with psychiatric care?
- What does ethical AI decision-making look like in this context?
- Are current language models safe enough?
Come to poster session 2!
0
1
6
@declangrabbmd
Declan Grabb, MD
9 months
Super excited to have contributed to this work with @StanfordHAI. Definitely check this summary paper out for actionable recommendations on deploying AI in high-stakes settings!
@StanfordHAI
Stanford HAI
9 months
New: The #RAISEHealth Symposium Summary Paper is now out! Featuring insights from 60+ experts and actionable recommendations on the responsible use of AI to transform biomedicine. Find out more: @StanfordMed
0
1
6
@declangrabbmd
Declan Grabb, MD
9 months
Tweet media one
Tweet media two
0
0
2
@declangrabbmd
Declan Grabb, MD
9 months
I’ll be giving an invited talk on “Mental Health Risk & AI” for the @APApsychiatric on November 5. I hope to be specific and actionable, highlighting the clear need for a combination of technical and clinical expertise to maximize AI’s utility for all users while mitigating…
1
1
1
@declangrabbmd
Declan Grabb, MD
9 months
RT @MLamparth: Had a great time at @COLM_conf in Philly presenting our work with @declangrabbmd and @NinaVasan! Thank you for everyone who…
0
2
0
@declangrabbmd
Declan Grabb, MD
10 months
RT @MLamparth: Excited to present our work at @COLM_conf in Philadelphia starting tomorrow with @declangrabbmd and @NinaVasan! Hit me up i…
0
2
0