Myra Deng

@myra_deng

Followers
1K
Following
954
Media
10
Statuses
98

aligning models @goodfireAI, prev @stanford and @twosigma

Joined January 2018
@myra_deng
Myra Deng
1 hour
RT @GoodfireAI: Arc Institute trained their foundation model Evo 2 on DNA from all domains of life. What has it learned about the natural w….
0
16
0
@myra_deng
Myra Deng
1 day
RT @GoodfireAI: Adversarial examples - a vulnerability of every AI model, and a “mystery” of deep learning - may simply come from models cr….
0
24
0
@myra_deng
Myra Deng
6 days
RT @GoodfireAI: New research! Post-training often causes weird, unwanted behaviors that are hard to catch before deployment because they on….
0
44
0
@myra_deng
Myra Deng
14 days
Really excited for this! @banburismus_ @Jack_W_Lindsey talk interp with @whoisnnamdi.
@lightspeedvp
Lightspeed
14 days
🗓️ Mark your calendars for August 26 and join us for a #GenSF meetup covering mechanistic interpretability in modern AI models. Our Partner @whoisnnamdi will moderate a fireside chat with leaders from Lightspeed-backed companies with @AnthropicAI’s Researcher @Jack_W_Lindsey
Tweet media one
0
0
16
@myra_deng
Myra Deng
19 days
RT @jack_merullo_: Could we tell if gpt-oss was memorizing its training data? I.e., points where it’s reasoning vs reciting? We took a quic….
0
50
0
@myra_deng
Myra Deng
27 days
how it started. how it's going. @GoodfireAI
Tweet media one
Tweet media two
3
8
145
@myra_deng
Myra Deng
28 days
RT @GoodfireAI: AI already accelerates materials R&D, but understanding what models learn about structure-property relationships could yiel….
0
9
0
@myra_deng
Myra Deng
1 month
RT @zhengdongwang: I wrote some fiction in the style of AI 2027. It combines the parts of AI 2027 and AI as Normal Technology that resonat….
0
8
0
@myra_deng
Myra Deng
1 month
Tweet media one
@ericho_goodfire
Eric Ho
1 month
Just wrote a piece on why I believe interpretability is AI’s most important frontier - we're building the most powerful technology in history, but still can't reliably engineer or understand our models. With rapidly improving model capabilities, interpretability is more urgent,
Tweet media one
0
0
16
@myra_deng
Myra Deng
2 months
RT @banburismus_: we discovered the katy parity feature.
0
4
0
@myra_deng
Myra Deng
2 months
Brutal roast from this UMAP of language model latents
Tweet media one
1
0
22
@myra_deng
Myra Deng
2 months
happy July fourth from me and mine (my Llama SAE features)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
2
52
@myra_deng
Myra Deng
2 months
I knew Claude was from New England.
@erikphoel
Erik Hoel
2 months
Always nice to hear more about Claude's personal life
Tweet media one
0
0
14
@myra_deng
Myra Deng
2 months
Has anyone used any good AI shopping assistants? I’ve tried @shopondaydream, deep research but couldn’t get either to work for me. Maybe this is a sign to stop buying clothes :/.
1
1
10
@myra_deng
Myra Deng
2 months
RT @GoodfireAI: (1/7) New research: how can we understand how an AI model actually works? Our method, SPD, decomposes the *parameters* of n….
0
87
0
@myra_deng
Myra Deng
2 months
RT @leedsharkey: A few months ago, we published . Attribution-based parameter decomposition -- a method for decomposing a network's paramet….
0
22
0
@myra_deng
Myra Deng
3 months
RT @GoodfireAI: New research update! We replicated @AnthropicAI's circuit tracing methods to test if they can recover a known, simple trans….
0
53
0
@myra_deng
Myra Deng
3 months
>be you
>work in HFT
>have existential dread
>see this tweet, wonder if your skills could be better used to make AGI safe
>apply to attend our happy hour, meet the @GoodfireAI team
>build safe AGI
@sama
Sam Altman
4 months
>be you
>work in HFT shaving nanoseconds off latency or extracting bps from models
>have existential dread
>see this tweet, wonder if your skills could be better used making AGI
>apply to attend this party, meet the openai team
>build AGI
5
14
284
@myra_deng
Myra Deng
3 months
do you guys like my coffee table book
Tweet media one
5
0
72