Explainable AI Profile
Explainable AI

@XAI_Research

Followers: 2K · Following: 616 · Media: 14 · Statuses: 1K

Moved to 🦋! Explainable/Interpretable AI researchers and enthusiasts - DM to join the XAI Slack! Twitter and Slack maintained by @NickKroeger1

Joined March 2022
@XAI_Research
Explainable AI
4 years
There's a new XAI Slack! Connect with XAI/IML researchers and enthusiasts from around the world. Discuss interpretability methods, get help on challenging problems, and meet experts in your field! DM to join 🥳
21 replies · 11 reposts · 48 likes
@Suuraj
Suraj Srinivas
11 months
Our Theory of Interpretable AI (https://t.co/e9914pRv7y) will soon celebrate its one-year anniversary! 🥳 As we step into our second year, we'd love to hear from you! What papers would you like to see discussed in our seminar in the future? 📚🔍 @tverven @ML_Theorist
tverven.github.io
The Theory of Interpretable AI Seminar is an international online seminar about the theoretical foundations of interpretable and explainable AI.
1 reply · 4 reposts · 17 likes
@ArchikiPrasad
Archiki Prasad
11 months
🚨 Excited to share: "Learning to Generate Unit Tests for Automated Debugging" 🚨 which introduces ✨UTGen and UTDebug✨ for teaching LLMs to generate unit tests (UTs) and to debug code using the generated tests. UTGen+UTDebug improve LLM-based code debugging by addressing 3 key…
5 replies · 62 reposts · 166 likes
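The thread above is truncated and the paper's code isn't shown, but the core loop it describes (generate unit tests, run them, regenerate the code on failures) can be sketched in a few lines of Python. Everything below is a hypothetical placeholder, including the helper names and the llm callable; it is not the UTGen/UTDebug API.

    # Minimal sketch of an LLM test-then-debug loop in the spirit of
    # UTGen/UTDebug. All names are hypothetical placeholders.

    def generate_unit_tests(llm, code: str, n: int = 5) -> list[str]:
        # Ask the LLM (a str -> str callable) for n candidate unit tests.
        prompt = f"Write one assert-based Python unit test for:\n{code}"
        return [llm(prompt) for _ in range(n)]

    def run_tests(code: str, tests: list[str]) -> list[bool]:
        # Execute each generated test against the code; True means it passed.
        # NOTE: no sandboxing here; a real system must isolate execution.
        results = []
        for test in tests:
            try:
                exec(code + "\n" + test, {})
                results.append(True)
            except Exception:
                results.append(False)
        return results

    def debug_loop(llm, code: str, max_rounds: int = 3) -> str:
        # Regenerate the code until its generated tests pass, or give up.
        for _ in range(max_rounds):
            tests = generate_unit_tests(llm, code)
            passed = run_tests(code, tests)
            if all(passed):
                break
            failing = [t for t, ok in zip(tests, passed) if not ok]
            code = llm("Fix this code so these tests pass:\n"
                       + code + "\n" + "\n".join(failing))
        return code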
@hima_lakkaraju
π™·πš’πš–πšŠ π™»πšŠπš”πš”πšŠπš›πšŠπš“πšž
11 months
Super excited to share our latest preprint that unifies multiple areas within explainable AI that have been evolving somewhat independently: 1. Feature Attribution 2. Data Attribution 3. Model Component Attribution (aka Mechanistic Interpretability) https://t.co/Sr9gvMDxoG
2 replies · 18 reposts · 137 likes
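The preprint itself isn't quoted above, but the first family it unifies, feature attribution, has a simple canonical baseline: gradient-times-input. A minimal PyTorch sketch, assuming model is any differentiable classifier (this is a generic baseline, not the preprint's method):

    import torch

    def grad_times_input(model, x):
        # Attribution per input feature: gradient of the top class score,
        # multiplied elementwise by the input itself.
        x = x.clone().requires_grad_(True)
        score = model(x).max()   # scalar score to differentiate
        score.backward()
        return (x.grad * x).detach()

    # Usage: attributions = grad_times_input(net, example.unsqueeze(0))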
@XAI_Research
Explainable AI
11 months
Reminder: we have moved to 🦋. Stay up to date with the latest XAI research!
1 reply · 1 repost · 12 likes
@XAI_Research
Explainable AI
11 months
We have moved to 🦋 Bluesky! Please follow over there @ XAI-Research https://t.co/PEc4tAu96R
0 replies · 2 reposts · 3 likes
@_cagarwal
Chirag Agarwal
11 months
Exciting opportunity at the intersection of climate science and XAI to work on groundbreaking research in attributing extreme precipitation events with multimodal models. Check out the details and help spread the word! #ClimateAI #Postdoc #UVA #Hiring Job description:
@AntoniosMamala2
Antonios Mamalakis
11 months
Dear Climate and AI community! We are hiring 😀 a postdoc to join @UVAEnvironment at @UVA and work with @_cagarwal and me on using multimodal AI models and explainable AI to attribute extreme precipitation events! Fascinating stuff! Link below. Please RT!
0 replies · 7 reposts · 17 likes
@miv_cvpr2025
Mechanistic Interpretability for Vision @ CVPR2025
11 months
πŸ” Curious about what's really happening inside vision models? Join us at the First Workshop on Mechanistic Interpretability for Vision (MIV) at @CVPR! πŸ“’ Website: https://t.co/Ynpv1osH0t Meet our amazing invited speakers! #CVPR2025 #MIV25 #MechInterp #ComputerVision
0 replies · 13 reposts · 58 likes
@rgilman33
Rudy Gilman
11 months
The later features in DINO-v2 are more abstract and semantically meaningful than I'd expected from the training objectives. This neuron responds only to hugs. Nothing else, just hugs.
9 replies · 64 reposts · 556 likes
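This kind of single-unit probing is easy to replicate. A sketch using DINOv2's published torch.hub entry point; the channel index is an arbitrary stand-in (the actual hug-selective unit isn't identified in the tweet), and the forward_features dict interface is assumed from the DINOv2 repo:

    import torch

    model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
    model.eval()

    CHANNEL = 123                        # hypothetical feature channel to probe
    img = torch.randn(1, 3, 224, 224)    # stand-in for a preprocessed image

    with torch.no_grad():
        # Patch-token features, shape (1, num_patches, feature_dim)
        feats = model.forward_features(img)["x_norm_patchtokens"]
        response = feats[0, :, CHANNEL].max()   # strongest patch activation

    print(f"max activation of channel {CHANNEL}: {response.item():.3f}")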
@apartresearch
Apart Research
11 months
This week's Apart News brings you an *exclusive* interview with interpretability insider @myra_deng of @GoodfireAI & revisits our Sparse Autoencoders Hackathon which featured a memorable talk from @GoogleDeepMind's @NeelNanda5.
1 reply · 4 reposts · 18 likes
@giangnguyen2412
Giang Nguyen
11 months
@dylanjsam Hi Dylan, it reminds me of our paper, where we also train a model (model 2) on the output of another black-box model (model 1). Ultimately, we find that combining the outputs of model 2 and model 1 significantly improves performance. https://t.co/QY7XPpCMM0
openreview.net
Nearest neighbors (NN) are traditionally used to compute final decisions, e.g., in Support Vector Machines or k-NN classifiers, and to provide users with explanations for the model's decision. In...
0 replies · 1 repost · 3 likes
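The linked paper's exact combination rule isn't shown in the tweet; the simplest version of the idea, blending the class probabilities of the black-box model (model 1) and the model trained on its outputs (model 2), looks like this (a generic ensembling sketch, not the paper's method):

    import torch

    def combined_prediction(logits1, logits2, alpha=0.5):
        # Blend the two models' class probabilities; alpha weights model 1.
        p1 = torch.softmax(logits1, dim=-1)
        p2 = torch.softmax(logits2, dim=-1)
        return alpha * p1 + (1 - alpha) * p2

    # Usage: preds = combined_prediction(model1(x), model2(x)).argmax(dim=-1)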
@tverven
Tim van Erven
11 months
In case you missed it: here is the recording of @YishayMansour's talk about the ability of decision trees to approximate concepts: https://t.co/DERjJawP7R For upcoming talks, check out the seminar website:
tverven.github.io
The Theory of Interpretable AI Seminar is an international online seminar about the theoretical foundations of interpretable and explainable AI.
@ML_Theorist
Michal Moshkovitz
11 months
Happening now!
0 replies · 3 reposts · 16 likes
@rohanpaul_ai
Rohan Paul
11 months
LLMs are all circuits and patterns. Nice paper for a long-weekend read: "A Primer on the Inner Workings of Transformer-based Language Models" 📌 Provides a concise intro focusing on the generative decoder-only architecture. 📌 Introduces the Transformer layer components…
4 replies · 49 reposts · 261 likes
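For readers new to the components such a primer covers, here is a textbook pre-norm decoder-only Transformer block in PyTorch: causal self-attention and an MLP, each reading from and writing back to the residual stream. A generic sketch, not the paper's code:

    import torch
    import torch.nn as nn

    class DecoderBlock(nn.Module):
        def __init__(self, d_model=256, n_heads=4):
            super().__init__()
            self.ln1 = nn.LayerNorm(d_model)
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.ln2 = nn.LayerNorm(d_model)
            self.mlp = nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )

        def forward(self, x):                 # x: (batch, seq, d_model)
            h = self.ln1(x)
            seq_len = x.size(1)
            # Boolean mask: True above the diagonal blocks attention to the future.
            causal = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), 1)
            attn_out, _ = self.attn(h, h, h, attn_mask=causal)
            x = x + attn_out                  # attention writes to the residual stream
            return x + self.mlp(self.ln2(x))  # MLP writes to the residual stream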
@farairesearch
FAR.AI
11 months
@_cagarwal Follow us for AI safety insights: https://t.co/hfntPPogY0 And watch the full video.
0 replies · 5 reposts · 17 likes
@ML_Theorist
Michal Moshkovitz
11 months
This Thursday (in 3 days), @YishayMansour will discuss interpretable approximations: learning with interpretable models. Is it the same as regular learning? Attend the lecture to find out! 💻 Website: https://t.co/MPJzLcxNfI @Suuraj @tverven
tverven.github.io
The Theory of Interpretable AI Seminar is an international online seminar about the theoretical foundations of interpretable and explainable AI.
@ML_Theorist
Michal Moshkovitz
1 year
The Theory of Interpretable AI seminar is back after the holiday season! 🎅🤶 Our next talk is next Thursday by Yishay Mansour, who will talk about interpretable approximations. 💻 Website: https://t.co/MPJzLcxNfI ⏰ Date: 16 Jan @Suuraj @tverven @YishayMansour
0 replies · 3 reposts · 17 likes
@GoodfireAI
Goodfire
11 months
We're open-sourcing Sparse Autoencoders (SAEs) for Llama 3.3 70B and Llama 3.1 8B! These are, to the best of our knowledge, the first open-source SAEs for models at this scale and capability level.
11 replies · 120 reposts · 712 likes
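The released architecture isn't detailed in the announcement, but the usual SAE recipe it refers to is an overcomplete linear encoder with ReLU, a linear decoder, and a reconstruction-plus-L1 objective on model activations. A minimal PyTorch sketch under those assumptions (not Goodfire's actual code):

    import torch
    import torch.nn as nn

    class SparseAutoencoder(nn.Module):
        def __init__(self, d_model: int, d_hidden: int):
            super().__init__()
            self.enc = nn.Linear(d_model, d_hidden)   # overcomplete: d_hidden >> d_model
            self.dec = nn.Linear(d_hidden, d_model)

        def forward(self, x):
            z = torch.relu(self.enc(x))   # sparse, nonnegative feature activations
            return self.dec(z), z

    def sae_loss(x, x_hat, z, l1_coef=1e-3):
        # Reconstruction error plus sparsity penalty on the features.
        return ((x - x_hat) ** 2).mean() + l1_coef * z.abs().mean()

    # Usage: x_hat, z = sae(resid_acts); loss = sae_loss(resid_acts, x_hat, z)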
@saprmarks
Samuel Marks
11 months
What can AI researchers do *today* that AI developers will find useful for ensuring the safety of future advanced AI systems? To ring in the new year, the Anthropic Alignment Science team is sharing some thoughts on research directions we think are important.
10 replies · 66 reposts · 325 likes
@NielKlug
Ercong Nie @ EMNLP
1 year
ACL Time @ Bangkok 🇹🇭 Our GNNavi work will be presented in the poster session at 12:30 on Aug. 14 (Wed.). Feel free to drop by and exchange with us! Looking forward to talking with people, especially those interested in multilingual, low-resource, and LLM interpretability 🤗
0 replies · 7 reposts · 29 likes