BlackboxNLP

@BlackboxNLP

Followers: 754 · Following: 53 · Media: 50 · Statuses: 107

The leading workshop on analysing and interpreting neural networks for NLP. Co-located with EMNLP 2025 in Suzhou, China.

EMNLP 2025
Joined May 2023
@vernadankers
Verna Dankers @ EMNLP25
2 hours
One of the first papers I contributed to was published at #BlackboxNLP 📃, what an honor to, 6 years later, give a keynote at this wonderful venue 🧡 Thanks for having me! #EMNLP2025
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Nicolò & Mingyang: Can we understand which circuits emerge in small models and reasoning-tuned systems, and how do they compare with default systems? Are there methods that generalize better across all tasks?
0
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Q: What's next for interpretability benchmarks? @MichalGolov: People sitting together and planning how to extend tests to multimodal, diverse contexts. @michaelwhanna: For circuit finding, integrating sparse features circuits could help us better understand our models.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Nicolò: Starting to hack around on notebooks and public libraries can be very helpful to gain early intuitions about what's promising.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @michaelwhanna: Don't try to read everything. Find Qs you really care about, and go a level deeper to answer meaningful questions.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Q: How would one go about approaching interpretability research these days? @MichalGolov: "When things don't work out of the box, it's a sign to double down and find out why. Negative results are important!"
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @dana_arad4: As deep learning research converges on similar architectures for different modalities, it will be interesting to determine which interpretability method will remain useful across various models and tasks.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @michaelwhanna, Nicolò & Mingyang: Counterfactuals in minimal settings can be helpful, but they do not capture the whole story. Extending current methods to long contexts, and finding practical applications in safety-related areas are exciting challenges ahead.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @MichalGolov: Mechanistic interpretability has heavily focused on toy tasks and text-only models. The next step is scaling to more challenging tasks involving real-world reasoning.
1
0
0
@BlackboxNLP
BlackboxNLP
3 hours
Our panel "Evaluating Interpretability Methods: Challenges and Future Directions", moderated by @dana_arad4, has just started! Come hear the takes of @michaelwhanna, @MichalGolov, Nicolò Brunello, and @mingyang2666!
1
5
14
@BlackboxNLP
BlackboxNLP
3 hours
Our last oral presentation is starting now, by Kentaro Ozeki, presenting "Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives"
0
1
4
@BlackboxNLP
BlackboxNLP
4 hours
Our second keynote of the day has just started! @vernadankers is now presenting "Memorization: Myth or Mystery?"
1
4
20
@BlackboxNLP
BlackboxNLP
7 hours
Circuit-tracer supports many models and has low memory requirements - check it out!
0
0
1
@BlackboxNLP
BlackboxNLP
7 hours
Computing attribution graphs using transcoders can reveal interesting insights, and circuit-tracer makes this easy.
1
0
3
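The tweet doesn't show the library calls themselves, so here is a minimal, self-contained sketch of the idea behind transcoder-based attribution in plain PyTorch, not circuit-tracer's actual API: a transcoder replaces a block with a sparse feature layer, and each feature's contribution to an output direction is scored as activation times gradient. All module names, shapes, and the probe direction below are hypothetical.

```python
# Hypothetical sketch of transcoder-based attribution (NOT the circuit-tracer API).
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, n_features = 16, 64


class Transcoder(nn.Module):
    """Stand-in transcoder: encoder -> sparse ReLU features -> decoder."""

    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, x: torch.Tensor):
        feats = torch.relu(self.encoder(x))  # sparse feature activations
        return self.decoder(feats), feats


transcoder = Transcoder(d_model, n_features)

# Pretend residual-stream input for one token position, and a pretend
# unembedding direction ("probe") whose logit we want to explain.
resid = torch.randn(d_model)
probe = torch.randn(d_model)

out, feats = transcoder(resid)
logit = out @ probe

# Attribution of each feature to the logit: activation times gradient.
(grads,) = torch.autograd.grad(logit, feats)
attributions = (feats * grads).detach()

top = attributions.abs().topk(5)
print("Top contributing features:", top.indices.tolist())
print("Attribution scores:", [round(v, 3) for v in attributions[top.indices].tolist()])
```

Roughly speaking, an attribution graph extends this picture by treating features at every layer and token position as nodes and scores of this kind as edge weights.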
@BlackboxNLP
BlackboxNLP
7 hours
Looking forward to seeing their future work!
0
1
2
@BlackboxNLP
BlackboxNLP
7 hours
By analyzing language clusters, they show that middle layers are more prone to cross-lingual overlap than early and late layers
1
1
2
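As a rough illustration of what per-layer cross-lingual overlap could look like in practice (not the paper's actual procedure; all data and names below are random stand-ins), one can compare per-language centroids of hidden states layer by layer:

```python
# Hypothetical sketch: cross-lingual overlap per layer as the cosine similarity
# between per-language centroids of hidden states. Data are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_layers, n_sents, d = 12, 100, 64

# Pretend hidden states: hidden[lang] has shape (n_layers, n_sents, d).
hidden = {lang: rng.normal(size=(n_layers, n_sents, d)) for lang in ["en", "zh", "de"]}


def overlap(a_states: np.ndarray, b_states: np.ndarray) -> float:
    """Cosine similarity between the mean hidden states of two languages."""
    a, b = a_states.mean(axis=0), b_states.mean(axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


langs = list(hidden)
for layer in range(n_layers):
    scores = [
        overlap(hidden[l1][layer], hidden[l2][layer])
        for i, l1 in enumerate(langs)
        for l2 in langs[i + 1:]
    ]
    print(f"layer {layer:2d}  mean cross-lingual overlap: {np.mean(scores):+.3f}")
```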
@gsarti_
Gabriele Sarti @ EMNLP 🇨🇳
7 hours
Follow @BlackboxNLP for the live tweeting of the event!
@BlackboxNLP
BlackboxNLP
8 hours
Word cloud for this year's submissions! Excited to see so many interesting topics, and the growing interest in reasoning
0
4
11
@BlackboxNLP
BlackboxNLP
7 hours
Next up: "Circuit-Tracer: A New Library for Finding Feature Circuits", presented by @michaelwhanna
1
3
16
@BlackboxNLP
BlackboxNLP
7 hours
Nadav Shani is giving the first oral presentation of the day: Language Dominance in Multilingual Large Language Models
1
3
9
@BlackboxNLP
BlackboxNLP
8 hours
Quanshi Zhang is giving the first keynote of the day: Can Neural Network Interpretability Be the Key to Breaking Through Scaling Law Limitations in Deep Learning?
0
4
14