BlackboxNLP

@BlackboxNLP

Followers: 754 · Following: 53 · Media: 50 · Statuses: 107

The leading workshop on analysing and interpreting neural networks for NLP. Co-located with EMNLP 2025 in Suzhou, China.

EMNLP 2025
Joined May 2023
@vernadankers
Verna Dankers @ EMNLP25
2 hours
One of the first papers I contributed to was published at #BlackboxNLP 📃, what an honor to, 6 years later, give a keynote at this wonderful venue 🧡 Thanks for having me! #EMNLP2025
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Nicolò & Mingyang: Can we understand which circuits emerge in small models and reasoning-tuned systems, and how do they compare with default systems? Are there methods that generalize better across all tasks?
0
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Q: What's next for interpretability benchmarks? @MichalGolov: People sitting together and planning how to extend tests to multimodal, diverse contexts. @michaelwhanna: For circuit finding, integrating sparse features circuits could help us better understand our models.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Nicolò: Starting to hack around on notebooks and public libraries can be very helpful to gain early intuitions about what's promising.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @michaelwhanna: Don't try to read everything. Find Qs you really care about, and go a level deeper to answer meaningful questions.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 Q: How would one go about approaching interpretability research these days? @MichalGolov: "When things don't work out of the box, it's a sign to double down and find out why. Negative results are important!"
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @dana_arad4: As deep learning research converges on similar architectures for different modalities, it will be interesting to determine which interpretability method will remain useful across various models and tasks.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @michaelwhanna, Nicolò & Mingyang: Counterfactuals in minimal settings can be helpful, but they do not capture the whole story. Extending current methods to long contexts, and finding practical applications in safety-related areas are exciting challenges ahead.
1
0
0
@BlackboxNLP
BlackboxNLP
2 hours
@dana_arad4 @michaelwhanna @MichalGolov @mingyang2666 @MichalGolov: Mechanistic interpretability has heavily focused on toy tasks and text-only models. The next step is scaling to more challenging tasks involving real-world reasoning.
1
0
0
@BlackboxNLP
BlackboxNLP
3 hours
Our panel "Evaluating Interpretability Methods: Challenges and Future Directions", moderated by @dana_arad4, has just started! Come hear the takes of @michaelwhanna, @MichalGolov, Nicolò Brunello, and @mingyang2666!
1
5
14
@BlackboxNLP
BlackboxNLP
3 hours
Our last oral presentation is starting now, by Kentaro Ozeki, presenting "Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives"
0
1
4
@BlackboxNLP
BlackboxNLP
4 hours
Our second keynote of the day has just started! @vernadankers is now presenting "Memorization: Myth or Mystery?"
1
4
20
@BlackboxNLP
BlackboxNLP
7 hours
Circuit-tracer supports many models and has low memory requirements - check it out!
0
0
1
@BlackboxNLP
BlackboxNLP
7 hours
Computing attribution graphs using transcoders can reveal interesting insights, and circuit-tracer makes this easy.
1
0
3
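The tweet doesn't show the library calls themselves, so here is a minimal, self-contained sketch of the idea behind transcoder-based attribution in plain PyTorch, not circuit-tracer's actual API: a transcoder replaces a block with a sparse feature layer, and each feature's contribution to an output direction is scored as activation times gradient. All module names, shapes, and the probe direction below are hypothetical.

```python
# Hypothetical sketch of transcoder-based attribution (NOT the circuit-tracer API).
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, n_features = 16, 64


class Transcoder(nn.Module):
    """Stand-in transcoder: encoder -> sparse ReLU features -> decoder."""

    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        self.decoder = nn.Linear(n_features, d_model)

    def forward(self, x: torch.Tensor):
        feats = torch.relu(self.encoder(x))  # sparse feature activations
        return self.decoder(feats), feats


transcoder = Transcoder(d_model, n_features)

# Pretend residual-stream input for one token position, and a pretend
# unembedding direction ("probe") whose logit we want to explain.
resid = torch.randn(d_model)
probe = torch.randn(d_model)

out, feats = transcoder(resid)
logit = out @ probe

# Attribution of each feature to the logit: activation times gradient.
(grads,) = torch.autograd.grad(logit, feats)
attributions = (feats * grads).detach()

top = attributions.abs().topk(5)
print("Top contributing features:", top.indices.tolist())
print("Attribution scores:", [round(v, 3) for v in attributions[top.indices].tolist()])
```

Roughly speaking, an attribution graph extends this picture by treating features at every layer and token position as nodes and scores of this kind as edge weights.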
@BlackboxNLP
BlackboxNLP
7 hours
Looking forward to seeing their future work!
0
1
2
@BlackboxNLP
BlackboxNLP
7 hours
By analyzing language clusters, they show that middle layers are more prone to cross-lingual overlap than early and late layers
1
1
2
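As a rough illustration of what per-layer cross-lingual overlap could look like in practice (not the paper's actual procedure; all data and names below are random stand-ins), one can compare per-language centroids of hidden states layer by layer:

```python
# Hypothetical sketch: cross-lingual overlap per layer as the cosine similarity
# between per-language centroids of hidden states. Data are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_layers, n_sents, d = 12, 100, 64

# Pretend hidden states: hidden[lang] has shape (n_layers, n_sents, d).
hidden = {lang: rng.normal(size=(n_layers, n_sents, d)) for lang in ["en", "zh", "de"]}


def overlap(a_states: np.ndarray, b_states: np.ndarray) -> float:
    """Cosine similarity between the mean hidden states of two languages."""
    a, b = a_states.mean(axis=0), b_states.mean(axis=0)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


langs = list(hidden)
for layer in range(n_layers):
    scores = [
        overlap(hidden[l1][layer], hidden[l2][layer])
        for i, l1 in enumerate(langs)
        for l2 in langs[i + 1:]
    ]
    print(f"layer {layer:2d}  mean cross-lingual overlap: {np.mean(scores):+.3f}")
```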
@gsarti_
Gabriele Sarti @ EMNLP 🇨🇳
7 hours
Follow @BlackboxNLP for the live tweeting of the event!
@BlackboxNLP
BlackboxNLP
8 hours
Word cloud for this year's submissions! Excited to see so many interesting topics, and the growing interest in reasoning
0
4
11
@BlackboxNLP
BlackboxNLP
7 hours
Next up: "Circuit-Tracer: A New Library for Finding Feature Circuits", presented by @michaelwhanna
1
3
16
@BlackboxNLP
BlackboxNLP
7 hours
Nadav Shani is giving the first oral presentation of the day: Language Dominance in Multilingual Large Language Models
1
3
9
@BlackboxNLP
BlackboxNLP
8 hours
Quanshi Zhang is giving the first keynote of the day: Can Neural Network Interpretability Be the Key to Breaking Through Scaling Law Limitations in Deep Learning?
0
4
14