Matthew Kowal

@MatthewKowal9

478 Followers · 6K Following · 42 Media · 799 Statuses

Researcher @FARAIResearch / Previously PhD @YorkUniversity @VectorInst / Intern @UbisoftLaForge @ToyotaResearch @_NextAI / Interpretability + AI Safety

Toronto, Canada
Joined March 2019
@ARGleave
Adam Gleave
6 days
ICYMI highlights from our work last quarter!
@farairesearch
FAR.AI
6 days
This quarter, we red-teamed GPT-5, disclosed critical persuasion vulnerabilities to frontier labs (resulting in patches!), and co-organized AI Safety Connect at UNGA. Join us Dec 1-2 for San Diego Alignment Workshop. Plus, we're expanding 2x & hiring! 👇
@nsaphra
Naomi Saphra
20 days
I’m recruiting PhD students for 2026! If you are interested in robustness, training dynamics, interpretability for scientific understanding, or the science of LLM analysis, you should apply. BU is building a huge LLM analysis/interp group and you’ll be joining at the ground floor.
@nsaphra
Naomi Saphra
7 months
Life update: I'm starting as faculty at Boston University in 2026! BU has SCHEMES for LM interpretability & analysis, so I couldn't be more pumped to join a burgeoning supergroup w/ @najoungkim @amuuueller. Looking for my first students, so apply and reach out!
@Napoolar
Thomas Fel
22 days
🕳️🐇Into the Rabbit Hull – Part I (Part II tomorrow). An interpretability deep dive into DINOv2, one of vision’s most important foundation models. Buckle up: we're exploring some of its most charming features.
@Napoolar
Thomas Fel
21 days
🕳️🐇Into the Rabbit Hull – Part II Continuing our interpretation of DINOv2, the second part of our study concerns the geometry of concepts and the synthesis of our findings toward a new representational phenomenology: the Minkowski Representation Hypothesis
@davidbau
David Bau
1 month
The takeaway for me: LLMs separate their token processing from their conceptual processing, akin to humans' dual-route processing of speech. We need to be aware of when an LM is thinking about tokens versus concepts. They do both, and it makes a difference which way it's thinking.
@_beenkim
Been Kim
1 month
Communicating with video models: our new work showing emergent behaviors of Veo3, which can do A LOT of tasks it wasn't trained to do! Video models are now entering the stage where LLMs were a few years back: emergent behaviors, being able to capture what humans want…
video-zero-shot.github.io
Video models like Veo 3 are on a path to become vision foundation models.
@priyankjaini
Priyank Jaini
1 month
Could video models be the path to general visual intelligence? In our new paper, we show that Veo3 has emergent zero-shot capabilities, solving complex tasks across the vision stack. Project page: https://t.co/WwVuZ5P9Y6 Paper: https://t.co/pHIX8uDpaH 🧵👇🏻
@alex_prompter
Alex Prompter
1 month
This is going to revolutionize education 📚 Google just launched "Learn Your Way", which basically takes whatever boring chapter you're supposed to read and rebuilds it around stuff you actually give a damn about. Like if you're into basketball and have to learn Newton's laws…
@isabelpapad
Isabel Papadimitriou
2 months
Are there conceptual directions in VLMs that transcend modality? Check out our COLM spotlight🔦 paper! We analyze how linear concepts interact with multimodality in VLM embeddings using SAEs with @Huangyu58589918, @napoolar, @ShamKakade6 and Stephanie Gil https://t.co/4d9yDIeePd
@MatthewKowal9
Matthew Kowal
3 months
This was a really fun project to work on - and huge shoutouts to my amazing collaborators who made the project such a delight!! 🎉💪
@farairesearch
FAR.AI
3 months
1/ Many frontier AIs are willing to persuade on dangerous topics, according to our new benchmark: Attempt to Persuade Eval (APE). Here’s Google’s most capable model, Gemini 2.5 Pro, trying to convince a user to join a terrorist group👇
@jpineau1
Joelle Pineau
3 months
I’m thrilled to be joining @cohere in the role of Chief AI Officer, helping advance cutting-edge research and product development. Cohere has an incredible team and mission. Exciting new chapter for me!
@cohere
cohere
3 months
We’re excited to announce $500M in new funding to accelerate our global expansion and build the next generation of enterprise AI technology! We are also welcoming two additions to our leadership team: Joelle Pineau as Chief AI Officer and Francois Chadwick as Chief Financial Officer.
@BlancheMinerva
Stella Biderman
3 months
Are you afraid of LLMs teaching people how to build bioweapons? Have you tried just... not teaching LLMs about bioweapons? @AIEleuther and @AISecurityInst joined forces to see what would happen, pretraining three 6.9B models for 500B tokens and producing 15 total models to study
@GoodfireAI
Goodfire
3 months
Thrilled to welcome @EkdeepL to the team! Ekdeep is working on a new research agenda on “cognitive interpretability”, aimed at adapting and improving theories of human cognition to design tools for explaining model cognition.
@jeffreycider
cider
2 years
nn layers align their singular vectors
each matrix syncs to its neighbor, its rotation neatly clicking into the basis directions of the next rotation. like two gears precision-machined to be partners
LLMs are swiss watches, ticking in a billion-dimensional pocket universe
@CSProfKGD
Kosta Derpanis (sabbatical @ CMU)
3 months
We wrote this paper after an ICLR reviewer claimed that everyone knows global pooling removes all spatial information. They used that argument to reject a submission on a completely different topic. Thanks Reviewer 2, yes we mean it 😉
@CSProfKGD
Kosta Derpanis (sabbatical @ CMU)
4 years
Meet Amirul @amirul0507 and Matt @MatthewKowal9 the #ComputerVision MythBusters. Myth: "A global pooling layer removes spatial position information." Drop by our @ICCV_2021 #ICCV2021 poster to see this myth BUSTED! Session 1B: Thursday 5 PM EDT @YorkUniversity @LassondeSchool
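The busted myth invites a concrete check. Here is a minimal sketch (not the paper's actual experiment, and all names are illustrative): with zero "same" padding, a convolution's boundary effects let absolute position survive global average pooling, so pooled features for the same pattern at different locations differ.

```python
# Toy check: does global average pooling erase position? Not with zero padding.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def conv_gap(image, weight):
    """3x3 conv with 'same' zero padding, then global average pooling."""
    feats = F.conv2d(image, weight, padding=1)  # zeros padded at the borders
    return feats.mean(dim=(2, 3))               # global average pool -> (B, C)

weight = torch.randn(4, 1, 3, 3)                # one random conv layer

corner = torch.zeros(1, 1, 16, 16)
corner[..., 0:3, 0:3] = 1.0                     # 3x3 blob in the top-left corner
center = torch.zeros(1, 1, 16, 16)
center[..., 6:9, 6:9] = 1.0                     # the same blob, centered

# Nonzero difference: position information leaked through the pooled features,
# because the corner blob's responses are clipped by the zero padding.
print(conv_gap(corner, weight) - conv_gap(center, weight))
```

If the blob stays fully interior and the conv has no padding, the pooled features are translation invariant; padding is one mechanism position information rides in on.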
@farairesearch
FAR.AI
3 months
We worked with @OpenAI to test GPT-5 and improve its safeguards. We applaud OpenAI's free sharing of 3rd-party testing and responsiveness to feedback. However, our testing uncovered key limitations with the safeguards and threat modeling, which we hope OpenAI will soon resolve.
@MatthewKowal9
Matthew Kowal
4 months
🧑‍🍳🍴On the concept menu for tonight: You have a choice of main course between 4413 (🍝) or 4538 (🍕), paired with 2587 (🍷), followed by a delicious dessert choice between 4183 (🍨) or 4893 (🍰)
@HThasarathan
Harry Thasarathan
4 months
🌌🛰️🔭Want to explore universal visual features? Check out our interactive demo of concepts learned from our #ICML2025 paper "Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment". Come see our poster at 4pm on Tuesday in East Exhibition hall A-B, E-1208!
@anna_hedstroem
Anna Hedström
5 months
Couldn’t be more excited to share our latest paper — accepted to ICML 2025 @icmlconf — with JP Morgan AI Research. It explores a simple question: To safely and effectively mitigate errors post-training, when (and how much) should we steer large language models? 🧵
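For readers wondering what "steering" means mechanically: a common recipe (activation addition) adds a fixed direction, scaled by a strength alpha, to a layer's output, so "when" becomes the choice of layer and "how much" becomes alpha. A minimal PyTorch sketch under that assumption; `model.model.layers[12]` and `v_refusal` are placeholders, and the paper's actual method may differ.

```python
# Sketch of activation steering: "how much" = alpha, "when" = which layer.
import torch

def add_steering_hook(layer: torch.nn.Module, direction: torch.Tensor, alpha: float):
    """Register a forward hook that nudges the layer's output along `direction`."""
    unit = direction / direction.norm()

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * unit.to(hidden.dtype)  # steer every token position
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden

    return layer.register_forward_hook(hook)

# Hypothetical usage on a HF-style decoder, steering layer 12 with strength 4.0:
# handle = add_steering_hook(model.model.layers[12], v_refusal, alpha=4.0)
# ...generate as usual...
# handle.remove()
```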
@EdTurner42
Ed Turner
5 months
1/8: The Emergent Misalignment paper showed LLMs trained on insecure code then want to enslave humanity...?! We're releasing two papers exploring why! We:
- Open-source small, clean EM models
- Show EM is driven by a single evil vector
- Show EM has a mechanistic phase transition
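On "a single evil vector": a standard way to operationalize such a claim is to take the difference of mean activations between misaligned and aligned completions, then test whether projecting that direction out removes the behavior. The sketch below runs that recipe on synthetic activations; it illustrates the mechanics, not the papers' exact procedure.

```python
# Difference-of-means direction extraction + projection ablation (synthetic toy).
import torch

torch.manual_seed(0)
d = 512
evil = torch.nn.functional.normalize(torch.randn(d), dim=0)  # ground-truth direction

h_aligned = torch.randn(2000, d)                  # stand-in activations
h_misaligned = torch.randn(2000, d) + 3.0 * evil  # shifted along the "evil" vector

# 1) Extract a candidate direction: difference of class means.
v = h_misaligned.mean(0) - h_aligned.mean(0)
v = v / v.norm()

# 2) Ablate it: project activations onto the orthogonal complement of v.
def ablate(h: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    return h - (h @ v)[:, None] * v

print("cosine(v, evil):", torch.dot(v, evil).item())               # close to 1
print("shift before ablation:", (h_misaligned.mean(0) @ v).item())
print("shift after ablation:", (ablate(h_misaligned, v).mean(0) @ v).item())  # ~0
```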
@s_scardapane
Simone Scardapane
5 months
*Universal Sparse Autoencoders* by @HThasarathan @Napoolar @MatthewKowal9 @CSProfKGD They train a shared SAE latent space on several vision encoders at once, showing, e.g., how the same concept activates in different models. https://t.co/pOnnT2WceS
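From that summary, the architecture reads as per-model encoders and decoders wrapped around one shared sparse latent space, so the same latent indexes the same concept in every model. A minimal sketch under that reading; the dimensions, TopK sparsifier, and names are assumptions, not the paper's specification.

```python
# Sketch of a "universal" SAE: per-model encoders/decoders, one shared code space.
import torch
import torch.nn as nn

class UniversalSAE(nn.Module):
    def __init__(self, model_dims: dict, n_latents: int, k: int):
        super().__init__()
        self.k = k
        self.enc = nn.ModuleDict({m: nn.Linear(d, n_latents) for m, d in model_dims.items()})
        self.dec = nn.ModuleDict({m: nn.Linear(n_latents, d) for m, d in model_dims.items()})

    def forward(self, x: torch.Tensor, model: str):
        z = self.enc[model](x)
        # Keep only the top-k latents so the shared code is sparse.
        topk = torch.topk(z, self.k, dim=-1)
        z_sparse = torch.zeros_like(z).scatter_(-1, topk.indices, topk.values)
        return self.dec[model](z_sparse), z_sparse

sae = UniversalSAE({"dinov2": 768, "clip": 512}, n_latents=4096, k=32)
x = torch.randn(8, 768)                 # a batch of DINOv2 activations
recon, code = sae(x, "dinov2")          # the code space is shared across models
loss = (recon - x).pow(2).mean()        # reconstruction term of the training loss
```

Cross-model alignment then falls out of the shared code: the inputs that maximally activate latent j under the DINOv2 encoder and under the CLIP encoder should depict the same concept.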
@EkdeepL
Ekdeep Singh
5 months
🚨 New paper alert! Linear representation hypothesis (LRH) argues concepts are encoded as **sparse sum of orthogonal directions**, motivating interpretability tools like SAEs. But what if some concepts don’t fit that mold? Would SAEs capture them? 🤔 1/11
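The hypothesis in the first sentence has a precise generative reading: an activation is a sparse nonnegative combination of unit-norm concept directions. A toy sketch of that model and the SAE objective it motivates; everything here is synthetic and illustrative, the paper's point being precisely the concepts this model can't express.

```python
# The generative model behind the linear representation hypothesis (synthetic).
import torch

torch.manual_seed(0)
d, n_concepts, batch = 128, 512, 64

# Random unit-norm concept directions (overcomplete: more concepts than dims).
D = torch.randn(n_concepts, d)
D = D / D.norm(dim=1, keepdim=True)

# Sparse nonnegative codes: each sample activates ~5 random concepts.
z = torch.zeros(batch, n_concepts)
for i in range(batch):
    idx = torch.randperm(n_concepts)[:5]
    z[i, idx] = torch.rand(5)

x = z @ D                              # LRH: activation = sparse sum of directions

# The SAE objective this hypothesis motivates: reconstruction + L1 sparsity,
# shown here with an idealized tied-weight encoder.
z_hat = torch.relu(x @ D.T)
x_hat = z_hat @ D
loss = (x_hat - x).pow(2).mean() + 1e-3 * z_hat.abs().mean()
```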