Explore tweets tagged as #TransCoders
Google DeepMind is releasing Gemma Scope 2: SAEs and transcoders on every layer of every Gemma 3 model, 270M-27B, base & chat. We hope this enables deep dives into complex model behavior, for more ambitious open source safety & interpretability work!
Circuit tracing with the Qwen3 Cross-layer Transcoders: A circuit-level analysis of how Qwen3 produces specific predictions https://t.co/qNtdlIjXrR
Today marks the first-ever release of Cross-Layer Transcoders for Qwen3. BluelightAI has trained CLTs for Qwen3-0.6B and 1.7B, creating an explorable set of interpretable features that capture how Qwen3 represents concepts and transforms information across its layers. The Qwen3
Sparse autoencoders (SAEs) have taken the interpretability world by storm over the past year or so. But can they be beaten? Yes! We introduce skip transcoders, and find they are a Pareto improvement over SAEs: better interpretability, and better fidelity to the model 🧵
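The skip-transcoder idea mentioned above can be sketched in a few lines. This is a toy numpy illustration, not the authors' implementation: all names, shapes, and initializations are assumptions, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden = 16, 64  # illustrative sizes, far smaller than a real LLM

# Encoder/decoder weights plus an affine skip path (the "skip" in skip transcoder).
W_enc = rng.normal(0, 0.1, (d_hidden, d_model))
b_enc = np.zeros(d_hidden)
W_dec = rng.normal(0, 0.1, (d_model, d_hidden))
W_skip = np.eye(d_model)  # skip path, here initialized as the identity
b_dec = np.zeros(d_model)

def skip_transcoder(x):
    """Map an MLP *input* x to a prediction of the MLP's *output*.

    Unlike an SAE (which reconstructs x itself), a transcoder imitates a
    sublayer's input->output map; the linear skip term gives a shortcut so
    the sparse features only need to model the nonlinear part.
    """
    feats = np.maximum(W_enc @ x + b_enc, 0.0)  # sparse ReLU feature activations
    return W_dec @ feats + W_skip @ x + b_dec, feats

x = rng.normal(size=d_model)
y_hat, feats = skip_transcoder(x)
print(y_hat.shape, feats.shape)  # (16,) (64,)
```

In a real setup the weights would be trained to match the MLP's actual outputs with an added sparsity penalty on `feats`.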
I'm pretty proud of this: We trained cross-layer transcoders for Qwen3 and built a dashboard for exploring the features using TDA-based graph visualizations.
I gave a talk on my interpretability research building on top of Anthropic's cross-layer transcoders, and the pics came out kinda cute. thx @nehadesaraju
Facebook and Twitter's transcoders don't handle color space very well. Anyone happen to have some contacts there?
Missed this one 💤💤 SAEs focus on compressing activations and prioritize reconstruction over functional analysis, which means they cannot fully capture the functional role of sublayers like the MLP. To address these limitations of SAEs, transcoders and skip transcoders were introduced.
Anthropic developed a method to visualize how large language models generate outputs, revealing that Claude 3.5 Haiku appears to perform implicit reasoning, even without explicit instruction to do so. By replacing fully connected layers with cross-layer transcoders, the team
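The "cross-layer" part of the method described above can be illustrated with a toy sketch: features encoded at one layer are allowed to decode into the MLP outputs of that layer *and* all later layers. This is a hedged numpy mock-up with made-up sizes and random weights, not Anthropic's code.

```python
import numpy as np

rng = np.random.default_rng(1)
n_layers, d_model, d_feat = 4, 8, 32  # toy sizes, purely illustrative

# One encoder per layer; each layer's features get a decoder for every
# layer at or after it, so a feature can write into later MLP outputs too.
W_enc = rng.normal(0, 0.1, (n_layers, d_feat, d_model))
W_dec = rng.normal(0, 0.1, (n_layers, n_layers, d_model, d_feat))

def clt_predict_mlp_outputs(resid):
    """resid: (n_layers, d_model) residual-stream inputs to each MLP.

    Layer l's predicted MLP output sums contributions from features
    encoded at every layer k <= l (the cross-layer connections).
    """
    feats = np.maximum(np.einsum('lfd,ld->lf', W_enc, resid), 0.0)
    out = np.zeros((n_layers, d_model))
    for l in range(n_layers):
        for k in range(l + 1):
            out[l] += W_dec[k, l] @ feats[k]
    return out, feats

resid = rng.normal(size=(n_layers, d_model))
mlp_hat, feats = clt_predict_mlp_outputs(resid)
print(mlp_hat.shape)  # (4, 8)
```

Replacing the model's MLPs with such a trained predictor is what makes the per-feature contribution paths ("circuits") explicit.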
I'm excited to release Gemma Scope 2: a comprehensive set of interpretability tools for Gemma 3. SAEs & transcoders on every layer of every model! Gemma 3 27B shows lots of rich safety-relevant behaviour, and I want to enable deep dives into what's really going on. Check out our demo!
➡️ So that no one is left with the mistaken impression that sole responsibility for the P-3 fiasco lies with ΕΑΒ / the 🇬🇷 governments: LM "forgot" that the 70-year-old T56s are analog and need transcoders to interface with the digital flight-control system 🤡🤡🤡 https://t.co/bqyU62rUya
Still Paying the Cloud Encoding Tax? Let’s Fix That. NETINT’s ASIC transcoders deliver the same performance as GPU or CPU instances - at a fraction of the power and cost. Meet our engineers at InterBEE (booth #7409) 🔗 https://t.co/YUzgtMzchf
New post on Less Wrong: Cross Layer Transcoders for the Qwen3 LLM Family
Today the first paper I've read is "Transcoders Find Interpretable LLM Feature Circuits" by @jacobdunefsky and @pchlenski. In it they propose using Transcoders instead of SAEs for mechanistic interpretability purposes. Instead of a traditional SAE, a transcoder is not - (1/4)
Imagery from 16 November 2022 depicts eight 250T9-1 Iskander-M system transcoders. It is the first imagery from GE in which they can be seen with a brigade near Korenovsk. Before that, the brigade was armed with the Tochka-U system.
Transcoders Find Interpretable LLM Feature Circuits abs: https://t.co/eM3tbcX6ez An alternative approach to model interpretability: transcoders. Very similar to sparse autoencoders, but SAEs learn to reconstruct model activations, whereas transcoders imitate sublayers' outputs.
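The distinction the tweet above draws can be written as two objectives: an SAE reconstructs an activation from itself, while a transcoder predicts a sublayer's output from its input. A minimal numpy sketch, with illustrative encoder/decoder functions and an assumed L1 sparsity penalty:

```python
import numpy as np

def sae_loss(x, encode, decode, l1=1e-3):
    # SAE: reconstruct the activation x from itself, plus L1 sparsity on features.
    f = encode(x)
    return np.sum((decode(f) - x) ** 2) + l1 * np.sum(np.abs(f))

def transcoder_loss(x, mlp_out, encode, decode, l1=1e-3):
    # Transcoder: from the sublayer's *input* x, predict the sublayer's *output*,
    # so the sparse features explain the computation rather than the activation.
    f = encode(x)
    return np.sum((decode(f) - mlp_out) ** 2) + l1 * np.sum(np.abs(f))

# Toy random linear encode/decode for demonstration only.
d, h = 8, 32
rng = np.random.default_rng(2)
We, Wd = rng.normal(0, 0.1, (h, d)), rng.normal(0, 0.1, (d, h))
enc = lambda x: np.maximum(We @ x, 0.0)
dec = lambda f: Wd @ f

x = rng.normal(size=d)
print(sae_loss(x, enc, dec), transcoder_loss(x, 2.0 * x, enc, dec))
```

Only the reconstruction target differs; everything else (sparse ReLU features, L1 penalty) is shared between the two setups.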
This seems big. Meta's New Tech Lets Us See AI Thinking, and Catch Its Mistakes in Real Time. Meta's new Circuit-based Reasoning Verification (CRV) lets researchers watch an AI's thought process break in real time. By replacing core modules with transparent "transcoders", CRV
Meta just found a way to watch an AI's thought process break in real-time. Their new method cuts error rates by 68% and reduces false positives by 41%. This novel Circuit-based Reasoning Verification (CRV) opens new possibilities in reliable AI. Here's how it works: - X-Ray: