James Golden (@James_R_Golden) · Arcadia Science · Joined June 2025
78 followers · 376 following · 10 media · 38 statuses
How to transform UMAP from a black box into a glass box: using a special type of deep network, we can now compute exact linear equivalents that reveal which features drive each point's position in the embedding. @ArcadiaScience [1/8]
Limitations: this is "pointwise" linear, not piecewise. The Jacobian is valid only at that exact input; move slightly in embedding space and it changes completely. Working on an efficient Lanczos method for long inputs. Code:
github.com
Equivalent Linear Mappings of Large Language Models - jamesgolden1/equivalent-linear-LLMs
Practical application: these Jacobian matrices work as steering operators. The detached Jacobian from the prompt "The Golden Gate" can be used to steer unrelated prompts toward that concept; the output is "Golden Gate Bridge" insertions into random text.
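The thread doesn't spell out the steering mechanics, so here is only a loose stand-in for the idea: take the top left singular vector of a concept prompt's detached Jacobian and use it as an activation-steering direction before the LM head. Everything below (`d_model`, `concept_jacobian`, `lm_head`) is a placeholder, not the paper's operator.

```python
# Loose sketch: activation steering with a direction derived from a concept
# prompt's detached Jacobian. All tensors here are random stand-ins.
import torch

torch.manual_seed(0)
d_model = 256
concept_jacobian = torch.randn(d_model, 8 * d_model)  # stand-in for J("The Golden Gate")

# The top left singular vector lives in output-embedding space and (per the
# thread) decodes to concept tokens like "Golden" and "Bridge".
U, S, Vh = torch.linalg.svd(concept_jacobian, full_matrices=False)
steer_dir = U[:, 0]

def steer(hidden_state, alpha=4.0):
    """Nudge a final hidden state toward the concept before the LM head."""
    return hidden_state + alpha * steer_dir

# usage (hypothetical): logits = lm_head(steer(final_hidden_state_of_other_prompt))
```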
The linear operators are extremely low rank, as their singular value spectra show. The right singular vectors map to input tokens, and the left singular vectors decode back to output tokens. For "The bridge out of Marin is the", the left SVs decode to "bridge", "Golden", "highway", and "exit".
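As a concrete picture of that low-rank structure, here is a hedged sketch with stand-in tensors: take the SVD of a detached Jacobian, decode the left singular vectors through an unembedding matrix (logit-lens style), and reshape the right singular vectors back to token positions. `jacobian` and `unembed` are placeholders, not the repo's variables.

```python
# Hedged sketch with stand-in tensors: inspect the low-rank structure of a
# detached Jacobian and decode its singular vectors.
import torch

torch.manual_seed(0)
d_in, d_out, seq_len, vocab = 128, 128, 6, 1000
jacobian = torch.randn(d_out, seq_len * d_in)     # stand-in detached Jacobian
unembed = torch.randn(vocab, d_out)               # stand-in LM-head weights

U, S, Vh = torch.linalg.svd(jacobian, full_matrices=False)
# For a real detached Jacobian this count is small: the spectrum decays fast.
print("effective rank:", int((S > 1e-3 * S[0]).sum()))

# Left singular vectors live in output-embedding space: decode them with the
# unembedding matrix to see which output tokens they point toward.
top_tokens_per_sv = (unembed @ U[:, :4]).topk(5, dim=0).indices   # (5, 4) token ids

# Right singular vectors live in flattened input-embedding space: reshape to
# (seq_len, d_in) to see which input positions each component weights.
right_sv_by_position = Vh[:4].reshape(4, seq_len, d_in).norm(dim=-1)
print(right_sv_by_position.shape)                 # (4, seq_len)
```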
This is similar to the approach used for attention-only models in Elhage, Nanda, Olsson, et al., "A Mathematical Framework for Transformer Circuits", extended to MLP and normalization blocks by using gradient detachment and the autograd Jacobian to compute the equivalent linear system.
Every operation in transformer decoders (attention, gated activations, normalization) can be written as A(x)·x, where A(x) is input-dependent. By "detaching" the gradient at the right places during inference, you freeze A(x) and get a pure linear system.
arxiv.org
Despite significant progress in transformer interpretability, an understanding of the computational mechanisms of large language models (LLMs) remains a fundamental challenge. Many approaches...
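A minimal sketch of the detachment trick on a single operation, assuming the standard RMSNorm definition (this is an illustration, not the repo's code): write the norm as A(x)·x with A(x) = diag(gain)/rms(x), detach rms(x), and the Jacobian at that input is exactly the operation.

```python
# Minimal sketch of the detachment trick on a single RMSNorm (standard RMSNorm
# definition assumed; not the repo's code).
import torch

def rmsnorm_detached(x, gain, eps=1e-6):
    # RMSNorm written as A(x) @ x with A(x) = diag(gain) / rms(x).
    # Detaching rms(x) freezes A at this input, so autograd sees a linear map.
    rms = torch.sqrt(x.pow(2).mean(-1, keepdim=True) + eps)
    return gain * x / rms.detach()

torch.manual_seed(0)
x = torch.randn(16, dtype=torch.float64)
gain = torch.randn(16, dtype=torch.float64)

y = rmsnorm_detached(x, gain)
J = torch.autograd.functional.jacobian(lambda v: rmsnorm_detached(v, gain), x)

# Because A(x) was frozen, the Jacobian *is* the operation: J @ x reproduces y.
print((J @ x - y).abs().max())    # ~0 (machine precision)
```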
Paper in TMLR and poster at the NeurIPS Mechanistic Interpretability workshop: "Equivalent Linear Mappings of Large Language Models". LLMs like Gemma 3 12B can be mapped to an equivalent, interpretable linear system for any given input, with output embedding reconstruction error of ~10^-13.
🧵Really excited to share a set of our recent pubs @ArcadiaScience where we make black box BioML models transparent. [1/7] https://t.co/IIo2ZePAvd
research.arcadiascience.com
Deep networks make accurate predictions, but their nonlinearity makes them a black box, hiding what they have learned. Here, we look inside the black box and analyze the exact relationships they...
Check out our latest work at @ArcadiaScience on decomposing a neural network trained on genotype-phenotype data into interpretable and familiar quantitative genetics parameters.
research.arcadiascience.com
We tested equivalent linear mapping (ELM) on a neural network trained to predict phenotypes from genotypes in simulated data. We show that ELM successfully recapitulates additive and epistatic...
@ArcadiaScience Note on the GIF: this is a slightly different encoder-decoder network whose loss function has terms for graph BCE, reconstruction, and encoder-decoder-encoder cyclic reconstruction. The lattice is generated from the fully trained decoder and then encoded with each saved checkpoint for the GIF.
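For concreteness, a hedged sketch of what a loss with those three terms might look like; the UMAP-style a/b values and the equal term weights are assumptions, not the pub's settings.

```python
# Hedged sketch of a three-term loss of this shape: graph BCE + input
# reconstruction + enc-dec-enc cyclic reconstruction.
import torch
import torch.nn.functional as F

def combined_loss(encoder, decoder, x, fuzzy_graph, a=1.577, b=0.895):
    z = encoder(x)                                   # low-dimensional embedding
    # Graph BCE: edge probabilities from embedding distances vs. the
    # precomputed high-dimensional fuzzy graph (parametric-UMAP style).
    d2 = torch.cdist(z, z).pow(2)
    q = 1.0 / (1.0 + a * d2.pow(b))
    graph_bce = F.binary_cross_entropy(q.clamp(1e-6, 1 - 1e-6), fuzzy_graph)
    recon = F.mse_loss(decoder(z), x)                # reconstruct the input
    cyclic = F.mse_loss(encoder(decoder(z)), z)      # enc-dec-enc cyclic recon
    return graph_bce + recon + cyclic

# usage (hypothetical): loss = combined_loss(enc, dec, x_batch, graph_batch)
```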
@ArcadiaScience This approach works beyond genomics too: anywhere you use UMAP (images, protein embeddings, etc.), you can now extract exact feature attributions. Check out the notebook pub to try it. [8/8]
@ArcadiaScience Comparing glass-box UMAP features to differential expression shows they're complementary, not identical. Many differentially expressed genes aren't the ones UMAP actually uses to separate clusters; the glass-box network reveals what UMAP truly learned. [7/8]
@ArcadiaScience Coloring cells by their top gene contributor reveals structure that cell type labels alone don't show. Some top gene contributors extend across traditional cell type boundaries, showing how the embedding space is composed. [6/8]
@ArcadiaScience For the Luecken et al. human bone marrow gene expression dataset, we can now see exactly which genes contribute most to each cell's position. Some cell types have sub-regions driven by different genes — like Normoblasts, where HBD dominates one region and HBB another. [5/8]
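A hedged sketch of how those per-gene contributions fall out of the local linearity, with a stand-in bias-free encoder and a random expression vector rather than the pub's trained model and data:

```python
# Hedged sketch: exact per-gene contributions to one cell's 2-D position.
import torch

torch.manual_seed(0)
n_genes = 200
encoder = torch.nn.Sequential(                       # stand-in parametric-UMAP encoder
    torch.nn.Linear(n_genes, 64, bias=False),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 2, bias=False),
)

expression = torch.rand(n_genes)                      # one cell's expression vector
J = torch.autograd.functional.jacobian(encoder, expression)   # (2, n_genes)

# Local linearity means the embedding decomposes exactly into per-gene terms:
# position = sum_i J[:, i] * expression[i].
contrib = J * expression                              # per-gene 2-D contribution vectors
top_genes = contrib.norm(dim=0).topk(5).indices       # genes that move this cell most
print(top_genes, (contrib.sum(dim=1) - encoder(expression)).abs().max())
```

Taking the argmax over genes instead of the top five gives the "top gene contributor" coloring described in the [6/8] tweet above.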
@ArcadiaScience We validate that the Jacobian reconstructs the embedding exactly: the reconstruction error approaches machine precision (~3e-14). This isn't an approximation like SHAP or LIME; it's the exact feature contribution. [4/8]
@ArcadiaScience With parametric UMAP, we can use certain deep networks (with zero-bias linear layers and ReLU activations) that are locally linear at each point. This means that for every data point we can compute a Jacobian, a set of linear weights that exactly reconstructs the output. [3/8]
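A minimal sketch of why this works, using a small stand-in bias-free ReLU network rather than the pub's trained encoder: with no biases, every layer can be written as a matrix times its input, so the Jacobian at x reproduces the output to machine precision (the ~3e-14 error quoted in [4/8]).

```python
# Minimal sketch: a zero-bias ReLU network is exactly locally linear.
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(100, 64, bias=False), torch.nn.ReLU(),
    torch.nn.Linear(64, 32, bias=False), torch.nn.ReLU(),
    torch.nn.Linear(32, 2, bias=False),
).double()

x = torch.rand(100, dtype=torch.float64)
y = net(x)
J = torch.autograd.functional.jacobian(net, x)        # (2, 100): the linear weights at x

# With no biases, ReLU(z) = diag(z > 0) @ z, so the whole network is A(x) @ x
# and the Jacobian reconstructs the output to machine precision.
print((J @ x - y).abs().max())                        # on the order of 1e-16 in float64
```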
@ArcadiaScience UMAP is everywhere because it's great at creating visually distinct clusters from high-dimensional data like gene expression. But there's a catch: the nonlinear mapping makes it hard to interpret which features are responsible for those clusters. https://t.co/IE1pSB6enH [2/8]
arcadia-science.github.io
Equivalent Linear Mappings of Large Language Models. James Robert Golden. Action editor: Shay Cohen. https://t.co/wgS9QtmhGT
#decoders #representations #transform
openreview.net
Despite significant progress in transformer interpretability, an understanding of the computational mechanisms of large language models (LLMs) remains a fundamental challenge. Many approaches...
Since this also works for Gemma 2, I plan to compare its equivalent linear representations to Gemma Scope SAE latents. My hope is that this could complement widely used approaches like SAEs and linear probes. I will have a poster on this at the NeurIPS mech interp workshop.
The tradeoffs: computing the full Jacobian takes about 20 seconds per sequence for Qwen 14B, and most results are on short sequences (<10 tokens). I have a JAX Lanczos implementation for computing top-k singular vectors on Gemma 3 4B up to 400 tokens, which is more practical.
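The long-input implementation mentioned above is in JAX; below is a framework-agnostic sketch of the same matrix-free idea in PyTorch and SciPy, where the Jacobian is exposed only through JVPs and VJPs and handed to a Lanczos-style sparse SVD, so the full matrix is never materialized. The toy model and sizes are stand-ins.

```python
# Hedged sketch: top-k singular vectors of the Jacobian without materializing it.
import numpy as np
import torch
from scipy.sparse.linalg import LinearOperator, svds

torch.manual_seed(0)
f = torch.nn.Sequential(                               # stand-in for a detached model
    torch.nn.Linear(512, 256, bias=False), torch.nn.ReLU(),
    torch.nn.Linear(256, 128, bias=False),
).double()
x0 = torch.rand(512, dtype=torch.float64)

def matvec(v):                                         # J @ v via a forward-mode JVP
    v_t = torch.as_tensor(np.asarray(v).ravel(), dtype=torch.float64)
    return torch.func.jvp(f, (x0,), (v_t,))[1].detach().numpy()

def rmatvec(u):                                        # J.T @ u via a reverse-mode VJP
    u_t = torch.as_tensor(np.asarray(u).ravel(), dtype=torch.float64)
    return torch.func.vjp(f, x0)[1](u_t)[0].detach().numpy()

J_op = LinearOperator(shape=(128, 512), matvec=matvec, rmatvec=rmatvec, dtype=np.float64)
U, S, Vt = svds(J_op, k=8)                             # top-8 singular triplets
print(S[::-1])                                         # singular values, largest first
```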