Explore tweets tagged as #CrossCoder
@leedsharkey
Lee Sharkey
7 months
And the method lets us identify computations that are spread across multiple layers. This has been conceptually challenging for the SAE paradigm to overcome. (Crosscoder features aren't the computations themselves, but are more akin to the results of the computations).
1
1
26
@Butanium_
Clément Dumas
4 months
Our analysis confirms these aren't just theoretical concerns! Looking at L1 crosscoder, we found:
- Many "chat-only" latents (blue) show high Shrinkage values.
- Clear overlap between chat-only and shared latents (orange).
Most "chat-only" latents aren't actually chat-specific!
1
1
5
@Butanium_
Clément Dumas
4 months
We identified two theoretical issues with L1 crosscoders:
1️⃣ Complete Shrinkage: L1 regularization might force base latents to zero even when useful.
2️⃣ Latent Decoupling: "Chat-only" concepts might actually exist in the base model but be encoded in the crosscoder differently.
1
0
10
@btibor91
Tibor Blaho
6 months
Anthropic researchers published new insights on "Crosscoder Model Diffing", showing that model-exclusive features tend to be harder to interpret due to competition for feature space, and proposing a method to make them more understandable
2
1
58
@BazzaBulldog
Barry Webster
6 years
@CrossCodersCo @lozsparky11 - here’s some feedback from a very happy mum whose daughter attended the first of two CrossCoder training sessions.
1
1
1
@jxmnop
jack morris
2 months
just learned about "model diffing" from Anthropic. buried in an october blogpost; feels really novel. training a 'crosscoder' between two models of the same family produces interpretable diffs. here post-training clearly adds refusals, QA, math, etc. pretty amazing stuff
11
32
723
@a_karvonen
Adam Karvonen
4 months
Very cool paper. They make a compelling case against the typical crosscoder per-model norm loss and show a simpler method (BatchTopK) gets better results. I really liked this figure, which shows why the crosscoder loss leads to an illusion of many model specific features.
@Butanium_
Clément Dumas
4 months
New paper w/ @jkminder & @NeelNanda5! What do chat LLMs learn in finetuning? Anthropic introduced a tool for this: crosscoders, an SAE variant. We find key limitations of crosscoders & fix them with BatchTopK crosscoders. This finds interpretable and causal chat-only features! 🧵
1
4
34
@Connor_Kissane
Connor Kissane
10 months
Open source replication of @AnthropicAI's Crosscoders paper (@Jack_W_Lindsey et al) for model-diffing! We trained a crosscoder to model-diff the middle layer residual stream of Gemma-2 2B base and IT. The results hold up: we find shared, base-specific, and IT-specific latents.
4
15
137
@btibor91
Tibor Blaho
6 months
0
0
15
@ai_hakase_
ハカセ アイ(Ai-Hakase)🐾最新トレンドAIのためのX 🐾
6 months
[Insights on Crosscoder Model Diffing] ✎ FYIG: Researchers at Anthropic have reportedly published new findings on "Crosscoder Model Diffing"! ✨
0
0
2
@GhostsSoup
GhostsSoup
4 years
CrossCoder helped out with fixing the textures. Just gotta fix up the rig then edit the UVs and we should be good to go! Gonna also make an Overworld version too!
0
3
14
@gm8xx8
𝚐𝔪𝟾𝚡𝚡𝟾
4 months
Crosscoder-based model diffing tracks concept shifts in LLM fine-tuning. Standard L1 loss misattributes shared latents as fine-tuned-specific. Latent Scaling detects these errors; BatchTopK loss improves separation. On Gemma 2B, the method isolates interpretable, chat-specific
1
0
6
@CodingMaterials
Coding Materials
7 years
Save time, eliminate confusion, and improve coding #proficiency! #Book your copy of #Procedural #CrossCoder-2019 today and get up to a 30% discount.
0
1
1
@CodingMaterials
Coding Materials
6 years
Couldn't buy the Procedural #CrossCoder eBook earlier? Here is a chance to buy one for ONLY $179.95! Buy any Coding Book, #Ebook or any other resource and get a 15% discount! Order Today:
0
0
2
@ai_hakase_
ハカセ アイ(Ai-Hakase)🐾最新トレンドAIのためのX 🐾
6 months
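[Digging deep into Crosscoder Model Diffing!] A new analysis of Crosscoder Model Diffing has been published! ✨ The research, by Siddharth Mishra-Sharma and colleagues, reportedly investigates the phenomenon in Crosscoder Model Diffing where features exclusive to one model are polysemantic and hard to interpret… 😵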
1
0
0
@CodingMaterials
Coding Materials
8 years
Radiology #CrossCoder #eBook 2018 available at an affordable price from Coding Materials.
0
0
0
@CodingMaterials
Coding Materials
8 years
#Radiology #CrossCoder - #EBook 2018 at Unimaginable Price. Read More 👉👉
0
0
0
@CodingMaterials
Coding Materials
8 years
Simplify and speed up the coding process with this one-stop #Radiology #CrossCoder eBook 2018, available at
0
0
0
@CodingMaterials
Coding Materials
7 years
Exclusive offer on #Procedural #CrossCoder 2019: available for only $147.56 with a 25% discount. Hurry up, book now at
0
0
0