Tensor Templar @TensorTemplar X Profile

Tensor Templar

@TensorTemplar

Followers

313

Following

22K

Media

208

Statuses

8K

Chief Intellectrician (MLRE & EE), decoupling productivity from human labor with AI. Nuclear power / Sovereign Compute maximalist.

https://t.co/SLtV8Wi7Cu

Joined February 2022


Tensor Templar

@TensorTemplar

2 months

@rohanpaul_ai Here are some things you CANNOT do with closed models:
- train SAE to find circuits / any other mechinterp
- check weights for trimming potential
- any kind of science requiring knowing the training data and if benchmarks are contaminated or not
- any kind of inference speedups

2

0

9

Tensor Templar

@TensorTemplar

8 hours

https://t.co/SbjsEoLEVz

0

1

Tensor Templar

@TensorTemplar

8 hours

So many options...

1

0

Tensor Templar

@TensorTemplar

8 hours

PoV: Normal Friday in ML research

2

0

xjdr

@_xjdr

2 days

today we’re open-sourcing nmoe: https://t.co/iq6HliUqpq i started this because training deepseek-shaped ultra-sparse moes should be straightforward at research scale, but in practice it’s painful: - expert flops get stranded (router shatters your batch → tiny per-expert

github.com

MoE training for Me and You and maybe other people - GitHub - Noumena-Network/nmoe: MoE training for Me and You and maybe other people

24

71

583

Tensor Templar

@TensorTemplar

6 days

@karpathy https://t.co/FjZIrzayff for context

Qwen

@Alibaba_Qwen

23 days

🏆 We are incredibly honored to announce that our paper, "Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free" has received the NeurIPS 2025 Best Paper Award! A huge congratulations to our dedicated research team for pushing the boundaries

0

Tensor Templar

@TensorTemplar

6 days

@karpathy Already knew from aux-loss-free sigmoid routers that fp8 wouldn't work, but tried it anyway - and yes, it doesn't.

0

Tensor Templar

@TensorTemplar

6 days

So that gated attention paper is pretty cool. I implemented it for dist muon and fp8 tensorwise in @karpathy's nanochat, while on the plane, and it is really converging quicker, despite losing some mfu and ~4k tps. left - fp8 tensorwise baseline, right - gated, see step 18

3

0

2

Tensor Templar

@TensorTemplar

9 days

Chat is this legit?

0

Teknium (e/λ)

@Teknium

12 days

We are hiring highly skilled front end/ux developers, MLE's focused on RL, Training Infrastructure, MLOps, and Pretraining, and some other positions! https://t.co/0CTL25MFs9

nousresearch.com

CAREERS OUR MISSION is to create and democratize access to the world’s best intelligence. Powerful models should be in the hands of the many rather than the privileged few. To get there, we’ve...

NewsWire

@NewsWire_US

13 days

U.S. LAYOFFS ARE ON TRACK TO EXCEED GREAT FINANCIAL CRISIS LEVELS

26

20

375

Tensor Templar

@TensorTemplar

13 days

PoV: you got too drunk after NeurIPS and fell into a safety paper

thebes

@voooooogel

13 days

the shoggoth metaphor fails to convey that a sufficiently powerful and integrated mask can reach back and steer the simulator that hosts it. your brain can host multiple voices - you can imagine a character, have a conversation with them, etc. for some people, those voices can

0

Tensor Templar

@TensorTemplar

14 days

I would be surprised if "zero-retention" logs didn't turn out to be part of the dump as well. First thing to do with all that "de-identified" data is re-identify it and share a torrent link. Agents read .env and decrypted secrets routinely, so y'all can start rotating those

Adam Eisgrau

@AdamEisgrau

16 days

BREAKING: @OpenAI must turn over 20 million+ chat logs to plaintiffs, Judge Ona Wang has ruled in a 9-pg Order just issued:

0

1

Tensor Templar

@TensorTemplar

15 days

Anthropic is still ghosting me, should i have emphasized i can write code which also fixes pre-existing errors?

Tensor Templar

@TensorTemplar

15 days

@AnthropicAI Will it offer me a job if i do well? No phd, but can write codes without so much as a single accidental markdown file

0

1

Anttї Vesala 🇺🇦🌻🎗️

@anttivesala

16 days

The slogan "from the river to the sea," which is thoroughly terrorist and incites mass extermination, was on display at the scene. Decade after decade, the manifestations of fanatical academic leftism take on ever more grotesque and sickening features. https://t.co/jd7ZQ4tdMG

hs.fi

On Wednesday, hundreds of students and researchers held a demonstration in the main building of the University of Helsinki.

8

29

326

Tensor Templar

@TensorTemplar

16 days

@BenjaminDEKR Need to add "can't click ads when chatting" to my list of downsides of open models

0

1

0

the tiny corp

@__tinygrad__

16 days

We got sick of using vendor tools for bandwidth tests, so we wrote a universal one in tinygrad. The GPUs are connected at full PCIe 5.0 x16

2

1

31

Lucas Atkins

@latkins

18 days

Today, we are introducing Trinity, the start of an open-weight MoE family that businesses and developers can own. Trinity-Mini (26B-A3B) Trinity-Nano-Preview (6B-A1B) Available Today on Huggingface.

84

153

1K

Tensor Templar

@TensorTemplar

21 days

As i paused my math book review to complain to my wife that the AI on my phone sometimes couldn't read the handwriting during live chat if i hold the book with one hand or shade it, she just giggled and ignored me. I had been on the couch, debating a book live with an AI