Dhruv π
@dhruv31415
548 Followers · 2K Following · 64 Media · 247 Statuses
On the other side of the fence @tilderesearch
Palo Alto, CA
Joined November 2023
Some exciting stuff since then!
We’re excited to announce that Tilde completed an $8M seed round earlier this year, led by Khosla Ventures. Understanding model intelligence is the most important problem in the world, and the key to actualizing the promise of ASI. 🧵 A thread on our approach:
0 replies · 0 reposts · 17 likes
🚀 Really excited to see this amazing arch change (KDA) finally coming out! Replacing global attention with a linear hybrid arch: better pretraining PPLs, long-context evals, and downstream math/code/STEM evals after RL, plus >6× throughput at 1M context to unblock more downstream potentials…
Kimi Linear Tech Report is dropped! 🚀 https://t.co/LwNB2sQnzM Kimi Linear: a novel architecture that outperforms full attention with faster speeds and better performance, ready to serve as a drop-in replacement for full attention, featuring our open-sourced KDA kernels!…
1 reply · 18 reposts · 55 likes
Thrilled to release our new paper, “Scaling Latent Reasoning via Looped Language Models.” TL;DR: we scale looped language models to 2.6 billion parameters, pretrained on >7 trillion tokens. The resulting model is on par with SOTA language models 2 to 3x its size.
20 replies · 137 reposts · 627 likes
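The core trick behind looped language models can be sketched in a few lines: instead of stacking N distinct layers, one block is applied repeatedly, so depth (and latent computation) comes from iteration rather than from extra parameters. Below is a minimal PyTorch sketch of weight-tied looping; `LoopedBlock`, `looped_forward`, and `n_loops` are illustrative names and assumptions, not the paper's actual code.

```python
import torch
import torch.nn as nn

class LoopedBlock(nn.Module):
    """One transformer block reused across loop iterations (weight tying)."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # self-attention
        return x + self.mlp(self.norm2(x))

def looped_forward(block: LoopedBlock, x: torch.Tensor, n_loops: int) -> torch.Tensor:
    # Depth comes from iterating the same weights: the parameter count is
    # that of one block, while compute scales with n_loops.
    for _ in range(n_loops):
        x = block(x)
    return x

x = torch.randn(2, 16, 64)                       # (batch, seq, d_model)
out = looped_forward(LoopedBlock(64, 4), x, n_loops=4)
print(out.shape)                                 # torch.Size([2, 16, 64])
```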
Main innovation seems to be a row-wise forget gate. Cool!
2 replies · 1 repost · 36 likes
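For readers unfamiliar with the term: in linear attention the recurrent state is a (d_k × d_v) matrix, and a row-wise forget gate decays each row of that state at its own learned rate, instead of one scalar gate for the whole state. A minimal sketch of what that can look like in a gated delta-rule scan; the shapes, gate parameterization, and function name are illustrative assumptions, not the KDA kernel itself.

```python
import torch

def rowwise_gated_delta_scan(q, k, v, a, beta):
    """Sketch: delta-rule linear attention with a row-wise forget gate.
    q, k: (T, d_k); v: (T, d_v); a: (T, d_k) per-row gates in (0, 1);
    beta: (T,) write strengths. State S is a (d_k, d_v) associative memory."""
    d_k, d_v = k.shape[1], v.shape[1]
    S = torch.zeros(d_k, d_v)
    outs = []
    for t in range(k.shape[0]):
        S = a[t].unsqueeze(-1) * S                        # row-wise decay
        pred = k[t] @ S                                   # value currently bound to k_t
        S = S + beta[t] * torch.outer(k[t], v[t] - pred)  # delta-rule overwrite
        outs.append(q[t] @ S)                             # read with the query
    return torch.stack(outs)

T, d = 8, 4
q, k, v = torch.randn(T, d), torch.randn(T, d), torch.randn(T, d)
a = torch.sigmoid(torch.randn(T, d))                      # learned gates in practice
beta = torch.sigmoid(torch.randn(T))
print(rowwise_gated_delta_scan(q, k, v, a, beta).shape)   # torch.Size([8, 4])
```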
Kimi Delta Attention PR in FLA, very nice @yzhang_cs and team, i'm sooo excited for this model
4 replies · 6 reposts · 85 likes
Low-precision attention may suffer from biased rounding errors https://t.co/0hxHG3tPu2
1 reply · 14 reposts · 144 likes
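A quick way to see what "biased" means here: once a running fp16 sum grows large, each small addend falls below half a unit in the last place and gets rounded away, always in the same direction, so the error accumulates systematically instead of averaging out. A toy demo of plain accumulation (not an attention kernel):

```python
import torch

# Sum many small positive increments into a running fp16 total. Once the
# accumulator reaches ~0.25, each 1e-4 addend is below half an ulp and is
# rounded away, always downward, so the error is a one-sided bias.
n = 20_000
x = torch.full((n,), 1e-4)

acc = torch.tensor(0.0, dtype=torch.float16)
for v in x.to(torch.float16):
    acc = acc + v

print("fp32 sum:", x.sum().item())   # ~2.0
print("fp16 sum:", acc.item())       # stalls around 0.25, far below 2.0
```

Softmax-weighted sums over long contexts are exactly this kind of accumulation, which is one way such a bias can surface in low-precision attention.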
i’ve been deeply obsessed with the question of how to make humans less fragile. several months ago i decided to leave Stanford to research and deploy the biological machine learning methods that can get us closer. can finally share that i’ve been on the founding team
Valthos builds next-generation biodefense. Of all AI applications, biotechnology has the highest upside and most catastrophic downside. Heroes at the frontlines of biodefense are working every day to protect the world against the worst case. But the pace of biotech is against…
63 replies · 26 reposts · 443 likes
NousCon last night was a massive success! Thank you to everyone who showed out for our biggest event of the year. The future of open source AI is incredibly bright. S/o @rosstaylor90 and @dhruv31415 for coming to speak, @poetengineer__ and @johnkarborn for their epic live art
13 replies · 14 reposts · 188 likes
art, drinks, open source ai w.s.g. tilde research and general reasoning oct. 24th, SF, 6p
38 replies · 28 reposts · 663 likes
My feed in the past few days has become dominated by random Japanese ecology accounts. Never have I been happier to open this app.
0 replies · 0 reposts · 4 likes
(1/2) i felt like no one actually teaches you a good framework for how to read (ML) papers well + fast, so i wrote this 5-minute read. tldr: because so many papers suck, here's how to go through them quickly and revisit the good ones
28 replies · 208 reposts · 2K likes
It's crazy how many interesting questions there are to ask, and how many of them go unsolved in a world where so many researchers don't get access to resources.
Today we're very happy to announce that we’re launching the Tilde Fellowship Program to support research in a mechanistic understanding of pre-training science (architectures, optimizers, learning dynamics, etc.). Much of modern ML progress has come from scaling models and empirically…
1 reply · 0 reposts · 16 likes
Really cool generalization of Manifold Muon, awesome work from @SolidlySheafy
Modern optimizers can struggle with unstable training. Building off of Manifold Muon, we explore more lenient mechanisms for constraining the geometry of a neural network's weights directly through their Gram matrix 🧠 A 🧵… ~1/6~
0 replies · 0 reposts · 10 likes
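For context, the Gram matrix G = WᵀW encodes the inner products (lengths and angles) of a weight matrix's columns, so constraining G is a direct handle on weight geometry. The thread's actual mechanism is more lenient than this, but a simple soft-orthogonality penalty illustrates the object being constrained; `gram_penalty` is an illustrative name.

```python
import torch

def gram_penalty(W: torch.Tensor) -> torch.Tensor:
    """Soft constraint on weight geometry via the Gram matrix: penalize
    ||W^T W - I||_F^2 so the columns of W stay near-orthonormal and the
    layer stays well-conditioned."""
    G = W.T @ W                                             # (d_in, d_in)
    I = torch.eye(G.shape[0], device=W.device, dtype=W.dtype)
    return (G - I).pow(2).sum()

W = torch.randn(128, 64) / 64 ** 0.5
print(gram_penalty(W).item())   # add lambda * gram_penalty(W) to the loss
```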
[1/N] How can we make attention more powerful—not just more efficient? How do different attention mechanisms handle associative memory, and can we design a better one from first principles? 🤔 Our new work explores these questions by introducing Local Linear Attention (LLA).
3 replies · 35 reposts · 213 likes
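The associative-memory framing in this thread can be made concrete with vanilla linear attention: key-value pairs are stored as a sum of outer products and read back with a query, with recall degrading as keys become correlated. A minimal sketch of that baseline (plain linear attention as associative memory, not LLA itself):

```python
import torch

# Store (key, value) pairs as a sum of outer products, read with a query.
# Orthonormal keys give exact recall; correlated keys interfere, which is
# the failure mode stronger attention designs try to mitigate.
d = 8
keys = torch.eye(d)                  # orthonormal keys
values = torch.randn(d, d)

S = torch.zeros(d, d)
for k_t, v_t in zip(keys, values):
    S = S + torch.outer(k_t, v_t)    # write

recalled = keys[3] @ S               # read: query = key 3
print(torch.allclose(recalled, values[3]))  # True
```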
My thesis, “A theory of the computational power and limitations of language modeling architectures,” is now online:
8 replies · 46 reposts · 386 likes
Reinforcement learning (RL) has long been the dominant method for fine-tuning, powering many state-of-the-art LLMs. Methods like PPO and GRPO explore in action space. But can we instead explore directly in parameter space? YES we can. We propose a scalable framework for…
90 replies · 390 reposts · 3K likes
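Parameter-space exploration is the classic evolution-strategies recipe: sample Gaussian perturbations of the weights, score each perturbed model, and move along the reward-weighted average noise. A minimal NumPy sketch of one such update, in the style of OpenAI-ES rather than necessarily the paper's method; `es_step` and the toy objective are illustrative.

```python
import numpy as np

def es_step(theta, reward_fn, rng, sigma=0.1, lr=0.01, n_pop=64):
    """One evolution-strategies update: perturb parameters with Gaussian
    noise, score each perturbation, and move along the reward-weighted
    average noise direction (a score-function gradient estimate)."""
    eps = rng.standard_normal((n_pop, theta.size))
    rewards = np.array([reward_fn(theta + sigma * e) for e in eps])
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad_est = (adv[:, None] * eps).mean(axis=0) / sigma
    return theta + lr * grad_est

# Toy objective: no gradients of reward_fn are ever taken.
target = np.ones(10)
reward = lambda th: -np.sum((th - target) ** 2)

rng = np.random.default_rng(0)
theta = np.zeros(10)
for _ in range(300):
    theta = es_step(theta, reward, rng)
print(round(reward(theta), 4))   # close to 0: theta has drifted to target
```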