DataVoid
@DataPlusEngine
Followers: 2K
Following: 8K
Media: 3K
Statuses: 7K
Independent ML researcher. The first step in knowing is admitting you don't
https://discord.gg/KkKSVqU4Gs
Joined June 2023
AI visionaries tend to be dreamers who cannot dream. They are so utterly engulfed in their own doctrine that their daring stabs at the truth amount to moving numbers on a plot.
2
2
8
M2.7 open weights coming in ~2 weeks. Still actively iterating; we just updated to a new version yesterday, and it is noticeably better on OpenClaw.
155
132
2K
🚨NEWS: Cursor’s $50B “in-house model” is literally Kimi K2.5 with RL on top. Got caught in 24 hours
>be Moonshot AI
>spend hundreds of millions training Kimi K2.5
>1 trillion parameters, 15 trillion tokens, agent swarm architecture
>beat GPT-5.2 and Opus 4.5 on real benchmarks
280
573
8K
Cursor AI may be in material breach of contract with their new Composer model, which is generating buzz online for reportedly reaching Opus-level performance. It’s alleged that the new model is a fine-tuned checkpoint of Moonshot’s Kimi K2.5. If true, the original model is
42
23
465
During the iteration process, we also realized that the model's ability to recursively evolve its harness is equally critical. Our internal harness autonomously collects feedback, builds evaluation sets for internal tasks, and based on this continuously iterates on its own
14
59
713
I reapplied thermal paste 3 different times. Isolated the cause as not being the cooling block. Reseated the CPU 4 times. The only possible explanation is that the CPU is toast. It did suddenly smell slightly burnt under low load during a test.
1
0
3
I pushed my tests too hard for Hermes-Agent and trashed my CPU somehow. Literally burnt it. It's now at 62 °C under no load and then overheats. Oops lol, I got way too into Hermes dev. Props to @Teknium @NousResearch for making a project that has gotten me so enveloped.
3
0
16
𝗞-𝗺𝗲𝗮𝗻𝘀 𝗶𝘀 𝘀𝗶𝗺𝗽𝗹𝗲. 𝗠𝗮𝗸𝗶𝗻𝗴 𝗶𝘁 𝗳𝗮𝘀𝘁 𝗼𝗻 𝗚𝗣𝗨𝘀 𝗶𝘀𝗻’𝘁. That’s why we built Flash-KMeans — an IO-aware implementation of exact k-means that rethinks the algorithm around modern GPU bottlenecks. By attacking the memory bottlenecks directly,
36
197
2K
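For context on the "k-means is simple" claim above, here is a minimal sketch of plain Lloyd's algorithm in pure Python. It is not Flash-KMeans and has none of its IO-aware GPU optimizations; the function name `kmeans` and its parameters (`iters`, `seed`) are illustrative assumptions, not anything from the announcement.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length tuples (illustrative only)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

The assignment step compares every point against every centroid, which is the part a GPU implementation has to keep fed with data; that is presumably where the memory bottlenecks mentioned in the post come from.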
ByteDance also implemented attention over depth. They literally combined it with sequence attention.
9
127
885
Holy: Kimi did amazing work because it changes one of the most basic parts of how deep AI models pass information from layer to layer. Instead of blindly mixing in everything from earlier layers equally, the model can now choose which past information is actually useful for
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with
28
34
446
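The contrast described above (equal mixing of earlier layers versus choosing which ones matter) can be illustrated with a toy sketch. This is not the paper's formulation: it assumes a softmax over scaled dot-product scores, and the per-layer `query` vector and both function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def uniform_residual(layer_outputs):
    """Standard residual stream: sum all earlier outputs with equal weight."""
    return [sum(col) for col in zip(*layer_outputs)]

def attention_residual(layer_outputs, query):
    """Assumed variant: weight earlier outputs by softmax(query . output)."""
    d = len(query)
    scores = [
        sum(q * h for q, h in zip(query, out)) / math.sqrt(d)
        for out in layer_outputs
    ]
    weights = softmax(scores)
    mixed = [
        sum(w * out[i] for w, out in zip(weights, layer_outputs))
        for i in range(d)
    ]
    return mixed, weights

# Three earlier layer outputs of width 2, and a query that favors the first dimension.
outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, weights = attention_residual(outputs, query=[10.0, 0.0])
```

With this query, the first and third outputs receive most of the weight and the second is nearly ignored, whereas `uniform_residual` treats all three the same.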
The idea of rotating attention by 90° is sooooooo cool (credits to @Jianlin_S's insights), and it surprisingly works. We (w/ the amazing @nathan) are so excited about this. Been working on the paper for months and couldn't stop. Go give it a try. It's a drop-in replacement for
19
70
961
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with
330
2K
14K
This @Teknium @NousResearch Hackathon for Hermes-Agent seems like fun! Gonna put my local hardware behind it and see if I can win! Hermes-Agent LLM LoRA? 🤔
3
1
38
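I tried out the rotation feature in Photoshop beta on a variety of AI illustrations. It seems to temporarily convert the image into a 3D mesh, which is quite handy for placing small objects and blending with the background. I also tested it on chibi characters, large groups, and non-humanoid subjects (2 minutes at double speed). #Photoshop #Adobe #AIイラスト #画像生成AI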
2
127
490