DataVoid

@DataPlusEngine

Followers
2K
Following
8K
Media
3K
Statuses
7K

Independent ML researcher. The First step in knowing is admitting you don't

https://discord.gg/KkKSVqU4Gs
Joined June 2023
@DataPlusEngine
DataVoid
1 year
AI visionaries tend to be dreamers who cannot dream. They are so utterly engulfed in their own doctrine that their daring stabs at the truth amount to moving numbers on a plot.
2
2
8
@SkylerMiao7
Skyler Miao
2 days
M2.7 open weights coming in ~2 weeks. Still actively iterating; just updated to a new version yesterday, noticeably better on OpenClaw.
155
132
2K
@shapoco
シャポコ🌵
4 days
Implemented Gouraud shading and a depth buffer #shapolab
69
875
7K
@LLMJunky
am.will
4 days
Someone posted this and I'm all the way ☠️☠️
@fynnso
Fynn
5 days
Was messing with the OpenAI base URL in Cursor and caught this: accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast. So Composer 2 is just Kimi K2.5 with RL. At least rename the model ID.
33
88
2K
@ns123abc
NIK
4 days
🚨NEWS: Cursor’s $50B “in-house model” is literally Kimi K2.5 with RL on top. Got caught in 24 hours
>be Moonshot AI
>spend hundreds of millions training Kimi K2.5
>1 trillion parameters, 15 trillion tokens, agent swarm architecture
>beat GPT-5.2 and Opus 4.5 on real benchmarks
@elonmusk
Elon Musk
4 days
@fynnso Yeah, it’s Kimi 2.5
280
573
8K
@xw33bttv
Lex
4 days
Cursor AI may be in material breach of contract with their new Composer model, which is generating buzz online for reportedly reaching Opus-level performance. It’s alleged that the new model is a fine-tuned checkpoint of Moonshot’s Kimi K2.5. If true, the original model is
42
23
465
@MrCatid
catid
5 days
Just found out I was the first person to order the GB300 computer
6
1
44
@MiniMax_AI
MiniMax (official)
6 days
During the iteration process, we also realized that the model's ability to recursively evolve its harness is equally critical. Our internal harness autonomously collects feedback, builds evaluation sets for internal tasks, and based on this continuously iterates on its own
14
59
713
@DataPlusEngine
DataVoid
6 days
I reapplied thermal paste 3 different times and ruled out the cooling block as the cause. Reseated the CPU 4 times. The only possible explanation is that the CPU is toast. It suddenly smelled slightly burnt under low load during a test.
1
0
3
@DataPlusEngine
DataVoid
6 days
I pushed my tests too hard for Hermes-Agent and trashed my CPU somehow. Literally burnt it. It's now 62°C under no load and then overheats. Oops lol, I got wayyy too into Hermes dev. Props to @Teknium @NousResearch for making a project that has gotten me so enveloped.
3
0
16
@HaochengXiUCB
Haocheng Xi
8 days
𝗞-𝗺𝗲𝗮𝗻𝘀 𝗶𝘀 𝘀𝗶𝗺𝗽𝗹𝗲. 𝗠𝗮𝗸𝗶𝗻𝗴 𝗶𝘁 𝗳𝗮𝘀𝘁 𝗼𝗻 𝗚𝗣𝗨𝘀 𝗶𝘀𝗻’𝘁. That’s why we built Flash-KMeans — an IO-aware implementation of exact k-means that rethinks the algorithm around modern GPU bottlenecks. By attacking the memory bottlenecks directly,
36
197
2K
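The Flash-KMeans post above is about an IO-aware GPU implementation; as a rough CPU-side sketch of the same idea (this is not the Flash-KMeans code, and the chunking scheme here is just an illustration), here is one exact k-means iteration that processes points in chunks so the full n×k distance matrix is never materialized at once:

```python
import numpy as np

def kmeans_step(x, centroids, chunk=1024):
    """One exact k-means iteration: assign each point to its nearest
    centroid, then recompute centroids as cluster means.

    Distances are computed chunk by chunk, so the (n_points, k) distance
    matrix is never held in memory all at once -- the same memory
    pressure a GPU implementation has to work around.
    """
    n, _ = x.shape
    k = centroids.shape[0]
    labels = np.empty(n, dtype=np.int64)
    # ||x - c||^2 = ||x||^2 - 2 x.c + ||c||^2; the ||x||^2 term is
    # constant per point, so the argmin only needs the last two terms.
    c_sq = (centroids ** 2).sum(axis=1)
    for start in range(0, n, chunk):
        xs = x[start:start + chunk]
        dist = c_sq - 2.0 * xs @ centroids.T
        labels[start:start + chunk] = dist.argmin(axis=1)
    # Recompute centroids; an empty cluster keeps its old centroid.
    new_centroids = centroids.copy()
    for j in range(k):
        members = x[labels == j]
        if len(members):
            new_centroids[j] = members.mean(axis=0)
    return labels, new_centroids

# Two well-separated Gaussian blobs and a deterministic init.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-5, 1, (500, 2)), rng.normal(5, 1, (500, 2))])
centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
for _ in range(10):
    labels, centroids = kmeans_step(x, centroids)
```

The chunked assignment is mathematically identical to the naive version; only the peak memory changes, which is the kind of bottleneck the post says Flash-KMeans attacks directly.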
@rosinality
Rosinality
7 days
ByteDance also implemented attention over depth. They literally combined it with sequence attention.
9
127
885
@scaling01
Lisan al Gaib
8 days
Leanstral is part of the Mistral Small 4 family
@scaling01
Lisan al Gaib
8 days
Some math prover model by Mistral? The link is dead again; just got the notif.
5
5
86
@jtydhr88
jtydhr88
9 days
Tried to recreate PS’s image rotation feature inside ComfyUI - 2
9
19
144
@andrew_n_carr
Andrew Carr 🤸
8 days
If you can internalize what this plot means, how to make it, and why it's important, you can get a job at any top lab
@Kimi_Moonshot
Kimi.ai
8 days
Scaling law experiments reveal a consistent 1.25× compute advantage across varying model sizes.
19
9
550
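For readers unfamiliar with how a "1.25× compute advantage" is read off scaling-law fits: if both variants follow a power law L(C) = a·C^(-b) with the same exponent, the advantage is the ratio of compute each needs to reach the same loss. The coefficients below are invented for illustration; only the form of the calculation is the point.

```python
# Hypothetical fitted power laws L(C) = a * C**-b for a baseline and an
# improved variant. a_new is constructed so the variant reaches any given
# loss with 1.25x less compute (made-up numbers, illustration only).
a_base, b = 5.0, 0.05
a_new = a_base * 1.25 ** -b

def compute_to_reach(loss, a, b):
    """Invert L = a * C**-b for the compute C needed to hit `loss`."""
    return (a / loss) ** (1.0 / b)

target = 2.0
advantage = compute_to_reach(target, a_base, b) / compute_to_reach(target, a_new, b)
print(advantage)  # ~1.25, independent of the target loss when exponents match
```

A "consistent" advantage across model sizes, as the post claims, corresponds to the two fitted curves sharing an exponent and differing only by this horizontal shift.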
@kimmonismus
Chubby♨️
8 days
Holy: Kimi did amazing work here, because it changes one of the most basic parts of how deep AI models pass information from layer to layer. Instead of blindly mixing in everything from earlier layers equally, the model can now choose which past information is actually useful for
@Kimi_Moonshot
Kimi.ai
8 days
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with
28
34
446
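The announcement above is truncated, so as a rough sketch only: one way to read "choose which past information is useful" is a per-layer softmax attention over the stack of earlier residual states, replacing the uniform sum of a standard residual stream. Everything below (the per-layer queries, the toy linear blocks) is invented for illustration and is not Kimi's actual method.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
d = 8          # model width
depth = 4      # number of blocks
# Toy per-layer "blocks": fixed random linear maps standing in for the
# usual attention/MLP sublayers.
blocks = [rng.normal(0, 0.1, (d, d)) for _ in range(depth)]
# One query vector per layer scores every earlier layer's output.
queries = [rng.normal(size=d) for _ in range(depth)]

x = rng.normal(size=d)
stream = [x]                       # residual states from layers 0..l-1
for l in range(depth):
    past = np.stack(stream)        # (l+1, d): every earlier state
    # A plain residual stream would use past[-1] (equivalently, the
    # uniform accumulation of all block outputs). Here the layer
    # *attends over depth* instead: score each past state with this
    # layer's query and mix them by softmax weight.
    w = softmax(past @ queries[l])           # (l+1,) weights over history
    h = w @ past                             # chosen mixture of the past
    stream.append(h + blocks[l] @ h)         # block output plus its input
out = stream[-1]
```

This keeps the "time vs. depth duality" flavor of the post: the depth axis is treated like a sequence that each layer attends over, rather than summed uniformly.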
@yzhang_cs
Yu Zhang 🐙🌘
8 days
The idea of rotating attention by 90° is sooooooo cool (credits to @Jianlin_S 's insights), and it surprisingly works. We (w/ the amazing @nathan) are so excited about this; we've been working on the paper for months and couldn't stop. Go give it a try. It's a drop-in replacement for
@Kimi_Moonshot
Kimi.ai
8 days
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with
19
70
961
@Kimi_Moonshot
Kimi.ai
8 days
Introducing 𝑨𝒕𝒕𝒆𝒏𝒕𝒊𝒐𝒏 𝑹𝒆𝒔𝒊𝒅𝒖𝒂𝒍𝒔: Rethinking depth-wise aggregation. Residual connections have long relied on fixed, uniform accumulation. Inspired by the duality of time and depth, we introduce Attention Residuals, replacing standard depth-wise recurrence with
330
2K
14K
@DataPlusEngine
DataVoid
10 days
This @Teknium @NousResearch hackathon for Hermes-Agent seems like fun! Gonna put my local hardware behind it and see if I can win! Hermes-Agent LLM LoRA? 🤔
3
1
38
@BeamManP
ビームマンP ver40
11 days
Tried out Photoshop beta's rotation feature on AI illustrations in a bunch of ways. It seems to temporarily convert the image into a 3D mesh, which makes placing props and blending with the background really convenient. Also ran it through chibi characters, large groups, and non-humanoid subjects (2 min at double speed) #Photoshop #Adobe #AIイラスト #画像生成AI
2
127
490