
DatologyAI
@datologyai
Followers: 2K
Following: 173
Media: 29
Statuses: 130
DatologyAI builds tools to automatically select and optimize the best data on which to train AI models, leading to better, smaller models which train faster.
Redwood City, CA
Joined September 2023
RT @leavittron: Very excited to announce BeyondWeb, @datologyAI’s synthetic pretraining data generation paradigm. BeyondWeb is a rephrasing…
0
41
0
RT @pratyushmaini: 1/ Pretraining is hitting a data wall; scaling raw web data alone leads to diminishing returns. Today @datologyai shares…
0
124
0
RT @code_star: Training efficiency is hard, but getting easier to manage all the time. You can rent high speed interconnected h100s on dema…
0
4
0
RT @code_star: We are looking for a post-training lead at @datologyai . we have gpus, you can make them go brrrr
0
18
0
The joy of research is in sharing it. And asking the hard questions together. Here’s to a summer of curiosity, great conversations, and rabbit holes we didn't expect to fall into 🚀 Stay data-obsessed!
blog.datologyai.com
We're hosting a weekly data seminar series at Datology AI featuring fun and thoughtful researchers pushing the boundaries of pretraining and data curation. Are you data-obsessed yet?
0
0
5
🌞 We're excited to share our "Summer of Data Seminar" series at @datologyai! We're hosting weekly sessions with brilliant researchers diving deep into pretraining, data curation, and everything that makes datasets tick. Are you data-obsessed yet? 🤓 Thread 👇
1
8
41
RT @LucasAtkins7: We teamed up with @datologyai to build what we believe is the strongest pretraining corpus in the world—and I truly think…
0
5
0
RT @arimorcos: Congratulations to our friends and partners @arcee_ai on the release of AFM-4.5B! With data powered by @datologyai, this mo…
0
11
0
Congrats to @LucasAtkins7 and @arcee_ai on a fantastic model release! DatologyAI powers the data behind AFM-4.5B, and we're just getting started.
Our customers needed a better base model <10B parameters. We spent the last 5 months building one. I'm delighted to share a preview of our first Arcee Foundation Model: AFM-4.5B-Preview.
0
3
32
RT @gm8xx8: Datology CLIP Models. DatologyAI releases two SOTA CLIP ViT-B/32 variants: classification-optimized and retrieval-optimized, ac…
0
13
0
RT @LucasAtkins7: .@datologyai is pushing the frontier, with data curation as its standout advantage. After working closely with the team…
0
5
0
RT @RicardoMonti9: .@datologyai is back: state of the art CLIP model performance using data curation alone 🚀 ✅ state-of-the-art ViT-B/32…
0
23
0
RT @arimorcos: We couldn't agree more. If you also believe this, come work with us @datologyai to help drive frontier research and engineer…
0
4
0
RT @thao_nguyen26: 📢 Announcing our data-centric workshop at ICML 2025 on unifying data curation frameworks across domains! 📅 Deadline: Ma…
0
26
0
RT @LucasAtkins7: What an insane get for an insane team. We’ve been working with @datologyai closely and I assure you if anything they sell…
0
5
0