Essential AI

@essential_ai

Followers
4K
Following
37
Media
9
Statuses
45

Our mission is to deepen the partnership between humans and computers, unlocking collaborative capabilities that far exceed what could be achieved today.

San Francisco, CA
Joined August 2023
@essential_ai
Essential AI
28 days
Why run the same race when we can pioneer our own path? That's how we approach AI, by taking big bets and pushing on the foundations of AI 💥. Check out @ashVaswani's recent interview with @EconomicTimes
5
8
79
@essential_ai
Essential AI
18 days
RT @KarimBhalwani: Wonderful moderating a fireside chat with @ashVaswani, the co-creator of the Transformer, at @AMD's Advancing AI event.….
0
1
0
@essential_ai
Essential AI
23 days
RT @AIatAMD: Smarter Data Starts Here! Meet Essential-Web v1.0 — 24T tokens, fully annotated, filterable in seconds. No scraping. No pipe….
0
11
0
@essential_ai
Essential AI
26 days
[5/5]. We’d like to hear your thoughts. Please reach out at research@essential.ai. Papers included in the post:
1
1
17
@essential_ai
Essential AI
26 days
[4/5]. This doesn't conclude that Muon or second-order methods in general don't grok faster than AdamW, but that we didn't discover any clear pattern. We look forward to hearing what the community has to say about our findings.
1
0
19
@essential_ai
Essential AI
26 days
[3/5]. When expanding our hyperparameter search space to Transformer base embedding dimensions and training batch sizes, Muon doesn't have any distinct advantage. AdamW and Muon outcompete each other in different scenarios.
1
0
15
@essential_ai
Essential AI
26 days
[2/5]. We expected (and hoped) that Muon would grok faster on the classic modular division task, given our positive results in [Shah et al., 2025]. Contemporaneous work [Tveit et al.] suggests this to be the case. Check out what we found:
1
1
15
@essential_ai
Essential AI
26 days
[1/5]. We have a quick update to share, which contradicts our hypothesis regarding the abilities of Muon and Adam vis-a-vis Grokking.
5
21
120
@essential_ai
Essential AI
28 days
0
0
3
@essential_ai
Essential AI
29 days
RT @RitvikKapila: #1 trending on @huggingface letsgoooo! @essential_ai 🥇
0
13
0
@essential_ai
Essential AI
1 month
[5/5]. model and data: code:
2
3
28
@essential_ai
Essential AI
1 month
[4/5]. We walk through each step of the design process—category metrics, the distillation pipeline, and multi-domain evaluations—in the paper.
1
2
22
@essential_ai
Essential AI
1 month
[3/5]. To evaluate Essential-Web v1.0, we curate domain-specific datasets, ranging in size from 29B to 1.7T tokens, using lightweight metadata filters. We then anneal a 2.3B-parameter Transformer on these datasets along with top-performing, web-based datasets for each domain. On.
1
2
21
@essential_ai
Essential AI
1 month
[2/5]. Paper link: We label 23.6B documents from Common Crawl with a 12-category taxonomy using our distilled model, EAI-Distill-0.5B. On held-out evaluation sets, its annotator agreement with our reference annotators, GPT-4o and Claude Sonnet 3.5, is
2
4
33
@essential_ai
Essential AI
1 month
[1/5]. 🚀 Meet Essential-Web v1.0, a 24-trillion-token pre-training dataset with rich metadata built to effortlessly curate high-performing datasets across domains and use cases!
11
54
297
@essential_ai
Essential AI
1 month
RT @KarimBhalwani: I had the privilege of learning about Transformers in @DivGarg_’s Stanford class from none other than @ashVaswani, the c….
0
4
0
@essential_ai
Essential AI
1 month
RT @essential_ai: @ashVaswani will be joining @KarimBhalwani on stage tomorrow at 1 PM at @AMD’s #AdvancingAI conference. Get ready for an….
0
8
0
@essential_ai
Essential AI
1 month
RT @ForbesIndia: #30IndianMindsInAI: @essential_ai, co-founder @ashVaswani says, focuses on solving humanity’s enormous challenges through….
0
1
0
@essential_ai
Essential AI
1 month
RT @Accel_India: 1/ “Steve Jobs famously spoke about blending liberal arts and technology. Leadership that sets clear research convictions….
0
1
0
@essential_ai
Essential AI
2 months
Muon is a serious competitor to AdamW, but it's tricky to scale up. Our infra team has made fundamental advancements in parallelizing Muon on large-scale distributed clusters. We're extremely happy with the result and it's now a part of our pretraining pipeline. 🔗 Check out.
0
16
106