Mohammed Alshehri
@SwishMoe
Followers
159
Following
6K
Media
446
Statuses
1K
23 | Applied RL, Post Training → building and learning
London/Riyadh
Joined July 2017
The Genesis Mission has launched! Big day for American Science and AI.
411
820
6K
Introducing INTELLECT-3: Scaling RL to a 100B+ MoE model on our end-to-end stack
Achieving state-of-the-art performance for its size across math, code and reasoning
Built using the same tools we put in your hands, from environments & evals, RL frameworks, sandboxes & more
88
176
1K
Huge step forward. Open access to frontier-scale MoE infrastructure at the 100B+ level is going to shift how the field moves. Just grabbed the report and excited to dig into Prime-RL. @PrimeIntellect
1
0
3
Key notes from the @ilyasut podcast:
Scaling model size alone is no longer enough to drive major AI progress
Future breakthroughs depend on new research ideas, not more data and compute
Current models still generalize far worse than humans
Compute should enable rapid
0
0
0
Understand the code generated by the AI. Understand the code generated by the AI. Understand the code generated by the AI. Understand the code generated by the AI. Understand the code generated by the AI.
Stop using AI to code. Stop using AI to code. Stop using AI to code. Stop using AI to code. Stop using AI to code. Stop using AI to code. Stop using AI to code. Stop using AI to code.
0
0
1
AI has alternated between research and scaling from the beginning:
1950s: Research (early days)
1960s: Scaling (search)
1970s: Research (knowledge representation)
1980s: Scaling (expert systems, Cyc)
1990s: Research (machine learning)
2000s: Scaling (data mining)
2010s: Research
38
52
458
The @ilyasut episode
0:00:00 – Explaining model jaggedness
0:09:39 – Emotions and value functions
0:18:49 – What are we scaling?
0:25:13 – Why humans generalize better than models
0:35:45 – Straight-shotting superintelligence
0:46:47 – SSI’s model will learn from deployment
390
1K
8K
> born in Chicago
> started coding as a kid
> got his first computer at 8.. changed everything
> studied at Stanford (dropped out to build things)
> built Loopt at 19.. raised $30M
> became president of Y Combinator
> helped launch and mentor companies like Airbnb, Stripe,
40
34
528
We have clear-cut SOTA on Neural MMO 3, our hardest RL task, with 650B training steps (>1 PB of observations per run). Flop-matched and parameter-matched. The issue? To make this useful, I'm competing with the cuDNN LSTM for perf, and this net requires several kernels. (A timing sketch of that comparison follows below.)
4
5
149
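Context for the cuDNN comparison above: a minimal timing sketch, assuming PyTorch on a CUDA device, with sizes that are arbitrary and purely illustrative. On GPU, torch.nn.LSTM dispatches to cuDNN's fused kernel, while an equivalent LSTMCell loop pays per-timestep kernel-launch overhead; that gap is roughly what a multi-kernel custom net has to beat.

import time
import torch

batch, seq_len, hidden = 64, 256, 512  # illustrative sizes only
x = torch.randn(seq_len, batch, hidden, device="cuda")

fused = torch.nn.LSTM(hidden, hidden).cuda()     # dispatches to cuDNN's fused kernel on GPU
cell = torch.nn.LSTMCell(hidden, hidden).cuda()  # unfused: one step per kernel launch

def run_fused():
    out, _ = fused(x)
    torch.cuda.synchronize()

def run_unfused():
    h = torch.zeros(batch, hidden, device="cuda")
    c = torch.zeros(batch, hidden, device="cuda")
    for t in range(seq_len):
        h, c = cell(x[t], (h, c))
    torch.cuda.synchronize()

with torch.no_grad():
    for name, fn in [("cuDNN LSTM", run_fused), ("LSTMCell loop", run_unfused)]:
        fn()  # warm-up pass so the timed pass excludes one-time setup
        t0 = time.time()
        fn()
        print(f"{name}: {time.time() - t0:.4f}s")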
Bullshit. Do it simpler. Do it smarter. Do it faster. Do it better. Do it cheaper. Do it with AI. Do it in a new country. Do it with better design. Do it automated. Do it personalized. Do it with a social mission. Do it sustainably. Do it w/ top-notch service. Do it on
115
161
2K
Codex achieved a milestone: the only threat to Claude in the market.
It has been amazing to watch the progress of the Codex team; they are beasts. The product/model is already so good and will get much better; I believe they will create the best and most important product in the space, and enable so much downstream work.
0
0
0
Without math, no cryptography. Without math, no artificial intelligence. Without math, no data science. Without math, no computer science. Without math, no physics. Without math, no engineering. Without math, no electronics. Without math, no astronomy. Without math, no
358
684
5K
Back in Riyadh, fully locked in and ready to push this product to production.
1
0
3
Robotics powered by LLMs isn’t sci-fi anymore. Machines that understand, adapt, and learn in real time. This is the space I’m betting on.
0
0
1