Philip Monk
@pcmonk
Followers
2K
Following
2K
Media
73
Statuses
2K
A man alive, walking on two legs about the world. Infra lead @essential_ai
San Francisco, CA
Joined January 2012
.@essential_ai's rnj-1 model is now on Ollama! ollama run rnj-1. 8B parameter, open-weight dense model trained from scratch. The model is optimized for code and STEM with capabilities on par with other state of the art open-weight models. Let's go!
9
33
244
It's open weights and a very convenient size to run locally, btw. I get 20 tok/s on an M3 mac with llama.cpp.
0
0
6
It's been a blast to lead the infrastructure effort to train this model. I'm excited to see it out in the world!
We are beyond thrilled to share our first flagship models, Rnj-1 base and instruct 8B parameter models. Rnj-1 is the culmination of 10 months of hard work by a phenomenal team, dedicated to advancing American SOTA OSS AI. Lots of wins with Rnj-1. 1. SWE bench performance close
1
0
13
Today, we're excited to introduce Rnj-1, @essential_ai's first open model; a world-class 8B base + instruct pair, built with scientific rigor, intentional design, and a belief that the advancement and equitable distribution of AI depend on building in the open. We bring
37
153
1K
You all have it so easy today with your petaflop gpus. In my day we had *floppy disks* that could only handle a few hundred kiloflops/s
1
0
0
[1/2] We at Essential are driven by a mission to advance fundamental research guided by first principles, rigor and sharing research openly.
1
10
31
If you're a PL guy who engages seriously with the problem the answer you come to is Jax and/or torch.compile, not rust
1
0
6
Protip: function in the presence of uncertainty in your own mind. Your tweets will be worse but you will be more aligned with reality
Complete certainty is impossible. So for any belief it's always possible to be wrong. The Sun might not rise tomorrow. Yet at some point, in order to function we must round the chance of error to zero on many beliefs. On what principled basis might we decide to do this?
0
0
1
[1/5] Meet Essential-Web v1.0, a 24-trillion-token pre-training dataset with rich metadata built to effortlessly curate high-performing datasets across domains and use cases!
12
54
301
We've always been reaching for the glory of the previous generation
1
0
2
Leaving aside the main point of the post, I think people underestimate the degree to which the previous generation was influenced by stories of the one before that. There are more total stories now, but the number that affects any particular person is probably not any higher
that jony and sam video has me thinking about something - i'll try to explain it the previous generation of silicon valley there were not yet too many stories of silicon valley people did the things they were trying to do, went through twists and turns and then later the
1
0
3
This was a super interesting project to work on. It showed up when we started using muon at larger scales.
Muon is a serious competitor to AdamW, but it's tricky to scale up. Our infra team has made fundamental advancements in parallelizing Muon on large scale distributed clusters. We're extremely happy with the result and it's now a part of our pretraining pipeline. Check out
1
0
5
Avoiding spoilers is pretty difficult in general, but I was pleasantly surprised this just works
1
0
4
I don't know who needs to hear it, but for 100s of years there has been a significant tribe that believes basically this. Something in their worldview hits inf or nan and they start believing the end of the world is in the next 10 years. Reject it, don't let it make you impotent.
1
3
22
I'm more pro-civilization than anything else. But even though the Louvre is a valuable part of our civilization, it's much less valuable than what generally efficient allocation of capital gives us.
0
0
1
This and the related thread is a good example of precisely where I differ from a lot of trads. I don't actually think the Colossus or Louvre was/is more interesting than what finance and tech has brought us.
I guess my own problem is that I think the most interesting things in the world were a result of *inefficient* allocation of capital. You don't build the Colossus of Rhodes or the Louvre - or almost everything inside of the Louvre - by expecting a return on investment.
1
0
3