
Mahaoo
@mahaoo_ASI
Followers
494
Following
24K
Media
72
Statuses
3K
context assembler and precise request specifier. unhinged socially unacceptable takes about AI
Joined April 2024
There are only two important engineering problems in the world. 1) Probability density estimation in high dimensions. 2) Solving high dimensional optimization problems. Both problems are not solvable in the general case (curse of dimensionality, NP hard). But in practice,.
1
2
38
It still kinda boggles my mind that the CEO of a ~$40B robotics company has been acting like a twitter-influencer-threadboy for the last ~2 years with these posts. very weird behavior.
Significant progress in AI and Robotics this week. So, I summarized everything from OpenAI, Google, xAI, Anthropic, Figure, Unitree, OpenMind, Microsoft, Perplexity, ElevenLabs, and more. Here's everything you need to know and how to make sense out of it:.
0
0
1
It is becoming clear that openai is responding to market demands and will try to make the model nicer and more polite (and more sycophantic), etc. This is OK as long as it does not in any way harm the focus on making the models smarter or more capable. it's a classic case of users.
Wanted to provide more updates on the GPT-5 rollout and changes we are making heading into the weekend. 1. We for sure underestimated how much some of the things that people like in GPT-4o matter to them, even if GPT-5 performs better in most ways. 2. Users have very different.
0
0
5
based on the live stream, it appears to be even more incremental than expected.
Registering my GPT-5 predictions. Base case (90% confidence): incremental improvement across the board, longer and better context window, better vision (with video input), better tool use, better judgment, fewer hallucinations, etc. Basically, overnight SOTA on all axes at.
0
0
1
this is the way! deepmind for the win?
Thrilled to announce the @Kaggle Game Arena, a new leaderboard testing how modern LLMs perform on games (spoiler: not very well atm!). AI systems play each other, making it an objective & evergreen benchmark that will scale in difficulty as they improve.
1
0
2