Antigma
@antigma_labs
Followers: 789
Following: 145
Media: 9
Statuses: 69
building substrate for self-organizing intelligence.
California, USA
Joined July 2024
Before a detailed write-up, here are key findings: 1. Implementation Matters. Ante, built with native Rust for security, performance, and resistance to AI-generated slop, validates our assumption that engineering quality is crucial, showing significant accuracy improvements and
Claude Sonnet 4.5 is indeed an amazing upgrade 🥳! Thanks @AnthropicAI! We are #1 on Terminal Bench now. In the coming days, we will share what we learned while building Ante, and how we initially only used Terminal Bench as a baseline eval but discovered something more
5
0
9
Be Monk. @NoCommas Physics and Geomatics background. Discovered CS through satellite neural nets. Monk scaled the original open distributed system (DNS) at AWS & Azure. Then he moved to Meta & Mysten to build the new coordination layer. "A Turing-complete mind wandering in a
3
9
63
As our yearly Thanksgiving tradition, we are glad to provide the OSS community with a tiny GPT-style cognitive core built in pure Rust. This is a Rust implementation of @karpathy's great nanoGPT. It is a token of gratitude to the community as well as a significant step toward our
6
16
312
Substrate for self-organizing intelligence starts from self-hosting!
To be self-organizing, you need to be self-sustaining; To be self-sustaining, you need to be self-hosting. Honored to help on the cause and thanks for inviting me @VoidAsuka @Gradient_HQ
1
0
6
How did I do it? I simply ran:
```
ante "rebuilt karpathy/nanochat in pure Rust, make no mistake"
```
0
1
3
“It is not you, it is the model” and yes, “implementation of agent” matters
Codex degradation report TLDR: - The investigation found no single root cause; rather a mix of behavior shifts and small bugs, with several fixes already shipped and more on the way. - Older hardware underperformed in evals and was removed, and improved load balancing is
0
1
3
Model is just dead weights (pun intended), long live the agents
Agency > Intelligence I had this intuitively wrong for decades, I think due to a pervasive cultural veneration of intelligence, various entertainment/media, obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency? Are
1
1
5
Wondering where we got the inspiration for our website's animation? It is from a small experiment we did earlier with Neural Cellular Automata. Read more below:
9
29
342
3. It's not you, it's the model. LLM APIs differ from traditional services. Quality degradation isn't just latency or errors; content quality is harder to track (e.g., https://t.co/DWXkorirTQ). We noticed this Anthropic event via evaluation, reinforcing our commitment to
anthropic.com
This is a technical report on three bugs that intermittently degraded responses from Claude. Below we explain what happened, why it took time to fix, and what we're changing.
1
0
3
2. Errors Compound. This relates to the first point: our high standards pay off, even if the gains look small at first. Each Terminal Bench task must be completed within a fixed time limit, which is crucial for real-world use as a productivity tool. Getting it right early saves time and tokens.
1
0
1
Claude Sonnet 4.5 is indeed an amazing upgrade 🥳! Thanks @AnthropicAI! We are #1 on Terminal Bench now. In the coming days, we will share what we learned while building Ante, and how we initially only used Terminal Bench as a baseline eval but discovered something more
2
1
17
Records are made to be broken; we just achieved 54.8% on Sonnet 4. Fixed it for you 🫡
Droid has reached #1 on Terminal-Bench, the most challenging general software development benchmark, outperforming popular tools like Claude Code and Codex CLI. Terminal Bench goes beyond just coding and evaluates agents on a broader set of tasks to modernize legacy code, debug
2
2
7