Callosum

@CallosumAI

Followers 897 · Following 35 · Media 14 · Statuses 29

The Intelligent Systems Company. Co-evolving chips & intelligence.

London, UK
Joined February 2026
@CallosumAI
Callosum
9 days
The future of compute & AI is heterogeneous. That future is being built by Callosum. Today we launch with new breakthroughs made possible by heterogeneity, new scaling principles & a roadmap reimagining compute & AI infrastructure: https://t.co/Yp1Qwt45tM
callosum.com
Callosum - Co-evolving chips & intelligence
8
29
173
@DanAkarca
Danyal Akarca
1 day
and @CallosumAI has launched in the heart of London with a $10.25m pre-seed + world-class team. We're hiring: https://t.co/ZVqOI9Nnab
@thealexbanks
Alex Banks
1 day
incredibly bullish on the future of tech + AI in London. just to name a few: • OpenAI just announced (last week) that London will become its largest research hub outside San Francisco • Anthropic kicked off a 100+ person hiring spree across London and Dublin in 2025 • xAI
4
1
49
@achterbrain
Jascha Achterberg
9 days
We are at a unique moment in time for AI & compute: New accelerators / chips, HPC hardware, and new algorithms have each made strides, but we are not yet orchestrating them as a heterogeneous stack. That is what @CallosumAI is built to do, and today we are sharing our vision 🧵
@CallosumAI
Callosum
9 days
Today we launched @CallosumAI. We are building the infrastructure where heterogeneous chips & intelligence co-evolve to solve the world's hardest problems. Today we present our first results. Across four large problem spaces, we break SOTA and deliver orders-of-magnitude
4
9
28
@CallosumAI
Callosum
9 days
Everything here is early evidence for a deeper thesis: as the problems we need to solve grow in difficulty, the systems that solve them must grow in diversity. Heterogeneous systems - diverse models on diverse hardware, co-evolved end-to-end - unlock scaling territory that
callosum.com
Callosum - Co-evolving chips & intelligence
1
1
10
@CallosumAI
Callosum
9 days
None of these results came from a bigger model. 12× cheaper deep context. New web SOTA with open-source, 3× cheaper and faster. 2.4× cache speedups. 1,767× faster tool calling. All from heterogeneity - mixed models, mixed chips, mixed scales - co-evolved end-to-end.
1
0
9
@CallosumAI
Callosum
9 days
This changes what small models can do. 8 × 1B models generating grammar-constrained candidate tool-calls with a naive voting scheme: 42.27% accuracy on structured data extraction - a +11 point improvement over a single greedy pass from the same model and +2 points over an 8B
1
0
8
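The naive voting step described above can be sketched in a few lines (an illustration only: the candidate strings and the tie-breaking rule are assumptions, not Callosum's implementation):

```python
from collections import Counter

def vote(candidates):
    """Majority vote over grammar-constrained candidate tool-calls.

    Ties break toward the earliest-seen candidate, since Counter
    preserves insertion order among equal counts.
    """
    return Counter(candidates).most_common(1)[0][0]

# Hypothetical samples from 8 x 1B models on one extraction query.
samples = [
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "name"}',
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "sku"}',
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "name"}',
    '{"tool": "extract", "field": "price"}',
]
print(vote(samples))  # the "price" call wins 5-3
```

Because every candidate is grammar-constrained, each sample is already a syntactically valid tool-call, so the vote only has to resolve semantic disagreement.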
@CallosumAI
Callosum
9 days
We moved the entire operation on-die on @awscloud Inferentia2. JSON schemas compile into finite state machines, a custom NKI kernel performs constrained decoding entirely in NeuronCore SBUF. The mask lives in on-chip SRAM, right alongside the logits. O(1) scaling. 1.4μs at
1
0
7
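A toy version of the FSM-masked decoding idea (a hand-written transition table standing in for a schema-compiled FSM; the real system runs this as an on-chip NKI kernel, which this Python sketch does not attempt to model):

```python
import math

# Toy vocabulary and an FSM accepting only "yes" or "no" -
# a stand-in for an FSM compiled from a JSON schema.
VOCAB = ["y", "e", "s", "n", "o", "<eos>"]

# state -> {token: next_state}; compiled offline in the real system.
FSM = {
    0: {"y": 1, "n": 4},
    1: {"e": 2},
    2: {"s": 3},
    3: {"<eos>": -1},
    4: {"o": 5},
    5: {"<eos>": -1},
}

def mask_logits(logits, state):
    """Set logits of tokens with no FSM transition to -inf.

    O(1) in sequence length: only the current state's transition
    table is consulted, never the decoded history - the property
    that lets the mask live next to the logits in on-chip memory.
    """
    allowed = FSM.get(state, {})
    return [l if t in allowed else -math.inf
            for t, l in zip(VOCAB, logits)]
```

Sampling from the masked logits can then never leave the grammar: invalid continuations carry zero probability by construction.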
@CallosumAI
Callosum
9 days
Finally, Tool Calling Problems. Every class of problem above shares a common dependency: tool calling. It's how agents act on the world. Get it wrong and performance breaks. Get it right but too slowly, and the economics break instead. The bottleneck: grammar enforcement is
1
0
7
@CallosumAI
Callosum
9 days
This already extends to real workflows. 20% speedup out of the box on a podcast generation task with large system prompts - before any deeper optimisations. And eviction is just the beginning. Topology-awareness enables pre-fetching context before it's needed, hierarchical
1
0
9
@CallosumAI
Callosum
9 days
We replace heuristic eviction with topology-aware cache management: evict the node furthest from future use. Provably optimal (Bélády, 1966). In a 6-agent loop with capacity for 5: LRU evicts the next node needed - 6 cache misses on the second iteration. Ours: one. Up to 2.4×
1
0
10
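The 6-agent, capacity-5 example above can be reproduced with a small simulation (a sketch of Bélády's farthest-in-future policy versus LRU, not Callosum's topology-aware implementation):

```python
def belady_misses(accesses, capacity):
    """Misses under Belady's optimal policy: on a full cache,
    evict the resident item whose next use is furthest away."""
    cache, misses = set(), 0
    for i, x in enumerate(accesses):
        if x in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(item):
                # Index of the item's next access, or infinity.
                for j in range(i + 1, len(accesses)):
                    if accesses[j] == item:
                        return j
                return float("inf")
            cache.discard(max(cache, key=next_use))
        cache.add(x)
    return misses

def lru_misses(accesses, capacity):
    """Misses under LRU eviction (front of list = least recent)."""
    cache, misses = [], 0
    for x in accesses:
        if x in cache:
            cache.remove(x)
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.pop(0)
        cache.append(x)
    return misses

# 6 agents accessed in a loop, cache capacity 5.
trace = [a % 6 for a in range(18)]  # three iterations over agents 0..5
# LRU thrashes: every access misses (18/18).
# Belady: 6 cold misses, then one miss per iteration (8 total).
print(lru_misses(trace, 5), belady_misses(trace, 5))
```

On a cyclic trace LRU always evicts exactly the item needed next, which is why a topology-aware policy that can see future use beats any recency heuristic here.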
@CallosumAI
Callosum
9 days
We then moved to Cache-Intensive Problems. As heterogeneous systems scale - more models, more steps, more branching - redundant computation compounds. Every inference call that recomputes tokens the system has already seen is wasted work. Today's eviction policies (LRU, LFU)
1
0
9
@CallosumAI
Callosum
9 days
Our infra identified that GPT-5.2 struggled with reliable coordinate localisation - frequently selecting wrong locations even with assistance. These are moments where the system must actively engage its environment rather than reason about it. Precisely where heterogeneity pays
1
0
9
@CallosumAI
Callosum
9 days
Heterogeneity shifts the Pareto frontier and it's model-agnostic. Pairing Qwen3-VL-8B with GPT-5.2: 3.7× cheaper (~$0.22 vs ~$0.83 per task), 3× faster. With Kimi-K2.5: same pattern. The gains come from two sources: cheaper per-step inference and fewer total steps through more
1
0
8
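One way to read the ~3.7× figure above (the step counts and per-step prices below are invented for illustration; only the ~$0.83 and ~$0.22 per-task totals come from the thread):

```python
def task_cost(steps, cost_per_step):
    """Total cost of an agentic task: steps taken x price per inference step."""
    return steps * cost_per_step

# Hypothetical decomposition: the gain can come from cheaper
# per-step inference AND from fewer total steps.
frontier_only = task_cost(steps=10, cost_per_step=0.083)   # ~$0.83/task
heterogeneous = task_cost(steps=8, cost_per_step=0.0275)   # ~$0.22/task
ratio = frontier_only / heterogeneous
print(round(ratio, 2))  # ~3.77, consistent with the quoted ~3.7x
```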
@CallosumAI
Callosum
9 days
But the benchmark is the starting point, not the destination. The system already generalises to problems it has never seen. We asked it to identify a robot across two images, find it on Amazon UK, compare prices on OnBuy, and purchase it. Live websites, multi-modal reasoning,
2
1
10
@CallosumAI
Callosum
9 days
We hit new SOTA on VisualWebArena shopping tasks - 1.18× SGV and 1.25× WALT (both ICLR 2026). 66% pass rate. Using only open-source vision-language-action models (Kimi-K2.5 + Qwen3-VL-8B-Instruct). Zero frontier API calls. Every prior system on this benchmark relied on
1
0
9
@CallosumAI
Callosum
9 days
Next, harder environments - Open Web Problems. The internet is a much more complex, open-ended environment. A single task can require visual perception, text comprehension, long-horizon planning, precise spatial targeting, and real-time adaptation to a live interface. No amount
1
0
10
@CallosumAI
Callosum
9 days
In practice: our partners @coworkerapp, whose autonomous agents handle millions of tokens within complex enterprise workflows, generate status reports from raw activity logs - long, noisy contexts where signals are sparse and the workflow demands retrieval, attribution,
1
0
10
@CallosumAI
Callosum
9 days
The partitioning matters more than the raw capability of any single component. We unlock a configuration space that single-model systems can't access - many configurations achieve comparable accuracy, but at vastly different price and speed points. For example, Cerebras
1
0
13
@CallosumAI
Callosum
9 days
We start with Deep Context Problems - ubiquitous in production AI: sifting dynamic information, sustained reasoning over databases, selecting task-relevant signals at the timescale the task demands. A single workflow here contains fundamentally different computation: rapid
3
1
13
@CallosumAI
Callosum
9 days
Why @pluralplatform invested in Callosum - by @soundboy. Read here: https://t.co/eA50t7KFBr
0
3
16