Callosum

@CallosumAI

Followers 897 · Following 35 · Media 14 · Statuses 29

The Intelligent Systems Company. Co-evolving chips & intelligence.

London, UK
Joined February 2026
@CallosumAI
Callosum
9 days
The future of compute & AI is heterogeneous. That future is being built by Callosum. Today we launch with new breakthroughs made possible by heterogeneity, new scaling principles & a roadmap reimagining compute & AI infrastructure: https://t.co/Yp1Qwt45tM
callosum.com
Callosum - Co-evolving chips & intelligence
8
29
173
@DanAkarca
Danyal Akarca
1 day
and @CallosumAI has launched in the heart of London with a $10.25m pre-seed + world-class team. We're hiring: https://t.co/ZVqOI9Nnab
@thealexbanks
Alex Banks
1 day
incredibly bullish on the future of tech + AI in London. just to name a few: • OpenAI just announced (last week) that London will become its largest research hub outside San Francisco • Anthropic kicked off a 100+ person hiring spree across London and Dublin in 2025 • xAI
4
1
49
@achterbrain
Jascha Achterberg
9 days
We are at a unique moment in time for AI & compute: New accelerators / chips, HPC hardware, and new algorithms have each made strides, but we are not yet orchestrating them as a heterogeneous stack. That is what @CallosumAI is built to do, and today we are sharing our vision 🧵
@CallosumAI
Callosum
9 days
Today we launched @CallosumAI. We are building the infrastructure where heterogeneous chips & intelligence co-evolve to solve the world's hardest problems. Today we present our first results. Across four large problem spaces, we break SOTA and deliver orders-of-magnitude
4
9
28
@CallosumAI
Callosum
9 days
Everything here is early evidence for a deeper thesis: as the problems we need to solve grow in difficulty, the systems that solve them must grow in diversity. Heterogeneous systems - diverse models on diverse hardware, co-evolved end-to-end - unlock scaling territory that
callosum.com
Callosum - Co-evolving chips & intelligence
1
1
10
@CallosumAI
Callosum
9 days
None of these results came from a bigger model. 12× cheaper deep context. New web SOTA with open-source, 3× cheaper and faster. 2.4× cache speedups. 1,767× faster tool calling. All from heterogeneity - mixed models, mixed chips, mixed scales - co-evolved end-to-end.
1
0
9
@CallosumAI
Callosum
9 days
This changes what small models can do. 8 × 1B models generating grammar-constrained candidate tool-calls with a naive voting scheme: 42.27% accuracy on structured data extraction - a +11 point improvement over a single greedy pass from the same model and +2 points over an 8B
1
0
8
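The naive voting step described above can be sketched in a few lines (an illustration only: the candidate strings and the tie-breaking rule are assumptions, not Callosum's implementation):

```python
from collections import Counter

def vote(candidates):
    """Majority vote over grammar-constrained candidate tool-calls.

    Ties break toward the earliest-seen candidate, since Counter
    preserves insertion order among equal counts.
    """
    return Counter(candidates).most_common(1)[0][0]

# Hypothetical samples from 8 x 1B models on one extraction query.
samples = [
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "name"}',
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "sku"}',
    '{"tool": "extract", "field": "price"}',
    '{"tool": "extract", "field": "name"}',
    '{"tool": "extract", "field": "price"}',
]
print(vote(samples))  # the "price" call wins 5-3
```

Because every candidate is grammar-constrained, each sample is already a syntactically valid tool-call, so the vote only has to resolve semantic disagreement.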
@CallosumAI
Callosum
9 days
We moved the entire operation on-die on @awscloud Inferentia2. JSON schemas compile into finite state machines, a custom NKI kernel performs constrained decoding entirely in NeuronCore SBUF. The mask lives in on-chip SRAM, right alongside the logits. O(1) scaling. 1.4μs at
1
0
7
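A toy version of the FSM-masked decoding idea (a hand-written transition table standing in for a schema-compiled FSM; the real system runs this as an on-chip NKI kernel, which this Python sketch does not attempt to model):

```python
import math

# Toy vocabulary and an FSM accepting only "yes" or "no" -
# a stand-in for an FSM compiled from a JSON schema.
VOCAB = ["y", "e", "s", "n", "o", "<eos>"]

# state -> {token: next_state}; compiled offline in the real system.
FSM = {
    0: {"y": 1, "n": 4},
    1: {"e": 2},
    2: {"s": 3},
    3: {"<eos>": -1},
    4: {"o": 5},
    5: {"<eos>": -1},
}

def mask_logits(logits, state):
    """Set logits of tokens with no FSM transition to -inf.

    O(1) in sequence length: only the current state's transition
    table is consulted, never the decoded history - the property
    that lets the mask live next to the logits in on-chip memory.
    """
    allowed = FSM.get(state, {})
    return [l if t in allowed else -math.inf
            for t, l in zip(VOCAB, logits)]
```

Sampling from the masked logits can then never leave the grammar: invalid continuations carry zero probability by construction.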
@CallosumAI
Callosum
9 days
Finally, Tool Calling Problems. Every class of problem above shares a common dependency: tool calling. It's how agents act on the world. Get it wrong and performance breaks. Get it right but too slowly, and the economics break instead. The bottleneck: grammar enforcement is
1
0
7
@CallosumAI
Callosum
9 days
This already extends to real workflows. 20% speedup out of the box on a podcast generation task with large system prompts - before any deeper optimisations. And eviction is just the beginning. Topology-awareness enables pre-fetching context before it's needed, hierarchical
1
0
9
@CallosumAI
Callosum
9 days
We replace heuristic eviction with topology-aware cache management: evict the node furthest from future use. Provably optimal (Bélády, 1966). In a 6-agent loop with capacity for 5: LRU evicts the next node needed - 6 cache misses on the second iteration. Ours: one. Up to 2.4×
1
0
10
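The 6-agent, capacity-5 example above can be reproduced with a small simulation (a sketch of Bélády's farthest-in-future policy versus LRU, not Callosum's topology-aware implementation):

```python
def belady_misses(accesses, capacity):
    """Misses under Belady's optimal policy: on a full cache,
    evict the resident item whose next use is furthest away."""
    cache, misses = set(), 0
    for i, x in enumerate(accesses):
        if x in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            def next_use(item):
                # Index of the item's next access, or infinity.
                for j in range(i + 1, len(accesses)):
                    if accesses[j] == item:
                        return j
                return float("inf")
            cache.discard(max(cache, key=next_use))
        cache.add(x)
    return misses

def lru_misses(accesses, capacity):
    """Misses under LRU eviction (front of list = least recent)."""
    cache, misses = [], 0
    for x in accesses:
        if x in cache:
            cache.remove(x)
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.pop(0)
        cache.append(x)
    return misses

# 6 agents accessed in a loop, cache capacity 5.
trace = [a % 6 for a in range(18)]  # three iterations over agents 0..5
# LRU thrashes: every access misses (18/18).
# Belady: 6 cold misses, then one miss per iteration (8 total).
print(lru_misses(trace, 5), belady_misses(trace, 5))
```

On a cyclic trace LRU always evicts exactly the item needed next, which is why a topology-aware policy that can see future use beats any recency heuristic here.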
@CallosumAI
Callosum
9 days
We then moved to Cache-Intensive Problems. As heterogeneous systems scale - more models, more steps, more branching - redundant computation compounds. Every inference call that recomputes tokens the system has already seen is wasted work. Today's eviction policies (LRU, LFU)
1
0
9
@CallosumAI
Callosum
9 days
Our infra identified that GPT-5.2 struggled with reliable coordinate localisation - frequently selecting wrong locations even with assistance. These are moments where the system must actively engage its environment rather than reason about it. Precisely where heterogeneity pays
1
0
9
@CallosumAI
Callosum
9 days
Heterogeneity shifts the Pareto frontier and it's model-agnostic. Pairing Qwen3-VL-8B with GPT-5.2: 3.7× cheaper (~$0.22 vs ~$0.83 per task), 3× faster. With Kimi-K2.5: same pattern. The gains come from two sources: cheaper per-step inference and fewer total steps through more
1
0
8
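One way to read the ~3.7× figure above (the step counts and per-step prices below are invented for illustration; only the ~$0.83 and ~$0.22 per-task totals come from the thread):

```python
def task_cost(steps, cost_per_step):
    """Total cost of an agentic task: steps taken x price per inference step."""
    return steps * cost_per_step

# Hypothetical decomposition: the gain can come from cheaper
# per-step inference AND from fewer total steps.
frontier_only = task_cost(steps=10, cost_per_step=0.083)   # ~$0.83/task
heterogeneous = task_cost(steps=8, cost_per_step=0.0275)   # ~$0.22/task
ratio = frontier_only / heterogeneous
print(round(ratio, 2))  # ~3.77, consistent with the quoted ~3.7x
```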
@CallosumAI
Callosum
9 days
But the benchmark is the starting point, not the destination. The system already generalises to problems it has never seen. We asked it to identify a robot across two images, find it on Amazon UK, compare prices on OnBuy, and purchase it. Live websites, multi-modal reasoning,
2
1
10
@CallosumAI
Callosum
9 days
We hit new SOTA on VisualWebArena shopping tasks - 1.18× SGV and 1.25× WALT (both ICLR 2026). 66% pass rate. Using only open-source vision-language-action models (Kimi-K2.5 + Qwen3-VL-8B-Instruct). Zero frontier API calls. Every prior system on this benchmark relied on
1
0
9
@CallosumAI
Callosum
9 days
Next, harder environments - Open Web Problems. The internet is a much more complex, open-ended environment. A single task can require visual perception, text comprehension, long-horizon planning, precise spatial targeting, and real-time adaptation to a live interface. No amount
1
0
10
@CallosumAI
Callosum
9 days
In practice: our partners @coworkerapp, whose autonomous agents handle millions of tokens within complex enterprise workflows, generate status reports from raw activity logs - long, noisy contexts where signals are sparse and the workflow demands retrieval, attribution,
1
0
10
@CallosumAI
Callosum
9 days
The partitioning matters more than the raw capability of any single component. We unlock a configuration space that single-model systems can't access - many configurations achieve comparable accuracy, but at vastly different price and speed points. For example, Cerebras
1
0
13
@CallosumAI
Callosum
9 days
We start with Deep Context Problems - ubiquitous in production AI: sifting dynamic information, sustained reasoning over databases, selecting task-relevant signals at the timescale the task demands. A single workflow here contains fundamentally different computation: rapid
3
1
13
@CallosumAI
Callosum
9 days
Why @pluralplatform invested in Callosum - by @soundboy. Read here: https://t.co/eA50t7KFBr
0
3
16