Inference (@inference_net)
Inference Research & Development · San Francisco, CA · Joined March 2024
29K Followers · 345 Following · 99 Media · 332 Statuses
Onward.
I'm excited to announce @inference_net's $11.8M Series Seed funding round, led by @multicoincap & @a16zcrypto CSX, with participation from @topology_vc, @fdotinc, and an incredible group of angels. The next wave of AI adoption will be driven by companies building AI natively
41 replies · 21 reposts · 217 likes
50M impressions on X for Inference.net in the last 3 months. In Q1 2026 we will do ~100M impressions on X alone
16 replies · 5 reposts · 54 likes
This took a lot of trial and error to get right, particularly training the long-context summarizing models. The golden model ended up being hybrid attention, which actually unlocked the ability to process the 100M papers we will release soon
1 reply · 6 reposts · 24 likes
Due to an unforeseen naming conflict, we are renaming Project AELLA to Project OSSAS (Open Source Summaries At Scale). Thank you to those who brought the context surrounding this name to our attention, and to our partners and the research community for their ongoing support.
244 replies · 233 reposts · 5K likes
We're introducing Project AELLA, in partnership with @laion_ai & @wyndlabs_ai. AELLA is an open-science initiative to make scientific research accessible via structured summaries created by LLMs. Available now:
- Dataset of 100K summaries
- 2 fine-tuned LLMs
- 3D visualizer
107 replies · 187 reposts · 2K likes
Inference verification that's actually functional and economical. One of the coolest projects I've gotten to be a part of, shoutout to @AmarSVS for leading the charge. Many more to come!
1 reply · 1 repost · 16 likes
We've been running LOGIC on our globally distributed, permissionless inference network for the last 3 months. Today, we feel confident in saying that LOGIC provides production-ready trust for open inference networks. Learn more:
0 replies · 0 reposts · 21 likes
LOGIC verifies the statistical fingerprint of model outputs. Instead of recreating exact activations, we verify that token-level log-probability distributions match the claimed model.
- Operators provide top-k log-probs during generation
- Randomly sample decode positions and
1 reply · 0 reposts · 15 likes
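The tweet above cuts off before the scoring step, but the recipe it describes (collect top-k log-probs during decode, sample random positions, check that the distributions match the claimed model) can be sketched directly. A minimal sketch in Python; the total-variation metric, the 0.05 threshold, and every helper name here are illustrative assumptions, not LOGIC's actual internals:

```python
import math
import random

# Assumed record shape: for each decode position, the operator reports the
# top-k token log-probs it claims the model produced, e.g.
# {"top_logprobs": {token_id: logprob, ...}}

def tv_distance(p: dict, q: dict) -> float:
    """Total variation distance between two top-k log-prob dicts,
    renormalized over the union of their supports."""
    support = set(p) | set(q)
    pe = {t: math.exp(p.get(t, -30.0)) for t in support}  # missing tokens ~ prob 0
    qe = {t: math.exp(q.get(t, -30.0)) for t in support}
    zp, zq = sum(pe.values()), sum(qe.values())
    return 0.5 * sum(abs(pe[t] / zp - qe[t] / zq) for t in support)

def verify_trace(reported, reference_logprobs, n_samples=16, threshold=0.05):
    """Spot-check a decode trace. `reported` is the operator's list of
    per-position records; `reference_logprobs(i)` re-runs the claimed model
    over the same prefix and returns its top-k log-probs at position i."""
    positions = random.sample(range(len(reported)), min(n_samples, len(reported)))
    mean_dist = sum(
        tv_distance(reported[i]["top_logprobs"], reference_logprobs(i))
        for i in positions
    ) / len(positions)
    # A substituted or heavily quantized model shifts the token-level
    # distributions enough to push this statistic past the threshold.
    return mean_dist <= threshold, mean_dist
```

The economy of the scheme is that the verifier only recomputes a handful of sampled positions rather than replaying the full generation.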
How do we verify that GPU operators are running the models they claim to be? Inference providers are incentivized to minimize costs, often at the expense of their users. The primary vectors for this are:
- Model substitution
- Quantization spoofing
- Spec-decoding attacks
1 reply · 0 reposts · 12 likes
Today, we release LOGIC: A novel method for verifying LLM inference in trustless environments.
- Detects model substitution, quantization, and decode-time attacks
- Works out of the box with @vllm_project, @sgl_project, @OpenRouterAI, and more (just need logprobs)
- Robust
8 replies · 12 reposts · 76 likes
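Since the method "just needs logprobs", any OpenAI-compatible server (vLLM and SGLang both expose one) can emit the raw material. A minimal request sketch; the local base URL and model name are placeholders:

```python
from openai import OpenAI

# Placeholder endpoint: a local vLLM OpenAI-compatible server, e.g. started
# with `vllm serve meta-llama/Llama-3.1-8B-Instruct`.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Explain KV caching in one sentence."}],
    logprobs=True,     # return per-token log-probs with the completion
    top_logprobs=5,    # include the top-5 alternatives at each position
)

# Each entry is one decode position: the sampled token plus its top-k
# alternatives, i.e. exactly the fingerprint a verifier can spot-check.
for tok in resp.choices[0].logprobs.content[:3]:
    print(tok.token, [(alt.token, round(alt.logprob, 3)) for alt in tok.top_logprobs])
```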
Our testing of @nvidia's Nemotron-Nano-12B-v2 confirms ~2.5x higher token throughput over standard transformer-based models of a similar size, like Qwen3-14B. This increase holds even when fine-tuning for specific tasks. American open source FTW 🇺🇸🇺🇸🇺🇸 Full details below
1 reply · 5 reposts · 36 likes
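The ~2.5x claim is straightforward to sanity-check offline. A rough throughput comparison sketch with vLLM; the model IDs, toy prompt batch, and sampling settings are assumptions for illustration, not the benchmark setup behind the numbers above:

```python
import time
from vllm import LLM, SamplingParams

PROMPTS = ["Summarize the plot of Hamlet."] * 64   # toy batch, not a real workload
PARAMS = SamplingParams(max_tokens=512, temperature=0.0)

def tokens_per_second(model_id: str) -> float:
    # In practice run one model per process to avoid GPU memory contention.
    llm = LLM(model=model_id)
    start = time.perf_counter()
    outputs = llm.generate(PROMPTS, PARAMS)
    elapsed = time.perf_counter() - start
    generated = sum(len(o.outputs[0].token_ids) for o in outputs)
    return generated / elapsed

# Hybrid-attention Nemotron vs. a similarly sized standard transformer.
for model in ("nvidia/NVIDIA-Nemotron-Nano-12B-v2", "Qwen/Qwen3-14B"):
    print(model, f"{tokens_per_second(model):.0f} tok/s")
```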
Inference Core
- @atbeme - CTO
- @bonham_sol - founding protocol
- @AmarSVS - founding ML
- @francescodvirga - founding infra
- @0xSamHogan - CEO
19 replies · 5 reposts · 76 likes
Using custom-trained LLMs and >1k 4090s to visualize 100k scientific research papers in latent space. DM me for early access.
228 replies · 351 reposts · 4K likes
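For a rough idea of what "papers in latent space" involves mechanically, here is a minimal embed-and-project sketch. The off-the-shelf embedding model and UMAP settings are stand-in assumptions; the project above uses custom-trained LLMs:

```python
# pip install sentence-transformers umap-learn
from sentence_transformers import SentenceTransformer
import umap

summaries = [
    "Hybrid attention architecture for long-context summarization ...",
    "A method for verifying LLM inference in trustless environments ...",
    # ... one entry per paper summary
]

# Embed each summary into a high-dimensional latent vector.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
vectors = encoder.encode(summaries, normalize_embeddings=True)

# Project to 3D for a visualizer; cosine metric matches the normalized vectors.
coords = umap.UMAP(n_components=3, metric="cosine").fit_transform(vectors)
print(coords.shape)  # (n_papers, 3)
```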
Today we're releasing Schematron: 3B and 8B models that extract typed JSON from HTML with frontier-level accuracy. We built Schematron to be the ultimate workhorse for data extraction:
- 50-100x cheaper than GPT-5
- ~10x lower latency
- Both models outperform Gemini 2.5
10 replies · 14 reposts · 130 likes
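Schematron's contract, as described, is HTML plus a schema in, typed JSON out. A hedged sketch of what a call might look like through an OpenAI-compatible endpoint; the base URL, model slug, and prompt convention are assumptions, so check the real docs:

```python
import json
from openai import OpenAI

# Placeholder endpoint and model slug; consult inference.net docs for real values.
client = OpenAI(base_url="https://api.inference.net/v1", api_key="YOUR_KEY")

schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
    },
    "required": ["title", "price", "in_stock"],
}

html = "<div class='p'><h1>Widget</h1><span>$19.99</span><em>In stock</em></div>"

resp = client.chat.completions.create(
    model="inference-net/schematron-8b",  # hypothetical slug
    messages=[
        {"role": "system",
         "content": "Extract JSON matching this schema: " + json.dumps(schema)},
        {"role": "user", "content": html},
    ],
    response_format={"type": "json_object"},  # request strict JSON output
)
print(json.loads(resp.choices[0].message.content))
```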
We worked with the @calai_app team to train a custom calorie estimation model for their core user flows. Today, this custom model powers 100% of their traffic, serving millions of users worldwide. The model is:
- Better than GPT-5
- 3x faster
- 50% cheaper
Read it and weep
50 replies · 22 reposts · 716 likes
While frontier models excel at agentic search, they are prohibitively expensive and slow for such token-intensive tasks. This is a problem, since search precision tends to scale with tokens processed. The solution is small, carefully RL-trained models tailored to individual
inference.net
Thanks to the resurgence of RL, LLMs are finally able to reliably coordinate tools and reasoning to do high-precision retrieval. Companies like Happenstance, Clado, and Mintlify have already shifted to agentic search, and it's only a matter of time until anything less feels
8 replies · 4 reposts · 45 likes
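Structurally, agentic search is a short loop: the model alternates tool calls and reasoning until it can answer. A bare-bones sketch assuming an OpenAI-compatible endpoint, a hypothetical model slug, and a stub `search` tool:

```python
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.inference.net/v1", api_key="YOUR_KEY")  # placeholder

def search(query: str) -> str:
    """Hypothetical retrieval backend; swap in your index or web search."""
    return json.dumps([{"title": "stub result", "snippet": "..."}])

TOOLS = [{
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the corpus and return matching snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "Which companies adopted agentic search?"}]
for _ in range(8):  # cap the number of tool-use rounds
    resp = client.chat.completions.create(
        model="your-small-rl-tuned-model",  # hypothetical slug
        messages=messages,
        tools=TOOLS,
    )
    msg = resp.choices[0].message
    if not msg.tool_calls:       # model answered directly: done
        print(msg.content)
        break
    messages.append(msg)
    for call in msg.tool_calls:  # run each requested search, feed results back
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": search(**args),
        })
```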
Big news: I'm joining @inference_net as a Machine Learning Engineer/Researcher. Huge thanks to everyone who's been part of this journey:
• @Teknium1 my biggest inspiration, his work at Nous pushed me forward.
• @TroyQuasar for being there through the grind.
• And my whole
7 replies · 0 reposts · 42 likes