DeepInfra
@DeepInfra
Followers: 4K · Following: 132 · Media: 45 · Statuses: 486
Fast ML inference. Run top AI models using a simple API.
Palo Alto
Joined February 2023
Here is the final result. When I ran the eval, only three providers had correctly implemented reasoning efforts on OR: @DeepInfra, @FireworksAI_HQ, @Google
@CrusoeAI's (medium) performs like high, what is going on there?! Total cost: $44, the eval ran for 15 hours
what the actual fuck is even happening. reminder: scores are ~80 (high), ~73 (med), ~68 (low). dunno what those providers even do 🤦‍♂️
Nice upgrade. Embeddings unlock better search, memory, and agents. DeepInfra is ready for high-throughput pipelines; DMs open for help picking a model.
📣 HUGE shoutout to our friends at @DeepInfra You can find all the embedding models here
Now live: Kimi K2 Thinking on DeepInfra. @Kimi_Moonshot's most capable open “thinking” model built for complex reasoning and planning. Best price as usual: $0.55 in / $2.50 out.
check out the new model here
deepinfra.com
Low pay-as-you-go pricing. No long-term contracts. Simple APIs. Scale to trillions of tokens. 100+ AI models.
These numbers are amazing for an open-source model. We're working on bringing this model up for Perplexity users with our own deployment in US data centers.
A bit late, but we had fun at the @nvidia GTC conference in Washington, DC! We even had a celeb sighting 👀 It was great to meet new people, see friendly faces, and learn more about the latest and greatest in AI #jensenhuang #inception
Love this update from the @Kimi_Moonshot team. Tool calling matters a ton for agentic AI, and DeepInfra is proud to hold the top spot after the official provider, with 100% accuracy. We’re committed to best-in-class quality and service, as always at the best price!
Kimi K2vv updated! We've added case-by-case statistics for ToolCall-Trigger Similarity and ToolCall-Schema Accuracy. Feedback is welcome! https://t.co/MvvyAhlO0I
Now live: NVIDIA Nemotron Nano 12B VL on DeepInfra: multimodal (VL + OCR), agent-ready. $0.20 in / $0.60 out per Mtoken, best price as usual.
Thrilled to be a Day 0 partner with @nvidia. The Nemotron vision-language model is now served on DeepInfra. More details here: https://t.co/Ws0KPBOsNh
We’re the first to host the newest OCR model by @allen_ai - olmOCR-2-7B-1025, now live on DeepInfra. $0.14 in / $0.80 out. Turn PDFs & scans into clean text with tables, equations, handwriting & more.
We’re updating olmOCR, our model for turning PDFs & scans into clean text with support for tables, equations, handwriting, & more. olmOCR 2 uses synthetic data + unit tests as verifiable rewards to reach state-of-the-art performance on challenging documents. 🧵
Happy hour trivia is tonight! 6-9pm, 7 Social SF. See you there!
Headed to #OpenSourceAIWeek and #PyTorchCon? Our engineer, Anish Maddipoti, highlights his top 5 must-attend developer meetups and coding events: 1️⃣ Infra at scale with @dstackai & Lambda Labs 2️⃣ Happy hour trivia with @DeepInfra, @vllm_project, NVIDIA 3️⃣ Hands-on fine-tune with