Modal (@modal)
Followers: 22K · Following: 3K · Media: 232 · Statuses: 1K
AI infrastructure that developers love 💚 Bring your own code and run CPU, GPU, and data-intensive compute at scale.
New York City · Joined July 2022
If you’re at re:Invent, I’m doing a quick presentation and demo of @modal tomorrow (Thu) at 1:30pm. Will also be on a panel at the Nvidia booth today (Wed) at 3:45pm.
Ministral 3 tops the cost-to-performance charts and is well-suited for serverless deployments. With GPU snapshotting on Modal, median cold starts for Ministral 3B drop from 2 minutes to 12 seconds ⚡
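For context, here is a minimal sketch of what a snapshot-enabled deployment can look like with Modal's Python SDK. The checkpoint id, GPU type, and dependency set are assumptions (not taken from the post), and the flag that extends snapshots to GPU state is deliberately not shown:

```python
# Hypothetical sketch: cutting cold starts with memory snapshots on Modal.
# The model id, GPU type, and dependency set are illustrative assumptions.
import modal

image = modal.Image.debian_slim().pip_install("transformers", "torch", "accelerate")
app = modal.App("ministral-snapshot-sketch")


@app.cls(
    gpu="H100",                   # assumption: any supported GPU works here
    image=image,
    enable_memory_snapshot=True,  # snapshot the initialized container state
)
class Ministral:
    @modal.enter(snap=True)
    def load_weights(self):
        # Runs once before the snapshot is taken; later cold starts restore
        # this state instead of re-importing libraries and re-loading weights.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        name = "mistralai/Ministral-3B-Instruct-2410"  # hypothetical checkpoint
        self.tokenizer = AutoTokenizer.from_pretrained(name)
        self.model = AutoModelForCausalLM.from_pretrained(name)  # kept on CPU for the snapshot

    @modal.enter(snap=False)
    def move_to_gpu(self):
        # Post-restore step. GPU snapshotting (as described in the post) goes
        # further and captures GPU state too; that flag is not shown here.
        self.model = self.model.to("cuda")

    @modal.method()
    def generate(self, prompt: str) -> str:
        inputs = self.tokenizer(prompt, return_tensors="pt").to("cuda")
        out = self.model.generate(**inputs, max_new_tokens=128)
        return self.tokenizer.decode(out[0], skip_special_tokens=True)
```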
We're an official launch partner of @MistralAI 🚀 Deploy Mistral 3 models instantly, with up to 10x faster cold starts using GPU snapshotting.
Introducing the Mistral 3 family of models: Frontier intelligence at all sizes. Apache 2.0. Details in 🧵
With Modal's elastic GPU infra, you can instantly deploy these models for enterprise-scale streaming transcription. Try it now:
1) Diarization with Sortformer: https://t.co/uADda5woL9
2) Transcription with MultiTalker:
modal.com
This example shows how to run a streaming multi-talker speech-to-text service using NVIDIA’s Parakeet Multi-talker model and Sortformer diarization model. The application transcribes audio in...
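As a rough illustration of the deployment pattern (not the linked example itself), here is a hypothetical sketch of batch transcription with an NVIDIA NeMo Parakeet checkpoint running in a Modal GPU function. The model id, GPU type, and dependencies are assumptions; the real example adds streaming and Sortformer diarization on top of this:

```python
# Hypothetical simplified sketch: single-speaker batch transcription on Modal.
# The linked example instead streams audio and pairs the multi-talker Parakeet
# model with Sortformer diarization.
import modal

image = (
    modal.Image.debian_slim()
    .apt_install("ffmpeg")
    .pip_install("nemo_toolkit[asr]")  # assumed dependency set
)
app = modal.App("parakeet-transcription-sketch")


@app.function(gpu="A10G", image=image)  # GPU type is an assumption
def transcribe(audio_bytes: bytes) -> str:
    import tempfile

    import nemo.collections.asr as nemo_asr

    # Model id is an assumption; swap in the multi-talker checkpoint from the post.
    model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-0.6b-v2")

    with tempfile.NamedTemporaryFile(suffix=".wav") as f:
        f.write(audio_bytes)
        f.flush()
        result = model.transcribe([f.name])

    first = result[0]
    return first.text if hasattr(first, "text") else str(first)


@app.local_entrypoint()
def main(path: str):
    with open(path, "rb") as f:
        print(transcribe.remote(f.read()))
```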
Running on B200s on @modal!
Here's a great example of WarpGrep outperforming semantic search + normal grep in Claude. Prompt: do we upgrade rate limits for warp grep if they pay? Cursor (sem search + grep), left (50s): Claude Opus 4.5 does many sequential greps and reads, taking 50s. Finds some relevant…
Because we don’t already have enough toy file systems out there … I wrote my own. Here’s llmfuse: a self-compressing filesystem backed by an LLM https://t.co/yb38kBMQJS
grohan.co
Every systems engineer at some point in their journey yearns to write a filesystem. This sounds daunting at first - and writing a battle-tested filesystem is hard - but the minimal surface area for a...
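To give a feel for how small that surface area is, here is a hypothetical, read-only, in-memory FUSE filesystem using fusepy. It is not llmfuse, just the kind of skeleton such a project builds on:

```python
# Hypothetical minimal sketch (not llmfuse): a read-only, in-memory FUSE
# filesystem using fusepy, showing how little is needed to get a mountable FS.
import errno
import stat
import sys

from fuse import FUSE, FuseOSError, Operations  # pip install fusepy

FILES = {"/hello.txt": b"hello from a toy filesystem\n"}


class ToyFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return {"st_mode": stat.S_IFDIR | 0o755, "st_nlink": 2}
        if path in FILES:
            return {
                "st_mode": stat.S_IFREG | 0o444,
                "st_nlink": 1,
                "st_size": len(FILES[path]),
            }
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + [name.lstrip("/") for name in FILES]

    def read(self, path, size, offset, fh):
        return FILES[path][offset : offset + size]


if __name__ == "__main__":
    # Usage: python toyfs.py /mnt/toy
    FUSE(ToyFS(), sys.argv[1], foreground=True, ro=True)
```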
Read more in our blog:
modal.com
Turns out, good devex for agents looks a lot like good devex for humans.
Agents need good developer experience, too. Luckily, that seems to follow the same DX design principles we've developed for humans. Tight feedback, actionable errors, runnable examples, unified logic and infra, readable code. Build for humans, and agents will succeed.
Incredible work from the team at @chaidiscovery. We're thrilled that Modal is playing a part in powering this cutting-edge research 🚀
Today, we’re releasing new data showing that Chai-2 can design antibodies against challenging targets with atomic precision. >86% of our designs possess industry-standard drug-quality properties without any optimization. Thread👇
We’re at the AI Engineer Code Summit this week! Stop by booth S5 to meet the team. Can’t make the conference? Join us at the after-party on Friday 11/21, cohosted with @cerebras, @ExaAILabs, @joinwarp, and @getmetronome. RSVP:
luma.com
"Why are there no coffeeshops open late? But what if I want to co-work at night?!?" - everyone on Twitter Look no further! We are so excited to present, Cafe…
Introducing Locus: the first AI system to outperform human experts at AI R&D. Locus conducts research autonomously over multiple days and achieves superhuman results on RE-Bench given the same resources as humans, as well as SOTA performance on GPU kernel & ML engineering tasks.
Read the full case study on our blog:
modal.com
Learn how Reducto used GPU memory snapshotting and flexible autoscaling to build fast multi-model pipelines.
Migrating 30+ models away from Kubernetes gave @reductoai a 3× improvement in P90 latency for their document intelligence workloads. Learn how they tuned their systems with:
• GPU memory snapshotting
• Independent autoscaling per model
• Isolated autoscaling per customer
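A sketch of what "independent autoscaling per model" can look like on Modal: each model is deployed as its own class with its own autoscaler settings, so a spike in one pipeline stage does not force the others to scale. The model names, GPU types, and numeric limits below are illustrative assumptions, not Reducto's actual configuration:

```python
# Hypothetical sketch: two models deployed as separate Modal classes, each with
# its own autoscaler settings. Names, GPU types, and limits are illustrative.
import modal

image = modal.Image.debian_slim().pip_install("torch")  # assumed dependencies
app = modal.App("per-model-autoscaling-sketch")


@app.cls(gpu="A10G", image=image, min_containers=1, max_containers=20, scaledown_window=120)
class LayoutModel:
    @modal.method()
    def predict(self, page: bytes) -> dict:
        # Placeholder for real layout-analysis inference.
        return {"blocks": []}


@app.cls(gpu="L4", image=image, min_containers=0, max_containers=50, scaledown_window=60)
class OcrModel:
    @modal.method()
    def predict(self, crop: bytes) -> str:
        # Placeholder for real OCR inference.
        return ""
```

Per-customer isolation could then map to separate deployments or parametrized instances of these classes, each with its own scaling envelope; the exact mechanism Reducto used is described in the case study linked above.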