Alex Yeh @alex_yehya X Profile

Alex Yeh

@alex_yehya

Followers

476

Following

492

Media

257

Statuses

1K

Founder/CEO of GMI Cloud. Recently named one of 6 NVIDIA Reference Platform Partners. Building the world’s fastest AI-native cloud across 7 DCs & 5 countries

https://t.co/Z9FIsPUJHP

Bay Area

Joined March 2013

Alex Yeh

@alex_yehya

8 hours

Every transformative tech goes through cycles. What matters is who's building for the long term—infrastructure, execution, and real customer value outlast the hype. https://t.co/GhGMNwqjAw

Rohan Paul

@rohanpaul_ai

3 days

Google DeepMind CEO Demis Hassabis on AI bubble, from his new interview yesterday. Says some AI startups with tens of billions of valuations are wildly overpriced — and a correction may come. AI is overhyped in the short term, underappreciated in the medium to long term. An “AI

0

Alex Yeh

@alex_yehya

9 hours

Capability up, cost down at log scale—this trajectory changes everything. As models get cheaper, infrastructure that delivers them efficiently becomes the competitive moat. https://t.co/8Cm92cPufB

Ethan Mollick

@emollick

2 days

No signs of an end to rapid gains in AI ability at ever-decreasing costs (which is a log scale) yet. I have to update this monthly or more frequently at this point. All AI benchmarks are flawed, but GPQA Diamond has been a pretty good one, though likely close to being maxed out.

0

Alex Yeh

@alex_yehya

10 hours

Impressive launch from xAI. Real-time voice agents with tool calling and search capabilities at this price point will unlock entirely new application categories. https://t.co/yvcMEgr2xF

xAI

@xai

2 days

Today, we're excited to launch the Grok Voice Agent API, empowering developers to build voice agents that speak dozens of languages, call tools, and search realtime data. https://t.co/7c7SLYzvum

0

Alex Yeh

@alex_yehya

1 day

Our engineer Grace is going live now, demoing how to use GMI Studio. Join and leave your comments!

0

GMI Cloud

@gmi_cloud

1 day

We’re going live this afternoon with a GMI Studio (Beta) walkthrough. Our engineers will build workflows in real time, show how features behave, and answer questions as they come up. If you’re trying to understand what GMI Studio can actually do (and where it’s still evolving),

1

5

10

Alex Yeh

@alex_yehya

1 day

GMI Studio (Beta) reflects how we think creative workflows should feel — focused on making, not setup. Excited to start opening it up and learning from the creators who join early.

GMI Cloud

@gmi_cloud

1 day

We’ve been quietly working on something for creators. Today, we’re introducing GMI Studio (Beta) — a cloud-native evolution of ComfyUI. It’s built for creators running heavier image, video, and multimodal workflows, without turning setup into the project. If you’ve hit local

0

2

Alex Yeh

@alex_yehya

1 day

Impressive benchmarks from Gemini 3 Flash. Frontier intelligence at accessible pricing continues to push what's possible for developers building AI applications. https://t.co/a0Es7amUO5

Logan Kilpatrick

@OfficialLoganK

2 days

Introducing Gemini 3 Flash, our frontier intelligence model, available at scale for everyone. It excels at coding, tool calling, and is stronger than 2.5 Pro across most metrics!! ⚡️ Available in the API at $0.50 in / 1M tokens and $3.00 out / 1M tokens across.

0

Alex Yeh

@alex_yehya

1 day

Interactive 3D worlds in real-time? The bottleneck isn't just the model—it's inference latency across regions. Multimodal applications need infrastructure optimized for streaming. https://t.co/1PtKWW7nbX

Tencent HY

@TencentHunyuan

3 days

🚀🚀🚀Introducing HY World 1.5 (WorldPlay)! We have now open-sourced the most systemized, comprehensive real-time world model framework in the industry. In HY World 1.5, we develop WorldPlay, a streaming video diffusion model that enables real-time, interactive world modeling

0

Alex Yeh

@alex_yehya

1 day

OpenAI bringing image gen to ChatGPT mainstream accelerates multimodal adoption. Infrastructure that can handle this surge in mixed workloads becomes critical. https://t.co/lUpHWJOa9L

OpenAI

@OpenAI

3 days

Introducing ChatGPT Images, powered by our flagship new image generation model.
- Stronger instruction following
- Precise editing
- Detail preservation
- 4x faster than before
Rolling out today in ChatGPT for all users, and in the API as GPT Image 1.5.

0

Alex Yeh

@alex_yehya

2 days

Meta's SAM Audio marks a shift toward real-time multimodal AI. Isolating audio from complex mixtures in milliseconds demands serious inference infrastructure—latency and throughput matter more than ever. https://t.co/kmigm69pIt

Meta Newsroom

@MetaNewsroom

3 days

Introducing SAM Audio: the first unified AI model that allows you to isolate and edit sound from complex audio mixtures. This could mean isolating the guitar in a video of your band, filtering out traffic noises, or removing the sound of a dog barking in your podcast, all with

0

Alex Yeh

@alex_yehya

2 days

Non-reasoning models like KAT-Coder-Pro V1 achieving 64 intelligence score with 90% fewer tokens signals a shift. Efficient models need efficient infrastructure—globally deployed, sub-100ms latency 🌍 https://t.co/GzxLVWHllL

Artificial Analysis

@ArtificialAnlys

4 days

KAT-Coder-Pro V1 by KwaiKAT is a non-reasoning model which demonstrates impressive performance across reasoning-type tasks, while using significantly fewer output tokens than peers. It scores 64 on the Artificial Analysis Intelligence Index, the highest of any non-reasoning

0

1

Alex Yeh

@alex_yehya

2 days

MiMo-V2-Flash: 309B params, 15B active, hitting 150 tokens/s with DeepSeek-level performance. MoE efficiency is rewriting the inference playbook—the infrastructure that powers this matters more than ever. https://t.co/j5kGel9lF6

XiaomiMiMo

@XiaomiMiMo

4 days

⚡ Faster than Fast. Designed for Agentic AI. Introducing Xiaomi MiMo-V2-Flash — our new open-source MoE model: 309B total params, 15B active. Blazing speed meets frontier performance. 🔥 Highlights: 🏗️ Hybrid Attention: 5:1 interleaved 128-window SWA + Global | 256K context 📈

0

Alex Yeh

@alex_yehya

3 days

Congratulations to Mirelo on the $41M seed round! Excited to see what you build next, and proud to support your journey with GMI Cloud 🤝

a16z

@a16z

4 days

We’re co-leading a $41M seed round in Mirelo, a foundation model company focused on the sound layer for video. Mirelo stands out as a clear leader on the frontier of AI-generated sound. Their team combines in-depth AI research experience from some of the best labs in the world

0

1

Alex Yeh

@alex_yehya

3 days

Audio AI moving from impressive demos to production reliability—89% fewer hallucinations and 35% lower error rates matter more than raw capability. This is how AI gets real. https://t.co/6JOxmFbm8A

OpenAI Developers

@OpenAIDevs

4 days

🆕 New audio model snapshots are now live in the Realtime API with improvements to reliability, lower error rates, and fewer hallucinations:
- gpt-4o-mini-transcribe-2025-12-15: 89% reduction in hallucinations compared to whisper-1
- gpt-4o-mini-tts-2025-12-15: 35% fewer word

0

Alex Yeh

@alex_yehya

3 days

NVIDIA's Nemotron 3 pushes inference efficiency forward. As these models get faster, deployment infrastructure matters more—getting them into production at scale is where it counts. https://t.co/WTCagL7jF4

Bryan Catanzaro

@ctnzr

5 days

Today, @NVIDIA is launching the open Nemotron 3 model family, starting with Nano (30B-3A), which pushes the frontier of accuracy and inference efficiency with a novel hybrid SSM Mixture of Experts architecture. Super and Ultra are coming in the next few months.

0

Alex Yeh

@alex_yehya

3 days

Wan 2.6 is live. 1080p video, multi-shot scenes, consistent characters—all accessible through GMI Cloud's API. Building video features into your app just got easier. https://t.co/61cwUwEz7Z

console.gmicloud.ai

GPU cloud solutions for AI training, inference, and deployment. GMI Cloud is a trusted cloud GPU provider offering high-performance infrastructure at scale.

0

GMI Cloud

@gmi_cloud

4 days

Dinner drop: Kling 2.6 is live on GMI Cloud. Promos in 4 days. Watch the video and make your own Jeopardy show. @Kling_ai

4

10

32

Alex Yeh

@alex_yehya

4 days

The real divide isn't ideological—it's between those who can access compute to test their ideas vs those who can't. Infrastructure democratization matters more than which camp you're in. https://t.co/8rjYniZgMk

Haider.

@slow_developer

6 days

the AI debate is split into camps: doomers, ethicists, builders, pragmatists, and skeptics. some want to slow everything down, some think LLMs are all just fancy autocomplete, others are convinced AGI is basically tomorrow. but the weird part is we're all watching the same

0

Alex Yeh

@alex_yehya

4 days

What a year for open models! From DeepSeek to Qwen to Kimi—the East-West collaboration is accelerating. Infrastructure that connects both ecosystems seamlessly will be the real winner in 2026. https://t.co/evp9GbuiRB

Nathan Lambert

@natolambert

5 days

Open models year in review. What a year! We're back with an updated open model builder tier list, our top models of the year, and our predictions for 2026. First, the winning models: 1. DeepSeek R1 (@deepseek_ai): Transformed the AI world 2. Qwen 3 Family (@AlibabaGroup): The new

0

Alex Yeh

@alex_yehya

4 days

Google's making real-time translation universal. Now imagine what thousands of developers could build with the same global inference infrastructure. The next wave of AI apps needs this foundation. https://t.co/trM9w6bj3S

Shay Boloor

@StockSavvyShay

7 days

$GOOGL just expanded live speech translation in Google Translate from Pixel Buds to any headphones. Real-time translations now work across 70+ languages, making it a platform-level feature rather than a device-specific one.

0