Zack Li-Nexa AI
@zacklearner
Followers 223 · Following 264 · Media 17 · Statuses 210
Co-founder and CTO at Nexa AI, industry veteran from Google & Amazon, and Stanford alumnus. Committed to lifelong learning and advancing AI technology.
Joined October 2021
We just launched NexaSDK for Mobile on Product Hunt. Try it out, we’d love your feedback:
producthunt.com
NexaSDK for Mobile lets developers run the latest multimodal AI models fully on-device in iOS & Android apps, with Apple Neural Engine and Snapdragon NPU acceleration. In just 3 lines of code, build...
The next generation of mobile apps will run multimodal AI locally by default. Today, we’re making it practical for developers to ship. We just launched NexaSDK for Mobile on Product Hunt. Developers can run the latest multimodal AI models fully on-device in iOS & @Android apps
Introducing NexaSDK for iOS and macOS: run and build with the latest multimodal AI models fully on-device, on the @Apple Neural Engine, GPU, and CPU. This is the first and only SDK that enables developers to run the latest SOTA models on the NPU across iPhones and Mac laptops, achieving
Most driving stress doesn’t come from steering. It comes from everything else in and around the car — kids and pets, street signs, missing items, everyday chaos. Today’s self-driving copilots can’t help with that. You need another pair of eyes that watches the cabin and the
Today we're releasing AutoNeural-VL-1.5B — the world's first real-time multimodal model built for in-car AI. It runs fully local on the @Qualcomm SA8295P NPU with a software–hardware co-designed architecture, setting a new bar for speed and quality. AutoNeural redefines what AI
Huge appreciation to our partners from @Microsoft @GoogleDeepMind @Qualcomm @NVIDIA @IBM @AMD @Intel @Qwen and so many others who featured us on stages, blogs, and launches!
Happy Thanksgiving! This year has been wild in the best way — builders across X, Reddit, LinkedIn, Slack, and Discord pushed us, roasted us, inspired us, and ultimately helped shape NexaSDK and Hyperlink into what they are today. We read every comment, every benchmark, every
gpt-oss-20b running on Hexagon NPU via Nexa SDK 🔥
Finally, the GPT-OSS-20B now runs fully local on the @Qualcomm Hexagon NPU via NexaSDK, powered by the NexaML engine — available today exclusively in Hyperlink Pro as an NPU-only feature. With a single line of code, OEMs can ship ChatGPT-class intelligence at laptop power
Nexa AI is a featured partner at @Microsoft Ignite 2025 — highlighted in the official Microsoft blog and live on the floor this week. We’re also demoing at the @Qualcomm booth, showing what’s now possible with on-device AI agents powered by our NexaSDK and Hyperlink Agent.
Ok… Hyperlink’s launch blew up way beyond what we expected. In the last 24 hours, we crossed 1.6M views, 6.4K likes, received recognition from industry leaders, and saw a ton of love from the community. We built Hyperlink to make your computer truly intelligent — an on-device
Thanks @nvidia @NVIDIA_AI_PC for promoting our Hyperlink product!
Your local AI agent, upgraded. @Nexa_ai's Hyperlink is accelerated by RTX AI PCs, allowing scans of gigabytes of local files in minutes — fast, private, and all on your device. Get started today #RTXAIGarage 👉 https://t.co/vj94k0Zg2q
Meet Hyperlink, the first AI super assistant that lives inside your computer. Your computer stores all your files and personal context. Hyperlink deeply understands them and gives cited answers instantly — like Perplexity for your local files. It turns your computer into a true
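Hyperlink's "cited answers over your local files" idea can be sketched in a few lines: index local text files, match a query, and return every hit together with the file path that serves as its citation. This is an illustration only; the function names, indexing, and substring matching here are hypothetical and are not Hyperlink's actual implementation.

```python
from pathlib import Path

def build_index(root: str) -> dict[str, str]:
    """Read every .txt file under root into an in-memory index: path -> text."""
    return {
        str(p): p.read_text(encoding="utf-8", errors="ignore")
        for p in Path(root).rglob("*.txt")
    }

def cited_answers(index: dict[str, str], query: str) -> list[tuple[str, str]]:
    """Return (file path, matching line) pairs so every answer carries a citation."""
    q = query.lower()
    hits = []
    for path, text in index.items():
        for line in text.splitlines():
            if q in line.lower():
                hits.append((path, line.strip()))
    return hits
```

A real system would use semantic embeddings and a local model rather than substring matching, but the shape is the same: every answer is traceable to a file on disk, and nothing leaves the machine.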
Following the launch of the Nexa Android SDK, we ran a 10-minute LLM stress test on the Samsung S25 Ultra with the Qualcomm Hexagon NPU:
⚙️ CPU: throttled from ~37 t/s → ~19 t/s at 42 °C
⚙️ NPU (Qualcomm Hexagon): held steady at ~90 t/s and 36–38 °C, 2–4× faster under load
🔋 Both
docs.nexa.ai
Start here to set up and explore the Nexa SDK for running the latest models on Android devices.
We ran a 10-minute LLM test on Samsung S25 Ultra CPU vs @Qualcomm Hexagon NPU. In 3 minutes, the CPU hit 42 °C and throttled: throughput fell from ~37 t/s → ~19 t/s. The NPU stayed cooler (36–38 °C) and held a steady ~90 t/s — 2–4× faster than CPU under load. Same 10-min,
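The arithmetic behind those figures is easy to check. This sketch just recomputes the ratios from the numbers quoted in the test above (nothing is measured here):

```python
# Throughput figures quoted in the 10-minute stress test (Samsung S25 Ultra).
CPU_COLD_TPS = 37.0       # tokens/s before the CPU throttles
CPU_THROTTLED_TPS = 19.0  # tokens/s after the CPU reaches 42 degrees C
NPU_TPS = 90.0            # Hexagon NPU, steady across the full run

# The CPU loses roughly half its throughput once thermal limits kick in.
throttle_loss = 1.0 - CPU_THROTTLED_TPS / CPU_COLD_TPS

# The NPU's edge grows as the CPU heats up: about 2.4x against a cold CPU,
# about 4.7x against a throttled one, which brackets the quoted 2-4x range.
speedup_cold = NPU_TPS / CPU_COLD_TPS
speedup_hot = NPU_TPS / CPU_THROTTLED_TPS

print(f"CPU throttling loss: {throttle_loss:.0%}")
print(f"NPU speedup: {speedup_cold:.1f}x to {speedup_hot:.1f}x")
```

The practical takeaway is the sustained number: after three minutes the relevant comparison is no longer the CPU's cold-start throughput but its throttled one.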
This Week at Nexa 🚀 — VLA model on IoT & Robotics NPU, Nexa Android SDK, and NexaStudio app that beats Apple Intelligence
1) World’s first vision-language-action model running locally on NPU (Robotics + IoT) with NexaML
@huggingface’s SmolVLA now runs fully on the @Qualcomm
Today, we’re launching Android Java & Kotlin support for NexaSDK (Beta) — bringing the full power of on-device AI to billions of @Android phones powered by @Qualcomm @Snapdragon chipsets. This is a major leap forward for the world’s largest mobile developer community:
✅ Seamless
Introducing NexaSDK for Android (Beta) — run the latest AI models locally, 9× more energy-efficient and 2× faster, on @Android devices, powered by the @Qualcomm Hexagon NPU. This is the first SDK to support NPU, GPU and CPU, unlocking the full power of every Android device — for
LFM2-1.2B models from Liquid AI are now running fully accelerated on Qualcomm NPUs via the NexaML engine — real-time performance with minimal memory use, right on the edge. Four new variants power everything from chat to document parsing:
💬 LFM2-1.2B – general chat & reasoning
github.com
Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and mor...
LFM2-1.2B models from @LiquidAI_ are now running on the @Qualcomm NPU in NexaSDK, powered by the NexaML engine. Four new edge-ready variants:
- LFM2-1.2B — general chat and reasoning
- LFM2-1.2B-RAG — retrieval-augmented local chat
- LFM2-1.2B-Tool — structured tool calling and agent
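Variant lineups like this map naturally onto a task-routing table. The sketch below uses only the three variant names spelled out above (the fourth is cut off in the tweet, so it is omitted); the routing keys and fallback logic are purely illustrative, not part of NexaSDK.

```python
# Variants named in the announcement above; routing keys are illustrative.
VARIANTS = {
    "chat": "LFM2-1.2B",           # general chat and reasoning
    "rag": "LFM2-1.2B-RAG",        # retrieval-augmented local chat
    "tool_call": "LFM2-1.2B-Tool", # structured tool calling / agents
}

def pick_variant(task: str) -> str:
    """Return the edge model variant for a task, defaulting to the base model."""
    return VARIANTS.get(task, "LFM2-1.2B")
```

Keeping task-specialized fine-tunes of one small base model, instead of one larger generalist, is what makes this workable within edge memory budgets: only the variant a workload needs has to be resident.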
NVIDIA sent us a 5090 so we can demo Qwen3-VL 4B & 8B GGUF. You can now run it in our desktop UI, Hyperlink, powered by NexaML Engine — the first and only framework that supports Qwen3-VL GGUF right now. We tried the same demo examples from the Qwen2.5-32B blog — the new