kaiyes (@Kaiyes_)
1K Followers · 47K Following · 1K Media · 9K Statuses
on-device local LLM · ex-CTO · 10 years React Native · building @bundaiapp (Learn Japanese with anime 👉)
Joined June 2013
Built a fully local voice AI assistant on React Native. Zero token cost. Speech-to-Text → LLM → Text-to-Speech. All on-device. All private.
The Stack:
- Whisper.RN for STT (OpenAI's Whisper ported to RN)
- Ollama + Qwen model for local LLM inference
- Supertonic (via
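The STT → LLM → TTS loop described above can be sketched roughly as below. This is a hedged sketch, not the app's actual code: `transcribe()` and `speak()` are placeholder stubs standing in for Whisper.RN and the TTS engine (whose real APIs differ), and the model tag is an example. Only the request shape follows Ollama's public HTTP chat API (`POST /api/chat` with `{ model, messages }`).

```typescript
// Hypothetical sketch of one voice-assistant turn: STT -> local LLM -> TTS.

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Stub STT: a real app would feed recorded audio into Whisper.RN here.
function transcribe(audio: string): string {
  return audio; // placeholder: pretend the audio is already text
}

// Stub TTS: a real app would hand the reply to an on-device TTS engine.
function speak(text: string): string {
  return `[spoken] ${text}`;
}

// Build the request body an Ollama-style local server expects for one turn.
function buildChatRequest(history: ChatMessage[], userText: string) {
  const messages: ChatMessage[] = [
    { role: "system", content: "You are a concise on-device assistant." },
    ...history,
    { role: "user", content: userText },
  ];
  return { model: "qwen2.5:3b", messages, stream: false }; // example model tag
}

// One assistant turn, with the LLM call injected so the sketch runs offline.
function assistantTurn(
  audio: string,
  llm: (req: ReturnType<typeof buildChatRequest>) => string,
  history: ChatMessage[] = [],
): string {
  const userText = transcribe(audio);
  const req = buildChatRequest(history, userText);
  // In the real app this would be:
  //   await fetch("http://localhost:11434/api/chat", { method: "POST", body: JSON.stringify(req) })
  const reply = llm(req);
  return speak(reply);
}

// Offline demo with a canned "model" that echoes the last user message.
const echoLlm = (req: { messages: ChatMessage[] }) =>
  `You said: ${req.messages[req.messages.length - 1].content}`;
console.log(assistantTurn("what's the weather", echoLlm));
```

Injecting the `llm` function keeps the orchestration testable without a running Ollama server; swapping in a real `fetch` call is the only change needed on-device.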
If you are considering buying the Intel 32 GB GPU that is under $1K, then read this before you buy.
People keep saying "VRAM is all that matters" for local LLMs. It's not just wrong, it's misleading. When running LLMs locally, the bottleneck is NOT just VRAM size. It's:
- memory bandwidth
- interconnect (PCIe vs NVLink vs RDMA)
- inference engine (vLLM,
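A quick back-of-the-envelope shows why bandwidth dominates: during single-stream decoding, each new token requires streaming roughly all of the model's weight bytes through the GPU once, so tokens/sec is capped near bandwidth divided by model size. The numbers below are illustrative assumptions, not measured specs of any card.

```typescript
// Rough ceiling for single-stream decode speed: every generated token reads
// (approximately) all model weights once, so memory bandwidth, not VRAM
// capacity, sets the upper bound on tokens/sec.

function decodeCeilingTokensPerSec(
  bandwidthGBps: number, // usable memory bandwidth (illustrative)
  weightsGB: number,     // model size after quantization (illustrative)
): number {
  return bandwidthGBps / weightsGB;
}

// A hypothetical 32 GB card with ~450 GB/s running an ~18 GB 4-bit model:
console.log(decodeCeilingTokensPerSec(450, 18).toFixed(1)); // ~25 tok/s ceiling

// The same model on a ~1000 GB/s card: identical VRAM fit, 2x+ the speed.
console.log(decodeCeilingTokensPerSec(1000, 18).toFixed(1));
```

This is why two cards with the same VRAM can differ wildly in real decode speed, and why capacity alone is a misleading spec.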
I played this game around 1996! 30 years ago!
Visco’s Goal! Goal! Goal! is a spectacular Neo Geo soccer classic that captures the pure spirit of arcade sports. With its fast-paced gameplay, powerful special shots, and vibrant graphics, it delivers a thrilling international tournament experience.
Meet Intel's Arc Pro B70: 32 GB VRAM below $999. There are pros & cons to it though - availability, software support, etc. Supports vLLM from the get-go. Hopefully ROCm or Vulkan picks up the pace with it. Don't have much hope for Intel to provide the software support
VLMs too slow for production? Not anymore: 46ms end-to-end inference, 60+ fps on a single H100. Introducing Photon, Moondream's inference engine. Runs on everything from edge to server. https://t.co/UTt6vQOzOY
How in the hell are these accounts claiming to run entire companies w/ OpenClaw. I just spent 1.5 HRS trying to get my claw to use X API for reading tweets. FAIL We have a whole SOP documenting exactly how to do it from previous failures. I can’t imagine running 10 of these…
The new Twitter algorithm is really good! It picked up my slight interests & put tweets in my timeline from amazing accounts. My feed is now all AI + anime. Normally it puts me in React Native jail. Never knew this many great manga/anime accounts were on Twitter.
Transformers are Turing complete and can be trained to run arbitrary programs Turns out you can embed a relatively efficient assembly interpreter in the forward pass. This allows the LLM to execute deterministic code at inference time in its own weights, no sandbox
1/4 LLMs solve research-grade math problems but struggle with basic calculations. We bridge this gap by turning them into computers. We built a computer INSIDE a transformer that can run programs for millions of steps in seconds, solving even the hardest Sudokus with 100% accuracy
Take a look, guys!
I feel people are underestimating the complexity of software. AI is literally limited by:
- compute
- model intelligence
- harness
- prompt accuracy
- context
- cost
- hallucinations
Yes, I'm circumventing all of these & yes, I have not hand-coded in a year.
Who needs that new 128 GB of RAM, you say? I have been going deep into using Chinese LLMs as part of my toolchain. Claude/Codex is the main driver. It writes the scripts. But the workhorses are these small LLMs. For example, I use LLMs + Node.js to:
- cut me clips from anime -
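One way the clip-cutting example could look as a Node.js script: ask the local model to return clip timestamps as JSON, then turn each one into an ffmpeg command. This is a hypothetical sketch of that workflow, not the author's actual script: the JSON shape, labels, and filenames are assumptions; only the ffmpeg flags (`-ss` start, `-to` end, `-c copy` stream copy) are standard CLI options.

```typescript
// Hypothetical "small local LLM + Node.js" clip cutter: the model proposes
// timestamps as JSON, the script builds ffmpeg commands from them.

type Clip = { start: string; end: string; label: string };

// Parse the model's JSON reply (a real script would validate more defensively).
function parseClips(llmReply: string): Clip[] {
  return JSON.parse(llmReply) as Clip[];
}

// Build one ffmpeg command per clip; nothing is executed here.
function ffmpegCommands(input: string, clips: Clip[]): string[] {
  return clips.map(
    (c, i) =>
      `ffmpeg -ss ${c.start} -to ${c.end} -i ${input} -c copy clip_${i}_${c.label}.mp4`,
  );
}

// Canned reply standing in for the local model's output.
const reply = JSON.stringify([
  { start: "00:01:10", end: "00:01:25", label: "opening" },
  { start: "00:12:03", end: "00:12:30", label: "fight" },
]);

const cmds = ffmpegCommands("episode01.mkv", parseClips(reply));
console.log(cmds[0]);
```

Because the heavy lifting (timestamp selection) is just a JSON-producing prompt, a small quantized model is enough, which is the point of using these models as workhorses.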
Crazy how fast these 9B & 4B models came out!! I was okay with having a 27B model. But this is chef's kiss!
🚀 Introducing the Qwen 3.5 Small Model Series Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B ✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL: • 0.8B / 2B → tiny, fast,
I'm gonna get this off my chest: what's the point of cross-platform if AI can build 2 apps at the same time? Make a Swift app, then make exactly the same app in Kotlin?