kaiyes

@Kaiyes_

Followers
1K
Following
47K
Media
1K
Statuses
9K

on device local llm - ex CTO - 10 years react native - building @bundaiapp

Learn Japanese with anime 👉
Joined June 2013
@Kaiyes_
kaiyes
3 months
Built a fully local voice AI assistant in React Native. Zero token cost. Speech-to-Text → LLM → Text-to-Speech. All on-device. All private.
The stack:
- Whisper.RN for STT (OpenAI's Whisper ported to RN)
- Ollama + Qwen model for local LLM inference
- Supertonic (via
4
0
7
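The STT → LLM → TTS chain above can be sketched as a simple pipeline. This is a minimal sketch, not the author's actual code: the stage interfaces are hypothetical, and Whisper.RN, Ollama, and the TTS engine would each be wrapped to fit them.

```typescript
// Minimal sketch of the on-device voice pipeline: STT -> LLM -> TTS.
// Stage signatures are assumptions, not Whisper.RN/Ollama APIs.
type Stage<I, O> = (input: I) => Promise<O>;

interface VoiceAssistant {
  stt: Stage<ArrayBuffer, string>;   // microphone audio -> transcript
  llm: Stage<string, string>;        // transcript -> reply text
  tts: Stage<string, ArrayBuffer>;   // reply text -> playable audio
}

// Run one conversational turn entirely on-device: no tokens leave
// the phone, so there is no per-token API cost.
async function runTurn(a: VoiceAssistant, audio: ArrayBuffer): Promise<ArrayBuffer> {
  const transcript = await a.stt(audio);
  const reply = await a.llm(transcript);
  return a.tts(reply);
}
```

Because each stage is just an async function, any of the three engines can be swapped out without touching the others.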
@Kaiyes_
kaiyes
19 hours
Worth investigating for those with an Nvidia 5060 GPU. It's an entry-level card costing around $400-$500.
@daniel_mac8
Dan McAteer
2 days
These ultra geniuses beat Sonnet 4.5 performance on LiveCodeBench with Qwen3-14B, a single RTX 5060 and a great harness.
0
0
0
@Kaiyes_
kaiyes
1 day
If you are considering buying the Intel 32 GB GPU that is under $1k, read this before you buy.
@TheAhmadOsman
Ahmad
1 day
People keep saying “VRAM is all that matters” for local LLMs
> It’s not just wrong, it’s misleading
When running LLMs locally, the bottleneck is NOT just “VRAM size”
It’s:
- memory bandwidth
- interconnect (PCIe vs NVLink vs RDMA)
- inference engine (vLLM,
0
0
0
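The bandwidth point can be made concrete with a back-of-envelope bound: during single-stream decoding, generating each token requires reading roughly all the model weights from memory once, so throughput is capped near bandwidth ÷ model size. The card numbers below are illustrative, not measured specs.

```typescript
// Back-of-envelope decode bound: each generated token streams roughly
// all weights through memory once, so tok/s <= bandwidth / model bytes.
function maxTokensPerSec(bandwidthGBps: number, modelSizeGB: number): number {
  return bandwidthGBps / modelSizeGB;
}

// A ~14B-parameter model at Q4 is roughly 14e9 * 0.5 bytes ≈ 7 GB of weights.
const q4ModelGB = (params: number) => (params * 0.5) / 1e9;

// Two cards with identical VRAM but different bandwidth decode very differently:
const slowCard = maxTokensPerSec(200, q4ModelGB(14e9)); // ≈ 28.6 tok/s
const fastCard = maxTokensPerSec(900, q4ModelGB(14e9)); // ≈ 128.6 tok/s
```

Same 14B model, same VRAM fit, a 4.5× difference in decode speed: which is exactly why "VRAM size" alone is a misleading spec.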
@Kaiyes_
kaiyes
2 days
I played this game around 1996! 30 years ago!
@PixelCNinja
Pixel Cherry Ninja
6 days
Visco’s Goal! Goal! Goal! is a spectacular Neo Geo soccer classic that captures the pure spirit of arcade sports. With its fast-paced gameplay, powerful special shots, and vibrant graphics, it delivers a thrilling international tournament experience.
0
0
0
@Zai_org
Z.ai
2 days
GLM-5.1 is available to ALL GLM Coding Plan users! https://t.co/E63z53nXOX
345
553
5K
@Kaiyes_
kaiyes
3 days
Meet Intel's Arc Pro B70: 32 GB VRAM below $999. There are pros & cons to it, though: availability, software support, etc. It supports vLLM from the get-go. Hopefully ROCm or Vulkan picks up the pace with it. I don't have much hope for Intel to provide the software support.
0
0
0
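What 32 GB of VRAM actually buys can be estimated with quick arithmetic: weight bytes at a given quantization, plus headroom for KV cache and runtime overhead. The overhead figure below is a rough guess, not a vLLM measurement.

```typescript
// Rough VRAM-fit check: weight bytes at a given quantization plus a
// flat margin (a guess) for KV cache and runtime overhead.
function fitsInVram(
  params: number,
  bitsPerWeight: number,
  vramGB: number,
  overheadGB = 4,
): boolean {
  const weightsGB = (params * bitsPerWeight) / 8 / 1e9;
  return weightsGB + overheadGB <= vramGB;
}

// 32B at Q4 ≈ 16 GB of weights -> fits a 32 GB card with KV-cache room.
const fits32B = fitsInVram(32e9, 4, 32);
// 70B at Q4 ≈ 35 GB of weights -> exceeds the card before overhead.
const fits70B = fitsInVram(70e9, 4, 32);
```

By this estimate the card comfortably serves ~30B-class models at Q4, which is the sweet spot the price targets.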
@moondreamai
moondream
3 days
VLMs too slow for production? Not anymore: 46ms end-to-end inference, 60+ fps on a single H100. Introducing Photon, Moondream's inference engine. Runs on everything from edge to server. https://t.co/UTt6vQOzOY
39
127
1K
@Kaiyes_
kaiyes
4 days
if ADHD had a face https://t.co/VhQiOpy6ur
0
0
0
@Kaiyes_
kaiyes
10 days
Super happy with the Hermes agent: up & running in like 5 minutes using a local Qwen 9B Q4 model. With OpenClaw, the setup is so tedious. Thanks to @sudoingX, I gave it a try.
8
1
31
@bradmillscan
Brad Mills 🔑⚡️
12 days
How in the hell are these accounts claiming to run entire companies w/ OpenClaw. I just spent 1.5 HRS trying to get my claw to use X API for reading tweets. FAIL We have a whole SOP documenting exactly how to do it from previous failures. I can’t imagine running 10 of these…
214
5
420
@Kaiyes_
kaiyes
12 days
Agreed 💯. That's what I have been doing too.
@Zeneca
Zeneca🔮
13 days
I'm convinced if you want to maximize productivity, you shouldn't be using openclaw or hermes - they take so much time bug fixing that you're better off just using claude code/codex directly there's maybe 1% of people who are the exception to this
0
0
0
@Kaiyes_
kaiyes
13 days
The new Twitter algorithm is really good! It picked up on my slight interests & put tweets in my timeline from amazing accounts. My feed is now all AI + anime. Normally it puts me in the React Native jail. I never knew this many great manga/anime accounts were on Twitter.
@136Division
☯︎Cyber Daoist☯︎扎心老铁
13 days
The aesthetics of Japan's Showa era are unparalleled.
0
0
1
@mathemagic1an
Jay Hack
14 days
Transformers are Turing complete and can be trained to run arbitrary programs Turns out you can embed a relatively efficient assembly interpreter in the forward pass. This allows the LLM to execute deterministic code at inference time in its own weights, no sandbox
@ChristosTzamos
Christos Tzamos
18 days
1/4 LLMs solve research grade math problems but struggle with basic calculations. We bridge this gap by turning them to computers. We built a computer INSIDE a transformer that can run programs for millions of steps in seconds solving even the hardest Sudokus with 100% accuracy
30
97
1K
@Kaiyes_
kaiyes
17 days
Take a look, guys!
@tensorfish
tensorfish
18 days
Qwen 3.5 4B MLX running locally with a milk avatar. Model by @toxsam
0
0
0
@Kaiyes_
kaiyes
23 days
had to post this
@fenglank
鳳蘭🌱fènglán
23 days
@souljagoyteller who the hell is claude?
1
0
1
@Kaiyes_
kaiyes
24 days
I feel people are underestimating the complexity of software. AI is literally limited by:
- compute
- model intelligence
- harness
- prompt accuracy
- context
- cost
- hallucinations
Yes, I'm circumventing all of these & yes, I have not hand-coded in a year.
0
0
0
@Kaiyes_
kaiyes
25 days
Who needs all that new 128 GB of RAM, you say? I have been going deep into using Chinese LLMs as part of my toolchain. Claude/Codex is the main driver; it writes the scripts. But the workhorses are these small LLMs. For example, I use LLMs + Node.js to:
- cut clips from anime -
0
0
2
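The anime-clip workflow above can be sketched as: have the local LLM emit candidate segments as JSON, then turn each segment into ffmpeg arguments from Node. The JSON shape and file names here are assumptions for illustration; the ffmpeg flags (`-ss`/`-to` for seeking, `-c copy` for stream copy) are real.

```typescript
// Sketch of "LLM + Node.js cuts clips": the LLM proposes segments as
// JSON (shape is an assumption), and each becomes an ffmpeg invocation.
interface Clip {
  start: string; // timestamp like "00:01:05"
  end: string;   // timestamp like "00:01:30"
}

function ffmpegArgs(input: string, clip: Clip, output: string): string[] {
  // -ss/-to before -i use fast input seeking; -c copy avoids re-encoding.
  return ["-ss", clip.start, "-to", clip.end, "-i", input, "-c", "copy", output];
}

// Pretend this JSON string came back from the local model:
const clips: Clip[] = JSON.parse('[{"start":"00:01:05","end":"00:01:30"}]');
const cmds = clips.map((c, i) => ffmpegArgs("episode.mkv", c, `clip_${i}.mp4`));
```

Each entry in `cmds` would then be handed to `child_process.execFile("ffmpeg", args)`; keeping the argument-building pure makes the LLM output easy to validate before anything touches the filesystem.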
@Kaiyes_
kaiyes
27 days
Crazy how fast these 9B & 4B models came out!! I was okay with having a 27B model. But this is chef's kiss!
@Alibaba_Qwen
Qwen
27 days
🚀 Introducing the Qwen 3.5 Small Model Series Qwen3.5-0.8B · Qwen3.5-2B · Qwen3.5-4B · Qwen3.5-9B ✨ More intelligence, less compute. These small models are built on the same Qwen3.5 foundation — native multimodal, improved architecture, scaled RL: • 0.8B / 2B → tiny, fast,
0
0
0
@Kaiyes_
kaiyes
28 days
I'm gonna get this off my chest. What's the point of cross-platform if AI can build 2 apps at the same time? Make a Swift app, then make exactly the same app in Kotlin?
4
0
2
@Kaiyes_
kaiyes
1 month
As if Anthropic doesn't steal!
0
0
0