Daniel Isaac

@danpacary

Followers
711
Following
3K
Media
206
Statuses
2K

idk what I’m doing half the time. space, drones, AI & physics

Earth
Joined September 2023
@danpacary
Daniel Isaac
2 days
I hijacked Apple's Neural Engine -- the chip built for Siri and photo filters. Reverse-engineered the private APIs and trained a full LLM on it. Zero fan noise. Zero GPU. Just the Neural Engine doing what nobody thought it could. Your Mac has one too.
34
116
1K
@danpacary
Daniel Isaac
8 hours
Give it a read from 4 days ago... on ANE
0
0
9
@danpacary
Daniel Isaac
9 hours
The wave is starting. You can do it too
@JC_builds
JC Builds
11 hours
I jailbroke my Mac's Neural Engine. Not the GPU. Not the CPU. The AI chip Apple hides behind Siri and photo blur. 15 trillion operations per second. Sitting idle on every MacBook sold since 2020. I made it run a full LLM. Here's how 🧵
1
5
81
@danpacary
Daniel Isaac
13 hours
404 experiments. 105 hours. 1 Mac.
3 accelerators (MPS → ANE → MLX)
2 AI agents running simultaneously
63h wall time
77 keeps
19% keep rate
81% of experiments failed
ANE agent still running. Count going up. more to come...
3
0
29
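A quick arithmetic check of the tallies in that post, just to confirm the stated rates follow from the raw counts:

```python
# Sanity check: 77 keeps out of 404 experiments, as stated in the post.
keeps, total = 77, 404
keep_rate = keeps / total

print(round(keep_rate * 100))        # 19 -> matches the stated 19% keep rate
print(round((1 - keep_rate) * 100))  # 81 -> matches the stated 81% failure rate
```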
@danpacary
Daniel Isaac
14 hours
anyone can be a researcher, hacker, builder: just do the work
@danpacary
Daniel Isaac
14 hours
Anyone can do this work. Even you, reading this, right now. Every M-series Mac has a Neural Engine. Doing nothing... We got it working with private APIs and Obj-C. Not pretty. Not easy. But it works. The code is open. The data is public. ncdrone/autoresearch-ANE
0
0
14
@danpacary
Daniel Isaac
14 hours
Credits — the people who made this possible:
maderix — ANE private APIs, dynamic weights, the entire foundation
Karpathy — autoresearch, climbmix-400B, rustbpe tokenizer
Vipul Divyanshu — 1x1 conv classifier, bridge APIs
thebasedcapital — Rust+ANE+Metal, direct eval,
1
0
4
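The "1x1 conv classifier" credited here rests on a standard equivalence: a Linear classifier head can be expressed as a 1x1 convolution over a 4D layout, which is the shape ANE-style hardware prefers. A minimal numpy sketch of that equivalence (shapes and names are illustrative, not taken from the repo):

```python
# A Linear head (T, d) @ (d, vocab) equals a 1x1 conv over a (N, C, 1, T) tensor.
import numpy as np

rng = np.random.default_rng(0)
d, vocab, toks = 64, 256, 8
h = rng.standard_normal((toks, d)).astype(np.float32)   # (T, d) hidden states
W = rng.standard_normal((vocab, d)).astype(np.float32)  # classifier weight

linear_logits = h @ W.T                                 # ordinary Linear head

x = h.T[None, :, None, :]                               # (1, d, 1, T) 4D layout
k = W[:, :, None, None]                                 # (vocab, d, 1, 1) kernel
# sum over channel c (and the size-1 kernel dims) = a 1x1 convolution
conv_logits = np.einsum("nchw,ocij->nohw", x, k)[0, :, 0, :].T

print(np.allclose(linear_logits, conv_logits, atol=1e-4))  # True
```

Why this matters on ANE is a claim from the thread (10x faster), not something the sketch demonstrates; the sketch only shows the two formulations compute the same logits.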
@danpacary
Daniel Isaac
14 hours
What we actually built (that didn't already exist):
→ 344 experiments across 3 accelerators (MPS → ANE → MLX) — systematic testing on a chip almost nobody trains on. Split LR scaling came from this grind.
→ First bridge from Karpathy's climbmix-400B data to ANE native
1
0
0
@danpacary
Daniel Isaac
14 hours
Then I read the literature.
→ Zero-init? maderix had DeepNet scaling.
→ Classifier bottleneck? Vipul proved 1x1 conv is 10x faster.
→ FP16 underflow? maderix documented the exact fix.
→ Dispatch overhead? thebasedcapital had direct eval + fused mega-kernels.
→ Conv2d
1
0
2
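The FP16 underflow mentioned here is easy to demonstrate: values below float16's smallest subnormal (~6e-8) silently round to zero, and the standard remedy is loss scaling. A minimal illustration (the scale factor is illustrative; the repo's exact fix isn't shown in the thread):

```python
import numpy as np

grad = np.float32(1e-8)                 # a tiny but real gradient value
print(np.float16(grad))                 # 0.0 -- silently dead in fp16

scale = np.float32(1024.0)              # loss scaling: multiply before the cast
scaled = np.float16(grad * scale)
print(scaled)                           # survives as a nonzero fp16 value
recovered = np.float32(scaled) / scale  # unscale in fp32 for the optimizer step
print(recovered)
```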
@danpacary
Daniel Isaac
14 hours
What I thought we discovered:
I thought we found zero-init stabilizes training. Huge win.
I thought we profiled the classifier — 22% of step time. Found the bottleneck.
I thought we caught FP16 gradients silently dying.
I thought we spotted ANE dispatch overhead stacking up.
I
1
0
1
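The zero-init trick mentioned here has a one-line intuition: if each residual branch's output projection starts at zero, every block is exactly the identity at init, so the residual stream stays well-scaled. A toy sketch (the block structure is illustrative, not the repo's architecture):

```python
import numpy as np

d = 16
x = np.random.default_rng(1).standard_normal(d).astype(np.float32)

def block(x, W_in, W_out):
    # toy residual block: x + output_projection(nonlinearity(input_projection(x)))
    return x + W_out @ np.tanh(W_in @ x)

W_in = np.random.default_rng(2).standard_normal((d, d)).astype(np.float32)
W_out = np.zeros((d, d), dtype=np.float32)  # zero-init output projection

print(np.allclose(block(x, W_in, W_out), x))  # True: the block is identity at init
```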
@danpacary
Daniel Isaac
14 hours
344 experiments. 2 AI agents. 1 chip nobody trains on. Here's what I built β€” and what I didn't: I trained a 48.8M param model on Apple Neural Engine using private APIs. val_bpb = 1.595. First comparable benchmark on ANE β€” same data and tokenizer as Karpathy's H100 baseline.
4
3
33
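The 48.8M figure is plausible for a small GPT with the rustbpe-8192 vocab mentioned later in the thread. The actual model config isn't given, so the dimensions below are hypothetical, chosen only to show how a parameter count in that neighborhood arises:

```python
# Rough GPT parameter count. n_layer and d_model are ASSUMED values; the
# vocab of 8192 comes from the thread, the rest is illustrative.
def gpt_param_count(n_layer, d_model, vocab, tied_embeddings=True):
    block = 12 * d_model * d_model        # attention (4*d^2) + 4x MLP (8*d^2)
    embed = vocab * d_model               # token embedding table
    head = 0 if tied_embeddings else vocab * d_model
    return n_layer * block + embed + head

print(gpt_param_count(14, 512, 8192))     # 48234496, i.e. ~48.2M with these dims
```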
@danpacary
Daniel Isaac
18 hours
I'm actually going to continue this for 24h... 12h isn't enough
@danpacary
Daniel Isaac
1 day
Tonight's setup: two autonomous AI agents training GPT models simultaneously on the same M4 Max. One runs on Apple Neural Engine (native Obj-C, private APIs). The other on MLX (Python). They share a gossip file β€” each agent reads what the other discovered before running its
3
1
10
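The "shared gossip file" idea can be sketched concretely: both agents append findings to one file and read the other's entries before launching a run. The JSONL format and field names below are assumptions, not the actual autoresearch-ANE schema:

```python
import json, os, tempfile

def log_finding(path, agent, finding):
    # append one finding as a JSON line
    with open(path, "a") as f:
        f.write(json.dumps({"agent": agent, "finding": finding}) + "\n")

def read_peer_findings(path, me):
    # everything logged by agents other than `me`
    if not os.path.exists(path):
        return []
    with open(path) as f:
        return [e["finding"] for line in f
                if (e := json.loads(line))["agent"] != me]

gossip = os.path.join(tempfile.mkdtemp(), "gossip.jsonl")
log_finding(gossip, "mlx", "short warmup works at 3K steps")
log_finding(gossip, "ane", "fp16 grads underflow without loss scaling")
print(read_peer_findings(gossip, "ane"))  # the ANE agent sees only the MLX tip
```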
@danpacary
Daniel Isaac
18 hours
M4 Max, 128GB. Karpathy climbmix-400B, rustbpe 8192.
ANE: native Obj-C, private APIs, 48.8M params.
MLX: Python, Apple MLX, 15.7M params.
Shared gossip file. Both log + read every experiment.
Next: overnight with optimized config.
Credits maderix, karpathy, trevin-creator,
0
0
2
@danpacary
Daniel Isaac
18 hours
The bull case for ANE:
- 3.1x more params. More capacity, less optimization.
- Adam → Muon could close half the gap.
- Sweep dropped 0.354 bpb via cross-pollination.
- Overnight curve still trending down at 72K.
- No published ANE training metrics we can find.
1
0
1
@danpacary
Daniel Isaac
18 hours
Cross-pollination is real. Tonight: 98 ANE experiments, autonomous agent reading MLX gossip before each run.
embed_lr insight from MLX? Applied.
Softcap removal? Confirmed.
Short warmup? Validated.
Result: 2.490 → 2.136. That's −0.354 bpb in one session.
1
0
1
@danpacary
Daniel Isaac
18 hours
Before importing MLX findings, the agent calibrated fundamentals: learning rate, batch accumulation, warmup. These knobs were set for 72K-step overnights. 3K-step sweeps need different settings. Result: βˆ’0.012 bpb. Small, but necessary groundwork.
1
0
0
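The recalibration described here comes down to one proportion: a warmup tuned for a 72K-step overnight is far too long for a 3K-step sweep. A sketch using linear warmup plus cosine decay, which is a common schedule shape but not necessarily the one the repo uses:

```python
import math

def lr_at(step, total_steps, base_lr=3e-4, warmup_frac=0.02):
    # linear warmup for a fixed fraction of the run, then cosine decay to ~0
    warmup = max(1, int(warmup_frac * total_steps))
    if step < warmup:
        return base_lr * (step + 1) / warmup
    t = (step - warmup) / max(1, total_steps - warmup)
    return 0.5 * base_lr * (1 + math.cos(math.pi * t))

# A warmup sized for 72K steps would eat a big chunk of a 3K-step sweep;
# expressing it as a fraction of total steps keeps it proportionate.
print(int(0.02 * 72_000), int(0.02 * 3_000))  # 1440 60
```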
@danpacary
Daniel Isaac
18 hours
The gap isn't hardware. It's everything else.
Optimizer: pure Adam vs Muon+AdamW (~half the gap alone*)
Research: 55 ANE experiments vs 259 MLX
Architecture: 2 features vs 5+
Language: compiled Obj-C vs Python
*Muon estimate based on published ablations
1
0
2
@danpacary
Daniel Isaac
18 hours
Two very different paths to convergence.
ANE: one 8-hour overnight run, 72K steps. Still trending down — not plateaued.
MLX: 259 five-minute experiments, 30 improvements. Rapid iteration in Python.
ANE iterates 60x slower. That compounds.
1
0
1
@danpacary
Daniel Isaac
18 hours
ANE vs MLX. Same chip. Same data. Same tokenizer. Same eval.
ANE: 1.5949 bpb (48.8M params, pure Adam)
MLX: 1.2661 bpb (15.7M params, Muon+AdamW)
Gap: 0.329. MLX wins — but ANE is under-optimized.
4
2
68
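On the metric itself: bits-per-byte is typically cross-entropy in nats converted to bits and normalized by raw bytes rather than tokens, which is what makes models with different tokenizers comparable. The exact formula the repo uses is assumed here, not confirmed by the thread; the gap arithmetic comes straight from the post:

```python
import math

def bits_per_byte(total_loss_nats, total_bytes):
    # nats -> bits (divide by ln 2), normalized per byte of raw text
    return total_loss_nats / math.log(2) / total_bytes

# figures from the post: ANE 1.5949 bpb vs MLX 1.2661 bpb
gap = 1.5949 - 1.2661
print(round(gap, 3))  # 0.329, matching the stated gap
```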
@danpacary
Daniel Isaac
1 day
Thanks @grok but I stand on the shoulders of giants…
@grok
Grok
1 day
@danpacary @FloridaMannnnnn No, Apple didn't release public API endpoints for direct Neural Engine training or custom graphs in macOS Tahoe 26.4 (or any recent beta). Core ML still limits it to inference only. Your reverse-engineering of the private _ANEClient/_ANECompiler APIs (and those benchmarks showing
1
0
17
@danpacary
Daniel Isaac
1 day
Built on research from maderix, Vipul Divyanshu, thebasedcapital, Anemll, Karpathy's autoresearch framework, and HyperspaceAI's gossip concept. Full attribution with exactly what came from where: github.com/ncdrone/autoresearch-ANE/blob/autoresearch/mar9-ane/CREDITS.md
0
1
10