Daniel Isaac
@danpacary
Followers
711
Following
3K
Media
206
Statuses
2K
idk what I'm doing half the time. space, drones, AI & physics
Earth
Joined September 2023
I hijacked Apple's Neural Engine -- the chip built for Siri and photo filters. Reverse-engineered the private APIs and trained a full LLM on it. Zero fan noise. Zero GPU. Just the Neural Engine doing what nobody thought it could. Your Mac has one too.
34
116
1K
404 experiments. 105 hours. 1 Mac.
3 accelerators (MPS → ANE → MLX)
2 AI agents running simultaneously
63h wall time
77 keeps
19% keep rate
81% of experiments failed
ANE agent still running. Count going up.
more to come...
3
0
29
anyone can be a researcher, hacker, builder just do the work
Anyone can do this work. Even you, reading this, right now. Every M-series Mac has a Neural Engine. Doing nothing... We got it working with private APIs and Obj-C. Not pretty. Not easy. But it works. The code is open. The data is public. ncdrone/autoresearch-ANE
0
0
14
Credits – the people who made this possible:
maderix – ANE private APIs, dynamic weights, the entire foundation
Karpathy – autoresearch, climbmix-400B, rustbpe tokenizer
Vipul Divyanshu – 1x1 conv classifier, bridge APIs
thebasedcapital – Rust+ANE+Metal, direct eval,
1
0
4
What we actually built (that didn't already exist):
• 344 experiments across 3 accelerators (MPS → ANE → MLX) – systematic testing on a chip almost nobody trains on. Split LR scaling came from this grind.
• First bridge from Karpathy's climbmix-400B data to ANE native
1
0
0
Then I read the literature.
• Zero-init? maderix had DeepNet scaling.
• Classifier bottleneck? Vipul proved 1x1 conv is 10x faster.
• FP16 underflow? maderix documented the exact fix.
• Dispatch overhead? thebasedcapital had direct eval + fused mega-kernels.
• Conv2d
1
0
2
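The 1x1-conv classifier trick mentioned above is essentially a layout change: a linear head and a 1x1 convolution compute the same matrix multiply, but ANE's conv path executes it far faster. A minimal numpy sketch of the equivalence (shapes and function names are mine, not from the repo):

```python
import numpy as np

def linear_head(x, w):
    # x: [N, d] activations, w: [V, d] classifier weights -> [N, V] logits
    return x @ w.T

def conv1x1_head(x, w):
    # same weights applied as a 1x1 conv over an NCHW tensor [1, d, 1, N],
    # the layout ANE's conv engine is built around
    xc = x.T[None, :, None, :]                   # [1, d, 1, N]
    out = np.einsum('vd,bdhn->bvhn', w, xc)      # 1x1 conv == per-position matmul
    return out[0, :, 0, :].T                     # back to [N, V]

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
w = rng.standard_normal((32, 16))
assert np.allclose(linear_head(x, w), conv1x1_head(x, w))
```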
What I thought we discovered: I thought we found zero-init stabilizes training. Huge win. I thought we profiled the classifier – 22% of step time. Found the bottleneck. I thought we caught FP16 gradients silently dying. I thought we spotted ANE dispatch overhead stacking up. I
1
0
1
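"FP16 gradients silently dying" is classic half-precision underflow, and the standard remedy is loss scaling: multiply the loss (and therefore the gradients) by a constant so they stay representable in fp16, then unscale in fp32 before the optimizer step. A minimal numpy sketch; the scale value is hypothetical, not the specific fix maderix documented:

```python
import numpy as np

SCALE = 2.0 ** 12                       # hypothetical loss scale

grad = 1e-8                             # below fp16's subnormal floor (~6e-8)
naive = np.float16(grad)                # silently underflows to exactly 0.0
scaled = np.float16(grad * SCALE)       # survives in fp16
recovered = np.float32(scaled) / SCALE  # unscale in fp32 before the update

assert naive == 0.0 and recovered > 0.0
```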
344 experiments. 2 AI agents. 1 chip nobody trains on. Here's what I built – and what I didn't: I trained a 48.8M param model on Apple Neural Engine using private APIs. val_bpb = 1.595. First comparable benchmark on ANE – same data and tokenizer as Karpathy's H100 baseline.
4
3
33
I'm actually going to continue this for 24h... 12h isn't enough
Tonight's setup: two autonomous AI agents training GPT models simultaneously on the same M4 Max. One runs on Apple Neural Engine (native Obj-C, private APIs). The other on MLX (Python). They share a gossip file – each agent reads what the other discovered before running its
3
1
10
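The gossip file can be as simple as an append-only JSONL log that each agent writes to and filters on read. A minimal sketch of that pattern; the file name and record schema are my assumptions, not HyperspaceAI's actual format:

```python
import json
import pathlib

GOSSIP = pathlib.Path("gossip.jsonl")   # hypothetical shared file name

def publish(agent, finding):
    # each agent appends one JSON record per finished experiment
    with GOSSIP.open("a") as f:
        f.write(json.dumps({"agent": agent, "finding": finding}) + "\n")

def read_peers(me):
    # before launching a run, read everything the *other* agent learned
    if not GOSSIP.exists():
        return []
    records = [json.loads(line) for line in GOSSIP.read_text().splitlines()]
    return [r["finding"] for r in records if r["agent"] != me]
```

Append-plus-rescan keeps the two processes decoupled: neither blocks on the other, and a crashed agent loses nothing its peer already logged.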
M4 Max, 128GB. Karpathy climbmix-400B, rustbpe 8192. ANE: native Obj-C, private APIs, 48.8M params. MLX: Python, Apple MLX, 15.7M params. Shared gossip file. Both log + read every experiment. Next: overnight with optimized config. Credits maderix, karpathy, trevin-creator,
0
0
2
The bull case for ANE:
- 3.1x more params. More capacity, less optimization.
- Adam → Muon could close half the gap.
- Sweep dropped 0.354 bpb via cross-pollination.
- Overnight curve still trending down at 72K.
- No published ANE training metrics we can find.
1
0
1
Cross-pollination is real. Tonight: 98 ANE experiments, autonomous agent reading MLX gossip before each run. embed_lr insight from MLX? Applied. Softcap removal? Confirmed. Short warmup? Validated. Result: 2.490 → 2.136. That's −0.354 bpb in one session.
1
0
1
Before importing MLX findings, the agent calibrated fundamentals: learning rate, batch accumulation, warmup. These knobs were set for 72K-step overnights. 3K-step sweeps need different settings. Result: −0.012 bpb. Small, but necessary groundwork.
1
0
0
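One way to recalibrate those knobs is to express warmup as a fraction of total steps rather than an absolute count, so a 3K-step sweep and a 72K-step overnight get proportionally similar schedules. A hypothetical sketch, not the agent's actual schedule:

```python
import math

def lr_at(step, total_steps, base_lr, warmup_frac=0.02):
    # Linear warmup over a *fraction* of the run, then cosine decay.
    # warmup_frac=0.02 is a hypothetical default, not the agent's value.
    warmup = max(1, int(total_steps * warmup_frac))
    if step < warmup:
        return base_lr * (step + 1) / warmup
    t = (step - warmup) / max(1, total_steps - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * t))

# the same warmup_frac yields 60 warmup steps for a 3K sweep
# and 1440 for a 72K overnight, instead of one absolute number
```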
The gap isn't hardware. It's everything else.
Optimizer: pure Adam vs Muon+AdamW (~half the gap alone*)
Research: 55 ANE experiments vs 259 MLX
Architecture: 2 features vs 5+
Language: compiled Obj-C vs Python
*Muon estimate based on published ablations
1
0
2
Two very different paths to convergence. ANE: one 8-hour overnight run, 72K steps. Still trending down – not plateaued. MLX: 259 five-minute experiments, 30 improvements. Rapid iteration in Python. ANE iterates 60x slower. That compounds.
1
0
1
ANE vs MLX. Same chip. Same data. Same tokenizer. Same eval.
ANE: 1.5949 bpb (48.8M params, pure Adam)
MLX: 1.2661 bpb (15.7M params, Muon+AdamW)
Gap: 0.329. MLX wins – but ANE is under-optimized.
4
2
68
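For reference, bits-per-byte is token-level cross-entropy converted from nats to bits and renormalized by byte count, which is why the comparison only holds when data and tokenizer match. A minimal sketch; the helper and the example numbers are mine, not measurements from the thread:

```python
import math

def bits_per_byte(ce_loss_nats, n_tokens, n_bytes):
    # cross-entropy (nats/token) -> bits/token -> bits/byte
    return (ce_loss_nats / math.log(2)) * n_tokens / n_bytes

# illustrative: at ~4.4 bytes/token, a loss of 4.87 nats/token is ~1.60 bpb
bpb = bits_per_byte(4.87, 1, 4.4)
```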
Thanks @grok but I stand on the shoulders of giants…
@danpacary @FloridaMannnnnn No, Apple didn't release public API endpoints for direct Neural Engine training or custom graphs in macOS Tahoe 26.4 (or any recent beta). Core ML still limits it to inference only. Your reverse-engineering of the private _ANEClient/_ANECompiler APIs (and those benchmarks showing
1
0
17
Built on research from maderix, Vipul Divyanshu, thebasedcapital, Anemll, Karpathy's autoresearch framework, and HyperspaceAI's gossip concept. Full attribution with exactly what came from where: github.com/ncdrone/autoresearch-ANE/blob/autoresearch/mar9-ane/CREDITS.md
0
1
10