N8 Programs
@N8Programs
Followers
7K
Following
4K
Media
508
Statuses
4K
Studying Applied Mathematics and Statistics at @JohnsHopkins. Studying In-Context Learning at The Intelligence Amplification Lab.
Proxima Centauri B
Joined September 2022
LLMs struggle to count "r"s in strawberry due to tokenizers. But how significant is this limitation? Does it depend on model scale? Can it be overcome with ICL? How easy is training a model that can count characters? I answer all these questions + more in my latest Substack 👇
1
5
14
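A minimal sketch of the tokenizer point in the post above, assuming the Hugging Face `transformers` package and the GPT-2 tokenizer as a stand-in (not the tokenizer of any particular production model):

```python
# Sketch only: shows why character counts are hidden from an LLM.
# Assumes `pip install transformers`; GPT-2's tokenizer is a stand-in here.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
word = "strawberry"

print(tok.tokenize(word))   # a few subword pieces, not ten characters
print(word.count("r"))      # 3 - trivial once you operate on characters directly
# The model only ever receives IDs for the subword pieces, so the three 'r's
# are never explicitly present in its input - hence the counting failures.
```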
broke: using a pre-built SFT framework for cloud training on an H100 woke: make a new MLX script for every project that re-implements the same thing with slight differences and yolo it on the H100
0
0
6
Can’t wait!
@N8Programs yeah we had to move on to the next model and make efficiency improvements so we can do a 4-week training run in 1 week next time :D
0
0
1
Notable that many of the evals don't appear to have plateaued - with more compute, they could push this even further - the model is likely nowhere near its ceiling.
Olmo 3.1 32B Think shows that not just frontier labs can scale RL. My favorite RL run yet over 7+ years of doing RL. The biggest fully open RL run ever? We left the same RL job running from our v3 Think for an extra 3 weeks. When we were releasing Olmo 3 32B on Nov. 20th we had
2
1
20
Immediate takeaway for GPT-5.2 is how long it works - here it spent 32 min on Slides:
0
1
8
two years ago it was hard to imagine that models were going to do grade school math much less completely saturate AIME. they couldn’t do the most basic problems, it was actually alarming. people forget
85
90
2K
Pretty clear the latest Nov/Dec 2025 family of AI reasoning systems has significantly improved fluid intelligence over the knowledge domains they were trained on (e.g. code). Big step up from 6-9 months ago. ARC-AGI shows as much improvement.
4
6
114
To summarize: ChatGPT (or any LLM) can both "know things" and "make things up".
@katherineveritt This is demonstrably false. Even just autocompleting sentences is enough to know things - if a model has lower perplexity (confusion) on: "The president of the US during the Civil War was Abraham Lincoln." (Perplexity: 12.375) "The president of the US during the Civil War was
1
0
3
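A minimal sketch of the perplexity comparison above, using GPT-2 as a stand-in model (the post doesn't say which model produced the 12.375 figure):

```python
# Sketch: lower perplexity on the true completion is the sense in which a
# pure autocompleter can be said to "know" a fact. GPT-2 is a stand-in model.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # passing labels=ids makes the model return mean cross-entropy over the sequence
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

true_claim  = "The president of the US during the Civil War was Abraham Lincoln."
false_claim = "The president of the US during the Civil War was George Washington."
print(perplexity(true_claim), perplexity(false_claim))
# Expect the true sentence to score lower (less "confusion") than the false one.
```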
Note: all numbers in this thread are reported using **6-bit quants**, as 4-bit quants (without any sort of learned quantization) tend to hurt agentic abilities/long-form reasoning.
After a bit of work (FP8 dequant, some debugging), got Mistral Vibe working on my M3 Max w/ Devstral 2 Small, running locally w/ LM Studio. It worked quite well in a quick demo, as shown below (everything below is happening 100% locally) - video at 4x speed:
1
0
17
Clarification - this is for *laptops*. An M3 Ultra will have no such issues.
The main drawback (at least on Metal) is the slow speed. While 24Bs can run at 20 tok/sec, this requires the entire GPU and is unsustainable for more than a few minutes. Typically, for long-running tasks, a Mac must be set to 'Low Power Mode' - and here, speed drops to 7 tok/sec.
2
0
10
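Rough back-of-envelope for the numbers above (my own estimate, not from the post): dense-model decode on Apple Silicon is roughly memory-bandwidth-bound, so tok/sec ≈ bandwidth / bytes of weights read per token. Assuming ~400 GB/s for an M3 Max at full power and a much lower effective bandwidth when power-limited:

```python
# Back-of-envelope decode-speed estimate for a dense model (sketch, assumed numbers).
params = 24e9                # Devstral 2 Small: ~24B parameters
bits_per_weight = 6          # the 6-bit quants used for numbers in this thread
weight_bytes = params * bits_per_weight / 8   # ~18 GB of weights read per decoded token

for bandwidth_gbs, label in [(400, "M3 Max, full power (assumed)"),
                             (150, "power-limited (assumed)")]:
    tok_per_sec = bandwidth_gbs * 1e9 / weight_bytes
    print(f"{label}: ~{tok_per_sec:.0f} tok/sec")
# Prints roughly ~22 and ~8 tok/sec - in the ballpark of the observed 20 and 7.
```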
As a bonus, have a video (not sped up) of Qwen3-Coder-30B-A3B solving the same task I gave Devstral 2:
0
2
10
My verdict: Devstral 2 is an extremely impressive model performance-wise, but decoding speed severely limits its utility on Mac *specifically* - where MoEs are best. It is far more practical if you have a 5090, which can fit it in VRAM and decode w/ FP8 accel at 70+ tok/sec.
1
0
12
What I am incredibly optimistic about is that squeezing this sort of perf out of a 24B model is no easy feat (knockout job by Mistral), and it indicates similar perf could be given to an MoE in this size range (i.e. a 30B-3B could reach this level with the right post-training).
2
0
9
While slow prompt processing is alleviated by more compute (M5, any NVIDIA card), the single-stream decode speed is not. It's far lower than what a similarly sized MoE offers, and makes upgrading non-obvious. If you can run Qwen3-Coder-30B-A3B 3-5x faster, is Devstral 2 Small worth +15% on
1
1
11
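Same bandwidth-bound sketch as above, contrasting the active parameters of a dense 24B with a 30B-A3B MoE (illustrative numbers only; real gaps are smaller because of routing overhead, KV-cache reads, and prompt processing):

```python
# Sketch: why an MoE with few active parameters decodes much faster (assumed numbers).
def decode_tok_per_sec(active_params: float, bits_per_weight: int, bandwidth_gbs: float) -> float:
    bytes_per_token = active_params * bits_per_weight / 8   # weights touched per decoded token
    return bandwidth_gbs * 1e9 / bytes_per_token

bw = 400  # assumed GB/s of memory bandwidth
print(decode_tok_per_sec(24e9, 6, bw))  # dense ~24B (Devstral 2 Small): ~22 tok/sec
print(decode_tok_per_sec(3e9, 6, bw))   # ~3B active (e.g. Qwen3-Coder-30B-A3B): ~180 tok/sec
# Even with real-world overheads shrinking the gap, this is the multiple being
# weighed against Devstral's quality edge.
```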