BlinkDL
@BlinkDL_AI
Followers: 9K · Following: 413 · Media: 192 · Statuses: 476
RWKV = 100% RNN with GPT-level performance. https://t.co/TkdxOJSFWX and https://t.co/86DzS6arA0
Joined September 2022
RWKV-7 G0a3 13.3B: strongest pure RNN ever, MMLU 76.0% (+CoT=82.5%), MATH500 76.0%, GSM8K 92.3%, MMLU Pro 49.8% (+CoT=61.6%), no eval-maxxing / mid-training / post-training. Download: https://t.co/oJxpuJoeQD Ollama: https://t.co/bScaOJSoD0
10250+ tokens/s: RWKV-7 7.2B fp16, bsz 960 @ RTX 5090. 123+ tokens/s: RWKV-7 7.2B fp16, bsz 1 @ RTX 5090. https://t.co/YW3XbVuuCP Always constant speed & VRAM, because we are an RNN.
Now 4 community ROSA projects: https://t.co/QwXDJCfRPz
https://t.co/H8lgDYi54T
https://t.co/uTvtyxU6A8
github.com: x-0D/RASP
How RWKV-7 models evolve by adding better data.
RWKV-7 G0a3 7.2B: pure RNN with MMLU 65.0% (+CoT=72.3%), MATH500 67.8%, GSM8K 83.9%, MMLU Pro 35.9% (+CoT=52.1%), and no eval-maxxing, no mid-training, no post-training. Download: https://t.co/oJxpuJoeQD G0a3 13.3B releases very soon.
RWKV7+ROSA, 1M params, solving 40-digit +/- with 99% digit accuracy, without CoT. Demo: https://t.co/j0eFQDISvu
RWKV7 vs RWKV7+ROSA v251020 vs RWKV7+ROSA v251021 (same arch & params as v251020, better training method).
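A minimal sketch of how such an eval could be scored, assuming a plain "a+b=" / "a-b=" prompt format and right-aligned per-digit comparison (the actual data format and scoring script are not shown in the thread; `model.generate` below is a hypothetical call):

```python
import random

def make_problem(n_digits=40):
    # assumed format: "a+b=" or "a-b=", answer as a plain digit string
    a = random.randrange(10 ** (n_digits - 1), 10 ** n_digits)
    b = random.randrange(10 ** (n_digits - 1), 10 ** n_digits)
    op = random.choice(["+", "-"])
    ans = a + b if op == "+" else a - b
    return f"{a}{op}{b}=", str(ans)

def digit_accuracy(pred: str, gold: str) -> float:
    # right-align and compare character by character
    width = max(len(pred), len(gold))
    pred, gold = pred.rjust(width), gold.rjust(width)
    return sum(p == g for p, g in zip(pred, gold)) / width

prompt, gold = make_problem()
# pred = model.generate(prompt)          # hypothetical model call
# print(digit_accuracy(pred, gold))
```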
Training finished. Solid improvements, though unstable (will fix). This is learning to + and - large random numbers (not using a loss mask, so the loss appears higher).
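A minimal sketch of the loss-mask point, with toy tensors: without a mask the average includes the near-incompressible random input digits, so the reported loss reads higher even when the answer tokens are predicted well.

```python
import torch
import torch.nn.functional as F

B, T, V = 1, 16, 256
logits = torch.randn(B, T, V)                 # toy model outputs
targets = torch.randint(0, V, (B, T))
answer_mask = torch.zeros(B, T)
answer_mask[:, 10:] = 1.0                     # suppose only the last 6 tokens are the answer

per_token = F.cross_entropy(logits.transpose(1, 2), targets, reduction="none")
unmasked = per_token.mean()                                   # averages over random digits too
masked = (per_token * answer_mask).sum() / answer_mask.sum()  # answer-only loss
```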
Now I think I should try matching seqA with seqB, to avoid "matching matching" (very complicated behavior), and of course one can match seqQ with seqK to fetch seqV.
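A minimal sketch of one reading of "match seqQ with seqK to fetch seqV" (the tweet does not spell out the actual ROSA algorithm): at each position, find the longest earlier occurrence of the current suffix of seqQ inside seqK, then fetch the seqV entry just after the match, induction-head style. Brute-force O(T^2) for clarity.

```python
def fetch(seq_q, seq_k, seq_v):
    out = []
    for t in range(len(seq_q)):
        best_len, best_pos = 0, None
        for s in range(t):
            # length of the common suffix of seq_q[:t+1] and seq_k[:s+1]
            l = 0
            while l <= s and l <= t and seq_q[t - l] == seq_k[s - l]:
                l += 1
            if l > best_len:
                best_len, best_pos = l, s
        # fetch the value just after the best match, None if no match
        ok = best_pos is not None and best_pos + 1 < len(seq_v)
        out.append(seq_v[best_pos + 1] if ok else None)
    return out

# e.g. fetch("abcab", "abcab", "ABCAB")[-1] == "C":
# the suffix "ab" last occurred at position 1, and "C" followed it.
```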
RWKV8 ROSA simply scales, producing mysterious new languages. Training small LMs soon. Code: https://t.co/j0eFQDJql2
LM inventing inner monologue languages, enabled by RWKV8 multi-layer ROSA via fully end-to-end training (next-token prediction). Code: https://t.co/ktMBpi2kfI
RWKV8 ROSA training demo - the first serious neurosymbolic LM? For a new era in AI. Code: https://t.co/j0eFQDISvu
A working trainable ROSA layer using the 1bit + "local" gradient idea here:
github.com: RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose".
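A minimal sketch of the "1bit + local gradient" idea as described (the working layer lives in the linked repo; this is not its code): a hard 1-bit forward pass, with a straight-through-style surrogate gradient in the backward pass so the layer stays trainable end-to-end.

```python
import torch

class OneBitSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return (x > 0).float() * 2 - 1          # hard {-1, +1} forward

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        # "local" gradient: pass gradient only near the threshold (|x| <= 1),
        # a clipped-identity surrogate for the non-differentiable sign
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(8, requires_grad=True)
y = OneBitSTE.apply(x)
y.sum().backward()                              # x.grad holds the surrogate gradient
```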
RWKV-8 ROSA mechanism: a neurosymbolic infinite-range lossless information propagator beyond attention, enabling LLMs to invent their own inner monologue languages. First step towards scalable post-neural methods, for a new era in AI.
The new mechanism in RWKV-8 "Heron" is named ROSA (an acronym; note SA ≠ Self-Attention here). ROSA is compromise-free: we get efficient, scalable, genuine infinite ctx by applying some beautiful algorithms.
By "everything" I mean reasoning/instruction/chat data, not test set ð
RWKV-7 G1a 2.9B more evals: https://t.co/X2R2f6EeRB MMLU Pro 42% (+CoT), GSM8K 77%, MATH 50%. Note this is a base model: no mid-training, no post-training. I just add everything to the pretraining dataset.
RWKV-7 G1a 2.9B: pure RNN surpassing Gemma3 4B and Llama3.2 3B in some areas; supports two reasoning styles and length control. Download: https://t.co/oJxpuJoeQD G1a 1.5/0.4/0.1B & G0a 7B updated; G0 13B release in Oct.