かまろ/Camaro
@mlaass1
Followers 3K · Following 9K · Media 405 · Statuses 6K
Kaggle Grandmaster
Osaka, Japan
Joined November 2018
Once again today, codex rolled out some mystery logic and flatly refused to admit it was wrong, so I kept fighting it to the bitter end and ultimately won. codex (or rather GPT5) seems to suffer from an "admit a mistake and you die" disease.
I noticed something pretty interesting about the ARC Prize, but there are only 6 days left... I had a whole half year, what have I been doing? https://t.co/Hp9w8Oomux
kaggle.com
Create an AI capable of novel reasoning
It's less that I want to condemn this and more that I'm simply curious how it works. Are there bots for sale that monitor viral English posts, translate them, and repost them?
This paper has no MIT-affiliated authors. Most likely this viral post (likewise an automated generative-AI post?) got ripped off and translated, and the misinformation propagated as a result. https://t.co/EVF6mn4hyu
This does not diminish the algorithm or the paper’s claims, and it does not affect inference speed as the model selects a single embedding per puzzle. But calling it “7M” for ARC-AGI is a misunderstanding. https://t.co/IL1UM267Pl
github.com
Hi, thanks for sharing this interesting work! I have a question regarding the reported model size for the ARC-AGI setup. The paper mentions that the model has only 7M parameters, but when including...
This is not 7M params for ARC-AGI; it's 400M+. The 7M applies to a single puzzle like Sudoku or Maze. For ARC-AGI, the model stores a puzzle embedding for every puzzle, including augmented ones. For ARC1 that was 800,000+ puzzles × 512 dimensions.
We verified the TRM results on the semi-private holdout set and they're legit. Awesome work and contribution to the open source community by @jm_alexia. My notes: * This model is tiny! 7M params, but the rub is that it is relatively expensive to run because pre-training and
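For scale, here is a quick back-of-the-envelope sketch of the embedding table's size, using the 876,406-puzzle count quoted elsewhere in this thread; the bytes-per-value figure assumes bf16/fp16 storage:

```python
# Rough size of the TRM puzzle-embedding table for ARC1,
# per the 876,406 puzzles × 512 dims cited in this thread.
num_puzzles = 876_406
emb_dim = 512

n = num_puzzles * emb_dim
print(f"{n:,} embedding values")                  # 448,719,872 -> ~449M
print(f"~{n * 2 / 1e9:.1f} GB at 2 bytes/value")  # ~0.9 GB in bf16/fp16 (assumed)
```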
Does an ImageNet speedrun exist?
New CIFAR-10 training speed record: 94% in 1.99 seconds on one A100
Previous record: 2.59 seconds (Nov. 10th 2024)
New record-holder: Algorithmic discovery engine developed by @hivergeai
Changelog:
- Muon: Vectorize NS iter and reduce frequency of 'normalize weights' step
1/3
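For context, the "NS iter" in the changelog is the Newton-Schulz orthogonalization step inside the Muon optimizer. A minimal sketch of that step is below; the quintic coefficients follow Keller Jordan's public Muon implementation, while the dtype and step count here are assumptions, not the record run's actual code:

```python
import torch

def newton_schulz5(G: torch.Tensor, steps: int = 5) -> torch.Tensor:
    """Approximately orthogonalize G via a quintic Newton-Schulz iteration
    (the "NS iter" inside Muon; coefficients per Keller Jordan's public code)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G.bfloat16()
    transposed = G.size(0) > G.size(1)
    if transposed:                 # iterate on the smaller Gram matrix
        X = X.mT
    X = X / (X.norm() + 1e-7)      # spectral norm <= 1 so the iteration converges
    for _ in range(steps):
        A = X @ X.mT
        B = b * A + c * A @ A
        X = a * X + B @ X          # X <- a*X + b*(XX^T)X + c*(XX^T)^2 X
    if transposed:
        X = X.mT
    return X.to(G.dtype)
```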
I was puzzled why it takes almost 300 H100-hours to train a 7M-param model, but now it makes sense.
Maybe it’s just a mistake. The parameter count is computed here (https://t.co/0syEcZSCJF), but the embeddings are stored as buffers (https://t.co/lQvlY3bmgz), not as nn.Parameter, so they are never seen when .numel() is summed over the model's parameters.
github.com
Contribute to SamsungSAILMontreal/TinyRecursiveModels development by creating an account on GitHub.
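A minimal, generic PyTorch sketch (not the TRM code itself) of why that happens: register_buffer puts a tensor in the state_dict but not in model.parameters(), so a count that sums .numel() over parameters() never sees it.

```python
import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self, num_puzzles: int = 1000, dim: int = 512):
        super().__init__()
        self.core = nn.Linear(dim, dim)  # registered as nn.Parameter -> counted
        # Stored as a buffer (as the TRM repo does for puzzle embeddings):
        # saved in the state_dict, but absent from model.parameters().
        self.register_buffer("puzzle_emb", torch.zeros(num_puzzles, dim))

model = Tiny()
print(f"params:  {sum(p.numel() for p in model.parameters()):,}")  # 262,656 (weight + bias only)
print(f"buffers: {sum(b.numel() for b in model.buffers()):,}")     # 512,000 (embedding table)
```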
Wait, it claims to use a tiny 7M-parameter NN, but why are the 400M+ puzzle embeddings (876,406 puzzles × 512 dims for ARC1) excluded? Is that a common practice?
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M parameters neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: https://t.co/w5ZDsHDDPE Code: https://t.co/7UgKuD9Yll Paper:
Started training TRM on a single RTX4090, and about 24h later it's starting to get a non-zero score on the ARC1 eval...
Lately I've been practicing English conversation (speaking, really), and the vowel sounds are way too hard. I can't even tell them apart by ear, so there's no way I can pronounce them.