Adam Ritter Profile
Adam Ritter

@ritteradam

Followers
398
Following
17K
Media
9
Statuses
336

Joined May 2009
@ritteradam
Adam Ritter
3 days
@huskydogewoof @jm_alexia Have you tried comparing with the dynamic steps in my repo? I added some instructions to https://t.co/Y7MSW1Qdmm README to show how to use it for evals
github.com
Contribute to adamritter/TinyRecursiveModels development by creating an account on GitHub.
2
3
7
@jm_alexia
Alexia Jolicoeur-Martineau
3 days
This is legit insane and it makes so much sense. This idea will be a key piece for true machine intelligence.
3
5
105
@jm_alexia
Alexia Jolicoeur-Martineau
3 days
Insane finding! You train with at most 16 improvement steps, but at inference you run as many steps as possible (448 steps) and you reach crazy accuracy. This is how you build intelligence!!
@huskydogewoof
Benhao Huang
3 days
@jm_alexia @ritteradam Indeed, I also find that simply increasing the number of inference steps, even when the model is trained with only 16, can substantially improve performance. (Config: TRM-MLP-EMA on Sudoku1k, though the 16-step one only reached 84% instead of 87%.)
17
41
424
@ritteradam
Adam Ritter
4 days
@jm_alexia Hey Alexia, congrats, your model is amazing! I could improve the Sudoku-Extreme example from 87% to 96% by improving the evaluation, but the results so far haven't translated to the ARC dataset, because the q heads are not accurate enough there.
2
8
84
@jm_alexia
Alexia Jolicoeur-Martineau
23 days
New paper 📜: Tiny Recursion Model (TRM) is a recursive reasoning approach with a tiny 7M-parameter neural network that obtains 45% on ARC-AGI-1 and 8% on ARC-AGI-2, beating most LLMs. Blog: https://t.co/w5ZDsHDDPE Code: https://t.co/7UgKuD9Yll Paper:
arxiv.org
Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats Large Language models (LLMs) on...
137
654
4K
@ritteradam
Adam Ritter
4 months
It's using @htmx_org as its frontend library btw, so HTMX fans will love it I guess (or hate it worst case)
1
0
3
@ritteradam
Adam Ritter
4 months
You can try it at https://t.co/tyTc1LHgBo and play with a live reactive multi-user example (other people can see your changes instantly) at https://t.co/sog07VwieJ What do you guys think?
1
0
2
@ritteradam
Adam Ritter
4 months
I've been working on a reactive full stack web framework, and would love some feedback. The main idea is to get back to the old style of web development with just putting HTML and SQL together, but the SQL queries are reactive, which makes my tiny templating language reactive.
1
0
3
@ritteradam
Adam Ritter
9 months
I was playing a bit with the DeepSeek 1.5 bit model on my 128GB Mac. My basic result is that it can go from 2 tok/s up to 7 tok/s if only half of the experts are used (128 instead of 256). I used this command: build/bin/llama-cli --model
1
0
4
@ritteradam
Adam Ritter
10 months
How to tackle huge context with Scott numbers? Scott numbers are great, because they are affine untyped lambda expressions, but they are not practical for searching, as searching from them takes exponential time. For the "why" part let's look at the normal form of a Scott number:
0
0
5
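For readers unfamiliar with the encoding mentioned in the post above, here is a minimal sketch of Scott-encoded natural numbers as Python closures. It is not taken from the thread: the argument order (zero case first, successor case second) is one common convention chosen here as an assumption, and the helper names `from_int`, `to_int`, and `pred` are hypothetical. Each bound variable is used at most once in `zero` and `succ`, which is the affine property the post refers to.

```python
# Scott-encoded naturals (illustrative sketch, assumed argument order):
#   zero   = λz. λs. z
#   succ n = λz. λs. s n
# A numeral is a "case split" on itself: it takes a value for the zero
# case and a function for the successor case.

def zero(z):
    return lambda s: z          # ignores s, returns the zero-case value

def succ(n):
    return lambda z: lambda s: s(n)   # ignores z, hands the predecessor to s

def pred(n):
    # Constant-time predecessor: a single case split, no walk over n.
    # pred(zero) is defined as zero here.
    return n(zero)(lambda p: p)

def from_int(k):
    # Build the numeral for k by applying succ k times (hypothetical helper).
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    # Decode by peeling one successor per iteration (hypothetical helper).
    count = 0
    while True:
        prev = n(None)(lambda p: p)   # None marks the zero case
        if prev is None:
            return count
        n = prev
        count += 1
```

The normal form of a numeral built this way is a chain of nested `λz. λs. s (...)` layers ending in `λz. λs. z`, one layer per successor, which is why the predecessor is a single case split.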
@ritteradam
Adam Ritter
10 months
One of the interesting lambda calculus systems that came up when I was reading about substructural logic was Light Affine Lambda Calculus. It's interesting because it restricts lambda expressions to ones that evaluate in polynomial time. My favourite paper is from TERUI,
0
0
2
@ritteradam
Adam Ritter
10 months
Over the past few days I have been reading a bit more into linear calculus. I will write more about it, but I thought my favourite introductory video for linear logic (after looking through everything I could find on YouTube), by Paul Downen, is worth a mention: https://t.co/PybaeMJqIZ
0
1
4
@ritteradam
Adam Ritter
10 months
I've got a lot of followers who are interested in HVM / ARC-AGI / proof synthesis thanks to @VictorTaelin , so I thought it would be great to share an update. I added arbitrary affine untyped lambda functions to the program search that I implemented, but then I got stuck when I
1
0
27
@__tinygrad__
the tiny corp
10 months
Great software isn't invented, it's discovered.
23
25
481
@VictorTaelin
Taelin
10 months
Huge unexpected breakthrough today. The story: @ritteradam posted an HVM3 untyped λ-Calculus synthesizer on HOC's Discord that beats my version. More interestingly, it somehow used only 1 "duplication label", while my approach needed an arbitrary amount of these. Upon inspecting
@alexocheema
Alex Cheema - e/acc
10 months
@Antisimplistic TB5 cables aren't cheap. And if you do try to cheap out on them, you'll get rekt as they are unreliable and slower than advertised.
9
11
309
@MrCatid
catid
1 year
AGI achieved. ARC challenge has been completed using test time training:
32
68
1K
@EndWokeness
End Wokeness
1 year
BREAKING: Eyewitness tells BBC that he informed police, Secret Service about a suspicious man on a roof with a rifle. He was ignored. https://t.co/Cvfb7znZtZ
7K
58K
233K
@BullionStar
BullionStar
6 years
COMEX secures secrecy agreement with CFTC under FOIA not to release details to the public of its market maker program for the new 400 oz gold futures contract hatched with LBMA, because "Disclosure Would Likely Cause Competitive Harm to COMEX". Program begins tomorrow April 13.
50
216
409