Tambet Matiisen Profile
Tambet Matiisen

@tambetm

Followers
308
Following
2K
Media
26
Statuses
382

Tartu, Estonia
Joined September 2008
@tambetm
Tambet Matiisen
7 days
RT @ID_AA_Carmack: This is good advice.
0
120
0
@tambetm
Tambet Matiisen
3 months
We have put a lot of effort into this summer school - please share!
@unitartucs_adl
Autonomous Driving Lab
3 months
Ready to build a self-driving car this summer? Join the Self-driving Cars Summer School at the University of Tartu. 📍 On-site in Tartu | 28 July – 8 August 2025. Application period: 1–30 April 2025. Course fee: 850 EUR. 👉 Learn more and apply:
0
1
0
@tambetm
Tambet Matiisen
6 months
RT @jxmnop: spent the last month building my own framework to train a diffusion model from scratch. it was hard. almost like i just learne….
0
38
0
@tambetm
Tambet Matiisen
6 months
RT @SeunghyunSEO7: The concept of critical batch size is quite simple. Let’s assume we have a training dataset with 1M tokens. If we use a….
0
87
0
@tambetm
Tambet Matiisen
9 months
RT @kvogt: At Cruise I built robotaxis that completed 250k driverless rides, including the first rides ever in a major city. With the Tesl….
0
306
0
@tambetm
Tambet Matiisen
10 months
RT @lexin_zhou: 1/ New paper @Nature!. Discrepancy between human expectations of task difficulty and LLM errors harms reliability. In 2022,….
0
307
0
@tambetm
Tambet Matiisen
1 year
RT @Azaliamirh: Is inference compute a new dimension for scaling LLMs?. In our latest paper, we explore scaling inference compute by increa….
0
67
0
@tambetm
Tambet Matiisen
1 year
RT @karpathy: Jagged Intelligence. The word I came up with to describe the (strange, unintuitive) fact that state of the art LLMs can both….
0
397
0
@tambetm
Tambet Matiisen
1 year
RT @jon_barron: The legendary Ross Girshick just posted his CVPR workshop slides about the 1.5 decades he spent ~solving object detection a….
0
136
0
@tambetm
Tambet Matiisen
1 year
RT @jimjimson_: Visual demonstration of the universal function approximation theorem
0
6
0
@tambetm
Tambet Matiisen
1 year
RT @robmen: Lots of analysis of the xz/liblzma vulnerability. Most skip over the first step of the attack:. 0. The original maintainer burn….
0
1K
0
@tambetm
Tambet Matiisen
1 year
RT @binarybits: Google recently released a new version of Gemini, its supposed ChatGPT-killer. I did a head-to-head comparison and found th….
0
3
0
@tambetm
Tambet Matiisen
2 years
RT @jxnlco: Just like in my own childhood. It’s where no one says “good job” and you just have to wander the earth, trying to make sense of….
0
7
0
@tambetm
Tambet Matiisen
2 years
This seems like a relatively comprehensive analysis of what Q* could be. The Tree of Thoughts paper was also hinted at by Andrej Karpathy in his latest LLM video. I still stand behind my suggestion that Q* may refer to a node-processing priority queue ordered by Q-value, similar to A*.
@binarybits
Timothy B. Lee
2 years
Here's the Q* explainer you've been waiting for. I cover the research OpenAI has already published, my best guess on what they're working on, and why I think it's unlikely to lead to AGI any time soon.
0
3
6
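The priority-queue idea from the tweet above can be sketched as a best-first search over partial sentences, expanding nodes in order of estimated Q-value rather than A*'s cost-plus-heuristic. Everything here is hypothetical illustration: the `toy_q` scoring function and two-token vocabulary stand in for a learned Q-value and a real tokenizer.

```python
import heapq

def q_star_search(vocab, q_value, max_len=4, budget=50):
    """Best-first search over partial token sequences: pop the
    highest-Q node first (A* pops the lowest-cost node; the
    speculated Q* would pop the highest-Q-value one)."""
    # heapq is a min-heap, so negate Q-values to pop the max first.
    frontier = [(-q_value(()), ())]
    best, best_q = (), q_value(())
    expanded = 0
    while frontier and expanded < budget:
        neg_q, seq = heapq.heappop(frontier)
        expanded += 1
        if -neg_q > best_q:
            best, best_q = seq, -neg_q
        if len(seq) < max_len:
            for tok in vocab:
                child = seq + (tok,)
                heapq.heappush(frontier, (-q_value(child), child))
    return best, best_q

# Toy stand-in for a learned Q-value: reward alternating tokens.
def toy_q(seq):
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

best, q = q_star_search(["a", "b"], toy_q)
```

The compute budget caps node expansions, which is where the training-vs-inference cost tradeoff discussed later in this thread comes from.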
@tambetm
Tambet Matiisen
2 years
RT @Francis_YAO_: Just discovered, silently, Qualcomm Snapdragon 8gen 3 already supported 10b language model running locally on your smartp….
0
129
0
@tambetm
Tambet Matiisen
2 years
Anyway, I wouldn't consider it AGI, as it is still missing long-term memory and continual learning, not to mention motor control or embodiment. Happy to be proven wrong :)
1
0
9
@tambetm
Tambet Matiisen
2 years
At the same time it shifts computation from training to inference, so text generation will be much more expensive. It's also quite possible that it is used only during training, to generate high-quality training data.
1
0
7
@tambetm
Tambet Matiisen
2 years
I would be surprised if it totally changes the capabilities of language models, as they still try to imitate or please humans (pre-training and RLHF stages). But it is plausible that it could provide quite a boost on benchmarks that require reasoning.
1
0
5
@tambetm
Tambet Matiisen
2 years
While both MCTS and beam search do something similar, I guess Q* does it more optimally, allowing it to look through more interesting variations of the sentence. Altogether this provides the much-needed internal deliberation, or System 2, to language models.
1
0
5
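For contrast with the priority-queue view, beam search (one of the two strategies the tweet above compares Q* against) keeps only the top-k partial sequences at each step instead of a global frontier. Again a toy sketch: the `toy_score` function and vocabulary are hypothetical stand-ins.

```python
def beam_search(vocab, score, max_len=4, beam_width=2):
    """Keep only the best `beam_width` partial sequences per step,
    unlike best-first search, which keeps a global frontier of all
    unexpanded nodes."""
    beams = [()]
    for _ in range(max_len):
        # Extend every surviving sequence by every vocabulary token...
        candidates = [seq + (tok,) for seq in beams for tok in vocab]
        # ...then prune to the top `beam_width` candidates by score.
        beams = sorted(candidates, key=score, reverse=True)[:beam_width]
    return beams[0], score(beams[0])

# Toy stand-in for a sequence score: reward alternating tokens.
def toy_score(seq):
    return sum(1 for a, b in zip(seq, seq[1:]) if a != b)

best, s = beam_search(["a", "b"], toy_score)
```

The pruning is what makes beam search cheaper but greedier than a global priority queue: a sequence that scores poorly early is discarded even if it would have scored well later.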
@tambetm
Tambet Matiisen
2 years
The objective of Q* is to generate a sentence with the highest reward, as opposed to A*, which aims to produce a trajectory with the lowest distance. Q* maximizes, A* minimizes, but this is a minor detail.
1
0
3