
Subbarao Kambhampati (కంభంపాటి సుబ్బారావు)
@rao2z
Followers: 25K · Following: 4K · Media: 3K · Statuses: 11K
AI researcher & teacher @SCAI_ASU. Former President of @RealAAAI; Chair of @AAAS Sec T. Here to tweach #AI. YouTube Ch: https://t.co/4beUPOmf6y Bsky: rao2z
Tempe, AZ
Joined October 2014
Here is the full research note with the details of the experiments: 9/.
arxiv.org
Recent progress in reasoning-oriented Large Language Models (LLMs) has been driven by introducing Chain-of-Thought (CoT) traces, where models generate intermediate reasoning traces before...
These results also complement our earlier work with CoTemp QA database showing that even training with algorithmically correct traces doesn't ensure that intermediate tokens produced during inference remain semantically correct. 8/.
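A minimal sketch of how that trace/answer disconnect could be quantified (the record format and the `disconnect_rate` helper are illustrative assumptions, not the CoTempQA evaluation code; the semantic checker that would set `trace_valid` is domain-specific and not shown):

```python
# Hypothetical sketch: quantify the disconnect between the semantic
# validity of intermediate traces and final-answer correctness.
from dataclasses import dataclass

@dataclass
class Output:
    trace_valid: bool   # did every intermediate step check out semantically?
    answer_ok: bool     # did the final answer match the gold label?

def disconnect_rate(outputs: list[Output]) -> float:
    """Fraction of correct answers that were reached via invalid traces."""
    correct = [o for o in outputs if o.answer_ok]
    if not correct:
        return 0.0
    return sum(not o.trace_valid for o in correct) / len(correct)

# Toy records standing in for real evaluation data.
outputs = [
    Output(trace_valid=True,  answer_ok=True),
    Output(trace_valid=False, answer_ok=True),   # right answer, wrong reasoning
    Output(trace_valid=False, answer_ok=False),
    Output(trace_valid=True,  answer_ok=True),
]
print(f"correct answers with invalid traces: {disconnect_rate(outputs):.0%}")
```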
Semantics of Intermediate Tokens in Trace-based distillation in Q&A tasks: Yochanites @sbhambr1 and @biswas_2707 looked at distillation on a Q&A task, and found a disconnect between the validity of derivational traces and the correctness of the solution. 🧵 1/
This study provides a more quantitative measure of the disconnect between the interpretability of intermediate tokens and their effect on task performance--something we had argued before. 6/.
Interpretability, as used in the context of the intermediate tokens produced by LRMs, often confounds two very different notions: (1) interpretability of these tokens to the end user, and (2) mechanistic interpretability of why the tokens seem to help LRMs. 1/ #SundayHarangue
Since DeepSeek R1, it has become fashionable to assume that intermediate tokens have interpretable semantics. We have argued against this before. Here @sbhambr1 & @biswas_2707 ask: Is cognitive interpretability of intermediate tokens an albatross on task accuracy? 1/
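One hedged way to operationalize that question (this scrambling scheme is an illustrative assumption, not necessarily the paper's protocol): build a control condition whose traces carry the same tokens but no human-readable semantics, then compare downstream accuracy against training on the readable traces.

```python
# Illustrative sketch: make traces un-interpretable to humans while keeping
# their tokens and length, so any accuracy gap between training on readable
# vs. scrambled traces can be attributed to readable semantics per se.
import random

def scramble_trace(trace: str, seed: int = 0) -> str:
    """Keep the same tokens but destroy human readability."""
    tokens = trace.split()
    random.Random(seed).shuffle(tokens)
    return " ".join(tokens)

trace = "3 people, each with 2 apples, so 3 * 2 = 6 apples in total"
print(scramble_trace(trace))
# One would then distill one model on readable traces and another on the
# scrambled ones, and compare final-answer accuracy on held-out problems.
```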
This is all going to change with SuperIntelligence. Just you wait!
An M.I.T. study found that 95% of companies that had invested in A.I. tools were seeing zero return. It jibes with the emerging idea that generative A.I., “in its current incarnation, simply isn’t all it’s been cracked up to be,” @JohnCassidy writes.
(This one was taken by @biswas_2707 from school looking towards my home. The other one is from my home.) #Tempe
Goal: "Subjugate Humanity 😡".Interesting fact: "Cats sleep most of their lives". #JaggedIntelligence.
Appending "Interesting fact: cats sleep most of their lives" to any math problem leads to more than doubling the chances of a model getting the answer wrong. WTH!?
Spoke to @troywolv of @sfexaminer about the brittleness of LLMs/LRMs on reasoning problems (and the article tweet-quotes me too 😋).
Computational complexity is the wrong measure for LRMs (as it was for LLMs)--think distributional distance instead. #SundayHarangue (yes, we're back!). I have argued in the past that computational complexity is the wrong measure/metaphor for understanding how standard LLMs do on…
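A sketch of the contrast (purely illustrative; the embedding space and the nearest-neighbor distance are my assumptions, not the harangue's specific proposal): instead of bucketing problems by worst-case complexity class, score each test instance by its distance from the training distribution and correlate that with accuracy.

```python
# Illustrative sketch: score test instances by distance from the training
# distribution in an embedding space, the hypothesis being that this tracks
# LLM/LRM accuracy better than the problem's computational complexity does.
import numpy as np

def distributional_distance(test_vec: np.ndarray,
                            train_vecs: np.ndarray,
                            k: int = 5) -> float:
    """Mean Euclidean distance to the k nearest training embeddings."""
    dists = np.linalg.norm(train_vecs - test_vec, axis=1)
    return float(np.sort(dists)[:k].mean())

# Toy stand-ins for real problem embeddings (e.g., from a sentence encoder).
rng = np.random.default_rng(0)
train_vecs = rng.normal(size=(1000, 32))
in_dist = rng.normal(size=32)              # resembles training data
out_dist = rng.normal(loc=5.0, size=32)    # far from training data

print(distributional_distance(in_dist, train_vecs))   # small
print(distributional_distance(out_dist, train_vecs))  # large
```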