Fahim Tajwar Profile
Fahim Tajwar

@FahimTajwar10

Followers: 628 · Following: 640 · Media: 21 · Statuses: 112

PhD Student @mldcmu @SCSatCMU · BS/MS from @Stanford

Joined April 2021
@FahimTajwar10
Fahim Tajwar
2 months
RL with verifiable rewards has shown impressive results in improving LLM reasoning, but what can we do when we do not have ground-truth answers? Introducing Self-Rewarding Training (SRT), where language models provide their own reward for RL training! 🧵 1/n
21 · 143 · 835
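[Aside: the thread opener does not spell out the reward mechanism, but one common way a model can score its own outputs without ground-truth labels is self-consistency over its own sampled answers. The sketch below is a hypothetical illustration under that assumption, not the paper's implementation; all names in it are made up.]

```python
from collections import Counter

def self_consistency_rewards(answers):
    """Score each sampled answer by agreement with the majority vote.

    Hypothetical sketch: answers that match the most common answer
    across the model's own samples get reward 1.0, the rest 0.0,
    so no ground-truth verifier is needed.
    """
    majority_answer, _ = Counter(answers).most_common(1)[0]
    return [1.0 if a == majority_answer else 0.0 for a in answers]

# Example: 5 answers sampled from the model for one math question.
print(self_consistency_rewards(["42", "42", "7", "42", "13"]))
# -> [1.0, 1.0, 0.0, 1.0, 0.0]
# These pseudo-rewards would then stand in for verifier rewards in a
# standard RL update (e.g., a PPO/GRPO-style objective).
```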
@FahimTajwar10
Fahim Tajwar
22 days
Please check out Gaurav's insanely cool work on memorization if you are at ICML!
@gaurav_ghosal
Gaurav Ghosal
22 days
1/ So much of privacy research is designing post-hoc methods to make models memorization-free. It's time we turn that around with architectural changes. Excited to add Memorization Sinks to the transformer architecture this #ICML2025 to isolate memorization during LLM training 🧵
0 · 0 · 13
@FahimTajwar10
Fahim Tajwar
24 days
RT @g_k_swamy: Recent work has seemed somewhat magical: how can RL with *random* rewards make LLMs reason? We pull back the curtain on thes…
0 · 71 · 0
@FahimTajwar10
Fahim Tajwar
26 days
RT @AlexRobey23: On Monday, I'll be presenting a tutorial on jailbreaking LLMs + the security of AI agents with @HamedSHassani and @aminkar…
0 · 9 · 0
@FahimTajwar10
Fahim Tajwar
26 days
RT @yidingjiang: @abitha___ will be presenting our work on training language models to predict further into the future beyond the next tok…
0 · 5 · 0
@FahimTajwar10
Fahim Tajwar
26 days
RT @yidingjiang: I will be at ICML next week. If you are interested in chatting about anything related to generalization, exploration, and…
0 · 9 · 0
@FahimTajwar10
Fahim Tajwar
26 days
Please attend @yidingjiang's oral presentation of our work, Paprika, at ICML!
@yidingjiang
Yiding Jiang
26 days
I will talk about how to train agents with decision-making capabilities that generalize to completely new environments:
0 · 2 · 23
@FahimTajwar10
Fahim Tajwar
28 days
RT @sukjun_hwang: Tokenization has been the final barrier to truly end-to-end language models. We developed the H-Net: a hierarchical netw…
0 · 728 · 0
@FahimTajwar10
Fahim Tajwar
1 month
RT @yidingjiang: A mental model I find useful: all data acquisition (web scrapes, synthetic data, RL rollouts, etc.) is really an explorati…
yidingjiang.github.io
This post explores the idea that the next breakthroughs in AI may hinge more on how we collect experience through exploration, and less on how many parameters and data points we have.
0 · 58 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @allenainie: Decision-making with LLM can be studied with RL! Can an agent solve a task with text feedback (OS terminal, compiler, a per…
0 · 25 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @mihdalal: Incredibly excited to share that Neural MP got accepted to IROS as an Oral presentation!! Huge congrats to the whole team (@J…
0 · 9 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @g_k_swamy: Say ahoy to SAILOR⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to r…
0 · 74 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @g_k_swamy: In my experience, the details of RLHF matter a shocking amount. If you'd like to avoid solving a hard exploration problem, t…
0 · 4 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @askalphaxiv: "Can Large Reasoning Models Self-Train?". A brilliant paper from CMU showing LLMs can improve at math reasoning WITHOUT hu….
0
79
0
@FahimTajwar10
Fahim Tajwar
2 months
RT @mihdalal: This is really great work by Fahim and co, moving out of the regime where we have ground truth rewards is critical for the ne…
0 · 5 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @shafayat_sheikh: Check out our latest work on self-improving LLMs, where we try to see if LLMs can utilize their internal self-consiste…
0 · 24 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @askalphaxiv: This is pretty remarkable – AI systems learning to self-improve. We're seeing a wave of research where AI isn't just learn…
0 · 131 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @gaurav_ghosal: While LLMs contain extensive factual knowledge, they are also unreliable when answering questions downstream. In our #IC…
0 · 35 · 0
@FahimTajwar10
Fahim Tajwar
2 months
RT @IntologyAI: The 1st fully AI-generated scientific discovery to pass the highest level of peer review – the main track of an A* conferen…
0 · 134 · 0
@FahimTajwar10
Fahim Tajwar
2 months
19/ This was an awesome collaboration with @shafayat_sheikh, my amazing advisors @rsalakhu and Jeff Schneider, and @Zanette_ai at @CarnegieMellon. I learned a lot throughout the project, and we appreciate any feedback! Paper + code + datasets:
1 · 0 · 8
@FahimTajwar10
Fahim Tajwar
2 months
RT @mihirp98: Excited to share our work: Maximizing Confidence Alone Improves Reasoning. Humans rely on confidence to learn when answer key…
0 · 35 · 0