Rasool Fakoor

@rasoolfa

395 Followers · 2K Following · 4 Media · 601 Statuses

Research in RL & ML.

Joined December 2012
@rasoolfa
Rasool Fakoor
2 years
Interested in continual learning with IL, adapting to ever-changing data in RL/SL, & at #ICLR2024? Then swing by our posters at Halle B & say hi: Tue 4:30-6:30, poster #223 https://t.co/2ZKUul76mw Wed 4:30-6:30, poster #155 https://t.co/LV2xw6DWE8
@twni2016
Tianwei Ni
1 month
This work was recently accepted by TMLR! https://t.co/ja623Ov13Z Beyond the main contributions in our previous post, below are additional insights from the TMLR version on applying preference-based and unlearning-based methods to LLM math reasoning:
openreview.net
Leveraging inference-time search in large language models has proven effective in further enhancing a trained model's capability to solve complex mathematical and reasoning problems. However, this...
@twni2016
Tianwei Ni
8 months
Can we make LLMs reason effectively without a huge inference time cost? We show a powerful approach through learning and forgetting! Our recipe: 1️⃣ Aggregate reasoning paths from diverse sources: Chain-of-Thought, inference-time search (Tree-of-Thought, Reasoning-via-Planning),
@rasoolfa
Rasool Fakoor
4 months
The application closes on Tuesday (8/12). If you are interested, please apply and don't wait until the last minute.
@rasoolfa
Rasool Fakoor
4 months
Our team is *hiring* interns & researchers! We’re a small team of hardcore researchers & engineers working on foundation models, agentic methods, and embodiment. If you have strong publications and related experience, please fill out the application form. https://t.co/U4gOvNQ9qR
@EmpathYang
Ke Yang
11 months
Excited to announce that our web agent paper, AgentOccam, has been accepted to ICLR 2025! 🏂🏂🏂 Huge thanks to all collaborators! 😊 Special thanks to my brilliant and considerate mentor, Yao @yaoliucs, for your constant guidance and encouragement! Sapana @Sapana_007 and Rasool
@EmpathYang
Ke Yang
1 year
👾 Introducing AgentOccam: Automating Web Tasks with LLMs! 🌐 AgentOccam showcases the impressive power of Large Language Models (LLMs) on web tasks, without any in-context examples, new agent roles, online feedback, or search strategies. 🏄🏄🏄 🧙 Link: https://t.co/s6GPYFAEFf
@Jesse_Y_Zhang
Jesse Zhang
1 year
How can robots efficiently learn **new tasks/in new settings**? Introducing EXTRACT: a reinforcement learning (RL) framework that extracts a discrete + continuously parameterized skill library from offline data for efficient RL on new tasks! Accepted to CoRL 2024: 🧵👇
@smolix
Alex Smola
1 year
Proud to release the first LLM from @boson_ai. Higgs-Llama-3-70B, built for characters and gameplay, trained on Boson-3 base. With great MMLU-Pro performance.
@rasoolfa
Rasool Fakoor
2 years
Our team at AWS is *hiring* interns and full-time researchers! @yaoliucs, @pratikac, I, and others work on RL, alignment, large models, and ML in general. If you have strong, relevant publications in those areas, please fill out this form. https://t.co/al05f0w14d
docs.google.com
Read this first: Our team at AWS is actively looking for candidates with strong backgrounds in RL, RLHF, large language/multi/uni-modal models, and machine learning in general. We look for *both*...
@yaoliucs
Yao Liu
2 years
Offline RL is much harder than online RL or imitation learning, as it needs to solve a sequence of counterfactual reasoning problems. That often gives an error of (1+δ)^H, where δ is the one-step divergence of the policy or the extrapolation error of Q, and H is the horizon. 1/N
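A quick back-of-the-envelope sketch of the point above: even a small one-step divergence δ compounds multiplicatively over the horizon H in the (1+δ)^H bound. The δ and H values below are hypothetical, chosen only to illustrate the growth.

```python
# Illustrate how a small per-step divergence delta compounds over
# horizon H in the (1 + delta)^H error bound (values are hypothetical).
def compounded_error(delta: float, horizon: int) -> float:
    """Worst-case multiplicative error factor after `horizon` steps."""
    return (1.0 + delta) ** horizon

for delta in (0.01, 0.05, 0.1):
    for horizon in (10, 100):
        print(f"delta={delta}, H={horizon}: {compounded_error(delta, horizon):.2f}")
```

For short horizons the factor stays close to 1, but at H=100 even δ=0.1 blows the bound up by several orders of magnitude, which is why long-horizon offline RL is so sensitive to small policy or Q-extrapolation errors.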
@yaoliucs
Yao Liu
2 years
One common misconception about (deep) RL is that it was derived by first defining some empirical loss as an objective and then obtaining model update rules via gradient descent, just like supervised learning. This is NOT the case for popular RL algorithms like policy gradient or TD-based methods. 1/N
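The TD case above can be sketched concretely: the TD(0) update is a "semi-gradient" step, not the gradient of any fixed empirical loss, because the bootstrap target r + γ·V(s') depends on the same parameters yet is treated as a constant. A minimal linear-value-function sketch (all transition values here are hypothetical):

```python
import numpy as np

def td0_update(w, phi_s, phi_s_next, reward, gamma=0.99, lr=0.1):
    """One semi-gradient TD(0) step for a linear value function V(s) = w . phi(s)."""
    target = reward + gamma * (phi_s_next @ w)  # held fixed: no gradient flows here
    td_error = target - phi_s @ w
    return w + lr * td_error * phi_s            # update only through V(s), not the target

# One update on a single hypothetical transition (s -> s', reward 1).
w = np.zeros(3)
w = td0_update(w, np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]), reward=1.0)
```

If this were true gradient descent on the squared TD error, the target term would also be differentiated, giving a different (and generally worse-behaved) update; this distinction is exactly why TD is not "just supervised learning on a loss."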
@rasoolfa
Rasool Fakoor
2 years
And finally, if you're looking for an internship in RL, large models, alignment, etc., send a message to me, @AsadiKavosh, or @yaoliucs at #NeurIPS2023. See you next week. 6/6
@rasoolfa
Rasool Fakoor
2 years
TAIL: Task-specific Adapters for Imitation Learning with Large Pretrained Models FMDM workshop, Hall E2 (level 1). Fri 15 Dec, 8:15 a.m. CST - 4 p.m. CST https://t.co/VBvm2eEwDG joint work with Zuxin Liu, @Jesse_Y_Zhang, @AsadiKavosh, @yaoliucs, and Shoham. 5/n
@rasoolfa
Rasool Fakoor
2 years
Resetting the Optimizer in Deep RL: An Empirical Study Great Hall & Hall B1+B2 (level 1) #1410 Tue 12 Dec 5:15 p.m. CST — 7:15 p.m. CST https://t.co/YEQOgyYUSm joint work with @AsadiKavosh and Shoham. 4/n
@rasoolfa
Rasool Fakoor
2 years
Budgeting Counterfactual for Offline RL Great Hall & Hall B1+B2 (level 1) #1403 Tue 12 Dec 5:15 p.m. CST — 7:15 p.m. CST https://t.co/gscDnO0UTk joint work with @yaoliucs and @pratikac. 3/n
@rasoolfa
Rasool Fakoor
2 years
TD Convergence: An Optimization Perspective, Great Hall & Hall B1+B2 (level 1) #1503 Wed 13 Dec 5 p.m. CST — 7 p.m. CST #NeurIPS2023 https://t.co/iEtXJZj3uU joint work with @AsadiKavosh, Shoham, and @yaoliucs. 2/n