
Assaf Ben Kish
@abk_tau
Followers: 105
Following: 276
Media: 20
Statuses: 73
Deep Learning | Large Language Models | Reinforcement Learning
Joined August 2023
OPRM is accepted to #COLM2025! See you in Montreal 🇨🇦. Big thanks to our great collaborators from TAU, MIT, and IBM! #LLM @COLM_conf
New work! 🚨 Recurrent LLMs like Mamba and RWKV can efficiently process millions of tokens, yet still underperform on real-world long-context tasks. What's holding them back? 🤔 And how can a lightweight fix boost their performance by 35% on LongBench? 👇🏼🧵 Github:
RT @ItamarZimerman: 📄🚨 New! Tired of waiting minutes for LLMs to "think"? Test-time scaling (O3, DeepSeek-R1) lets LLMs reason before answe…
RT @YVinker: Thanks @MIT_CSAIL for featuring our work! 🖊️🎨 Huge thanks to the CSAIL news team for the fun article + video!! We'll be presen…
RT @MIT_CSAIL: Sometimes the best way to express an idea is by sketching it out. A system from MIT CSAIL & Stanford captures this iterativ…
Very nice deep dive explaining OPRM by @xiaolGo.
This work was a great collaboration with @ItamarZimerman, @jmie_mirza, James Glass, @leokarlin, and @RGiryes. Check out the paper and our GitHub repo for more experiments, details, and code! Arxiv: Github:
RT @IdanShenfeld: The next frontier for AI shouldn’t just be generally helpful. It should be helpful for you! Our new paper shows how to…
DeciMamba, the first context extension method for Mamba, is accepted to #ICLR2025! 🎉 New revision with more long-context results: Special thanks to @ItamarZimerman @ShadyAbh @nadavcohen @amirgloberson @liorwolf @RGiryes!
New Work! 🐍 What prevents Mamba from extrapolating to sequences that are significantly longer than those it was trained on? Furthermore, can Mamba solve long-range NLP tasks using short-range training only? 🧵🧵🧵