Roy Schwartz
@royschwartzNLP
Followers: 3K · Following: 434 · Media: 20 · Statuses: 250
Senior Lecturer at @CseHuji. #NLPROC
Joined February 2016
The focus on SOTA has caused a dramatic increase in the cost of AI, leading to environmental tolls and inclusiveness issues. We advocate research on efficiency in addition to accuracy (#greenai). Work w/ @JesseDodge @nlpnoah and @etzioni at @allen_ai
https://t.co/ZHIFMwxnZ8
0
46
155
We're proud of our team's 11 papers accepted to #EMNLP2025! See you next week in Suzhou!
0
11
15
Excited to share: our paper "On Pruning State-Space LLMs" was accepted to EMNLP 2025! Preprint: https://t.co/8TD56aroDc Code: https://t.co/ofi9ZxPDOT Model: Smol-Mamba-1.9B https://t.co/AIq2XOn0dS w/ @MichaelHassid & @royschwartzNLP (HUJI) #Mamba #ModelCompression
2
4
14
[Quoted post in Hebrew; the original text was corrupted during extraction and could not be recovered or translated.]
49
53
1K
The longer a reasoning LLM thinks, the more likely it is to be correct, right? Apparently not. Presenting our paper: "Don't Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning". Link: https://t.co/Zsp3BD0TU5 1/n
7
37
114
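To make the intuition concrete, here is a minimal sketch of a "prefer the shortest chain" selection rule, assuming the method samples several reasoning chains and votes among the shortest ones; the helper name answer_from_shortest and the example chains are hypothetical illustrations, not taken from the paper.

```python
# Illustrative sketch (not the paper's exact method): given k sampled reasoning
# chains for the same question, take the majority answer among the m shortest
# chains instead of majority-voting over all k of them.
from collections import Counter

def answer_from_shortest(chains, m=1):
    """chains: list of (reasoning_text, final_answer) pairs sampled from an LLM.
    Returns the majority answer among the m shortest chains."""
    shortest = sorted(chains, key=lambda c: len(c[0]))[:m]
    votes = Counter(answer for _, answer in shortest)
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Hypothetical chains for "17 * 24 = ?": longer chains are not always better.
    sampled = [
        ("17*24 = 17*20 + 17*4 = 340 + 68 = 408", "408"),
        ("Let me think step by step ... (very long derivation) ... so 398", "398"),
        ("17*24: 10*24=240, 7*24=168, 240+168=408", "408"),
    ]
    print(answer_from_shortest(sampled, m=2))  # -> "408"
```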
Heading to @iclr_conf! "Tokens→Words" shows how LLMs build full-word representations from sub-word tokens and offers a tool for vocab expansion. See our #ICLR2025 poster on 26.4, 15:00-17:30. https://t.co/yXvRvjjr0E https://t.co/mTBlktKerQ
Paper release: Ever wondered how LLMs understand words when all they see are tokens? Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen. https://t.co/Ur9eBn8yBO (preprint) [1/7]
0
6
40
Ever tried generating an image from a prompt but ended up with unexpected outputs? Check out our new paper #FollowTheFlow - tackling T2I issues like bias, failed binding, and leakage from the textual encoding side! https://t.co/jTNgec28hw
https://t.co/orB0Y7iW1S [1/7]
2
19
61
New Paper Drop! "On Pruning SSM LLMs": we check the prunability of Mamba-based LLMs. We also release Smol2-Mamba-1.9B, a Mamba-based LLM distilled from Smol2-1.7B, on Hugging Face: https://t.co/AIq2XOny3q Read more: https://t.co/8TD56arWsK
@royschwartzNLP @MichaelHassid
0
3
10
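As a rough illustration of what "prunability" means in practice, here is a generic unstructured magnitude-pruning pass in PyTorch. It is a sketch of the general technique only, applied to ordinary linear layers on a toy model; the paper's actual procedure for Mamba/SSM weights may differ.

```python
# Generic unstructured magnitude pruning: zero out the smallest-magnitude
# weights of every Linear layer. Illustration only, not the paper's method.
import torch
import torch.nn as nn

def magnitude_prune_(model: nn.Module, sparsity: float = 0.5) -> None:
    """Zero out the `sparsity` fraction of smallest-magnitude Linear weights in place."""
    for module in model.modules():
        if isinstance(module, nn.Linear):
            w = module.weight.data
            k = int(w.numel() * sparsity)
            if k == 0:
                continue
            threshold = w.abs().flatten().kthvalue(k).values
            w[w.abs() <= threshold] = 0.0

if __name__ == "__main__":
    toy = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
    magnitude_prune_(toy, sparsity=0.5)
    zeros = sum((m.weight == 0).sum().item() for m in toy if isinstance(m, nn.Linear))
    total = sum(m.weight.numel() for m in toy if isinstance(m, nn.Linear))
    print(f"sparsity ~= {zeros / total:.2f}")
```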
Looking for emergency reviewers for October ARR. If someone can complete a review *today* (Sunday, Nov. 24), please DM me. I have papers on efficiency, interpretability, and speech.
0
1
3
Giving #Bluesky a shot. Same handle. Hope to see you there!
0
0
2
In which layers does information flow from previous tokens to the current token? Presenting our new @BlackboxNLP paper: "Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers" https://t.co/aNO7fKxXix 1/n
1
20
69
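For intuition about the kind of intervention such a layer-wise analysis relies on, here is a toy causal-attention step that can be restricted so each token attends only to itself, cutting information flow from previous tokens at a chosen layer. This is an assumed illustration, not the paper's code.

```python
# Toy sketch: causal self-attention with an optional "self-only" restriction
# that blocks information flow from previous tokens (assumed illustration).
import torch
import torch.nn.functional as F

def attention(q, k, v, block_prev_tokens: bool):
    """q, k, v: (seq, dim). Causal attention; optionally each position
    attends only to itself (no information from previous tokens)."""
    seq = q.size(0)
    scores = q @ k.T / q.size(-1) ** 0.5
    mask = torch.tril(torch.ones(seq, seq, dtype=torch.bool))
    if block_prev_tokens:
        mask = torch.eye(seq, dtype=torch.bool)  # self-attention only
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

if __name__ == "__main__":
    torch.manual_seed(0)
    q = k = v = torch.randn(5, 8)
    normal = attention(q, k, v, block_prev_tokens=False)
    blocked = attention(q, k, v, block_prev_tokens=True)
    # With blocking, each output depends only on the token's own value vector.
    print(torch.allclose(blocked, v), torch.allclose(normal, v))  # True False
```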
It's been difficult to share good news from this part of the world. But it's long overdue - I am excited to share that I joined the Psychology Dept at Ben-Gurion University & Azrieli National Centre for Autism and Neurodev.! Hooray for new endeavors and in hopes of better times.
5
2
59
Dear #NLProc people with strong opinions on peer review & ARR in particular: this is the ACL survey you've been waiting for. It covers core design of ARR, incl. the decoupling of acceptance reviews & decisions and length of review cycles. Don't say you were not asked! /1
What should the ACL peer review process be like in the future? Please cast your views in this survey: https://t.co/fBGWIwXRCo by 4th Nov 2024 #NLProc @ReviewAcl
2
13
53
What should the ACL peer review process be like in the future? Please cast your views in this survey: https://t.co/fBGWIwXRCo by 4th Nov 2024 #NLProc @ReviewAcl
4
37
56
Paper release: Ever wondered how LLMs understand words when all they see are tokens? Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen. https://t.co/Ur9eBn8yBO (preprint) [1/7]
5
22
54
"Transformers are Multi-State RNNs", and our KV compression policy "TOVA", got accepted to #EMNLP2024! ๐ See you in Miami! :) Paper:
arxiv.org
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only...
Transformers outperform RNNs as they operate differently. Do they? Excited to share our new paper: "Transformers are Multi-State RNNs" Paper: https://t.co/vjZ8ba1Iaw Code: https://t.co/TJyVlxmqst 1/n
1
5
21
Which is better, running a 70B model once, or a 7B model 10 times? The answer might be surprising! Presenting our new @COLM_conf paper: "The Larger the Better? Improved LLM Code-Generation via Budget Reallocation" https://t.co/Zayq02RFJJ 1/n
6
43
209
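A toy sketch of the budget-reallocation idea for code generation: under a fixed compute budget, take the best of k small-model samples instead of a single large-model sample. The solution function name and the unit-test scoring below are hypothetical stand-ins for whatever ranking procedure the paper uses.

```python
# Illustrative "best of k" selection for code generation: rank the k candidate
# programs by how many unit tests they pass and keep the best one.
def best_of_k(candidates, unit_tests):
    """candidates: generated code strings; unit_tests: list of (args, expected)."""
    def score(code):
        namespace = {}
        try:
            exec(code, namespace)          # each candidate defines `solution`
            f = namespace["solution"]
            return sum(f(*args) == expected for args, expected in unit_tests)
        except Exception:
            return -1                      # broken candidates rank last
    return max(candidates, key=score)

if __name__ == "__main__":
    # Pretend these are k=3 samples from a small model for "return the max of a list".
    samples = [
        "def solution(xs): return sorted(xs)[0]",   # buggy: returns the min
        "def solution(xs): return max(xs)",
        "def solution(xs): return xs[0]",           # buggy: returns the first item
    ]
    tests = [(([3, 1, 2],), 3), (([1, 9],), 9)]
    print(best_of_k(samples, tests))  # -> "def solution(xs): return max(xs)"
```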
A new version of "Transformers are Multi-State RNNs" is now on arXiv: https://t.co/mmPogD56UO What's new? Efficiency analysis of TOVA (our KV compression policy), and extrapolation with TOVA. Details below >> 1/3
Transformers outperform RNNs as they operate differently. Do they? Excited to share our new paper: "Transformers are Multi-State RNNs" Paper: https://t.co/vjZ8ba1Iaw Code: https://t.co/TJyVlxmqst 1/n
1
4
16
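Based only on the description above, here is a simplified sketch of a TOVA-style eviction step: keep a fixed-size KV cache and, when it overflows, drop the entry the current query attends to least. This is an illustrative approximation, not the authors' released implementation.

```python
# Simplified TOVA-style KV-cache eviction sketch (not the official code):
# when the cache exceeds its budget, evict the least-attended entry.
import torch
import torch.nn.functional as F

def tova_step(keys, values, q, new_k, new_v, cache_size):
    """keys/values: (n, d) current cache; q, new_k, new_v: (d,) for the new token.
    Returns the updated cache, evicting the least-attended entry if needed."""
    keys = torch.cat([keys, new_k[None]], dim=0)
    values = torch.cat([values, new_v[None]], dim=0)
    if keys.size(0) > cache_size:
        attn = F.softmax(keys @ q / keys.size(-1) ** 0.5, dim=0)
        drop = attn.argmin()
        keep = torch.arange(keys.size(0)) != drop
        keys, values = keys[keep], values[keep]
    return keys, values

if __name__ == "__main__":
    torch.manual_seed(0)
    d, cache_size = 8, 4
    keys = values = torch.empty(0, d)
    for _ in range(10):  # simulate 10 decoding steps
        q, k, v = torch.randn(d), torch.randn(d), torch.randn(d)
        keys, values = tova_step(keys, values, q, k, v, cache_size)
    print(keys.shape)  # torch.Size([4, 8]); the cache never grows past cache_size
```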
Stop complaining about the bad review quality. Join forces and start research on #NLProc for #PeerReview! A new white paper by over 20 top AI and NLP researchers provides a thorough discussion of AI assistance for scientific quality control. (1/🧵) https://t.co/KqXcFDY5N6
3
26
96
Transformers outperform RNNs as they operate differently. Do they? Excited to share our new paper: "Transformers are Multi-State RNNs" Paper: https://t.co/vjZ8ba1Iaw Code: https://t.co/TJyVlxmqst 1/n
2
35
119