Roy Schwartz Profile
Roy Schwartz

@royschwartzNLP

Followers: 3K · Following: 434 · Media: 20 · Statuses: 250

Senior Lecturer at @CseHuji. #NLPROC

Joined February 2016
@royschwartzNLP
Roy Schwartz
6 years
The focus on SOTA has caused a dramatic increase in the cost of AI, leading to environmental tolls and inclusiveness issues. We advocate research on efficiency in addition to accuracy (#greenai). Work w/ @JesseDodge @nlpnoah and @etzioni at @allen_ai https://t.co/ZHIFMwxnZ8
0
46
155
@nlphuji
HUJI NLP
15 days
We're proud of our team's 11 papers accepted to #EMNLP2025! See you next week in Suzhou ✈️
0
11
15
@TamerGhattas911
Tamer
3 months
Excited to share: our paper “On Pruning State-Space LLMs” was accepted to EMNLP 2025! 🎉 Preprint: https://t.co/8TD56aroDc Code: https://t.co/ofi9ZxPDOT Model: Smol-Mamba-1.9B → https://t.co/AIq2XOn0dS w/ @MichaelHassid & @royschwartzNLP (HUJI) #Mamba #ModelCompression
2
4
14
@yairbrill
Yair Brill
5 months
A month ago Ari Rapoport, my mother's cousin, sent me a surprising email: "I don't know whether you are aware of my illness and of my scientific achievements," he opened. "I was diagnosed with small-cell lung cancer, one of the deadliest there is. I have only a few months left... I am writing to you to ask for a science article in the Haaretz supplement - one that would no doubt interest many people."
49
53
1K
@MichaelHassid
Michael Hassid
6 months
The longer a reasoning LLM thinks, the more likely it is to be correct, right? Apparently not. Presenting our paper: “Don’t Overthink it. Preferring Shorter Thinking Chains for Improved LLM Reasoning”. Link: https://t.co/Zsp3BD0TU5 1/n
7
37
114
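The tweet above only names the idea. As a rough, self-contained sketch of one way to prefer shorter thinking chains (not necessarily the paper's exact procedure): sample several chains, keep the few shortest, and majority-vote over their answers. The helper below is illustrative only.

```python
from collections import Counter

def pick_answer_from_short_chains(candidates, m=3):
    """candidates: (thinking_chain, answer) pairs sampled from an LLM.
    Keep the m shortest chains and majority-vote over their answers.
    Illustrative only; the paper's selection rule may differ."""
    shortest = sorted(candidates, key=lambda c: len(c[0]))[:m]
    votes = Counter(answer for _, answer in shortest)
    return votes.most_common(1)[0][0]

# Toy usage: five sampled chains for the same question.
samples = [
    ("step 1 ... step 2 ... so the answer is 42", "42"),
    ("let me reconsider every case at length ... 41", "41"),
    ("short derivation -> 42", "42"),
    ("a very long and winding chain of thought ... 40", "40"),
    ("quick check: 42", "42"),
]
print(pick_answer_from_short_chains(samples))  # -> 42
```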
@GKaplan38844
Guy Kaplan
7 months
Heading to @iclr_conf ✈️🧩 ‘Tokens→Words’ shows how LLMs build full-word representations from sub-word tokens and offers a tool for vocab expansion. 🚀 See our #ICLR2025 poster - 26.4, 15:00-17:30. 📄 https://t.co/yXvRvjjr0E 🔗 https://t.co/mTBlktKerQ 👇
@GKaplan38844
Guy Kaplan
1 year
📢Paper release📢: 🔍 Ever wondered how LLMs understand words when all they see are tokens? 🧠 Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen. https://t.co/Ur9eBn8yBO (preprint) 👀 👇 [1/7]
0
6
40
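A small illustration of the setup described in the two tweets above (not the paper's probing method): a standard sub-word tokenizer splits even a misspelled word into several pieces, so any whole-word representation has to be assembled inside the model. The GPT-2 tokenizer below is an arbitrary choice for the demo.

```python
# Requires the `transformers` library; downloads the GPT-2 tokenizer on first use.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

for word in ["unbelievable", "unbelivable"]:  # correctly spelled vs. misspelled
    pieces = tok.tokenize(word)
    print(f"{word!r} -> {pieces}")
# The misspelling falls apart into different sub-word fragments, yet LLMs
# still treat it much like the intended word, which is what the paper studies.
```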
@GKaplan38844
Guy Kaplan
7 months
✨ Ever tried generating an image from a prompt but ended up with unexpected outputs? Check out our new paper #FollowTheFlow - tackling T2I issues like bias, failed binding, and leakage from the textual encoding side! 💼🔍 https://t.co/jTNgec28hw https://t.co/orB0Y7iW1S 🧵[1/7]
2
19
61
@TamerGhattas911
Tamer
9 months
🚀 New Paper Drop! 🚀 “On Pruning SSM LLMs” – We check the prunability of MAMBA🐍-based LLMs. We also release Smol2-Mamba-1.9B, a MAMBA-based LLM distilled from Smol2-1.7B on 🤗: [https://t.co/AIq2XOny3q] 📖 Read more: [https://t.co/8TD56arWsK] @royschwartzNLP @MichaelHassid
0
3
10
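For anyone who wants to try the released checkpoint, a loading sketch along the lines below should work with a recent `transformers`; the actual Hugging Face repo id sits behind the shortened link in the tweet, so the id used here is a placeholder, not the real one.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- replace with the released Smol2-Mamba-1.9B repository.
repo = "<org>/Smol2-Mamba-1.9B"

tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.bfloat16)

prompt = "State-space models are"
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(tok.decode(out[0], skip_special_tokens=True))
```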
@royschwartzNLP
Roy Schwartz
1 year
Looking for emergency reviewers for October ARR. If someone can complete a review *today* (Sunday, Nov. 24), please DM me 🙏 I have papers on efficiency, interpretability and speech
0
1
3
@royschwartzNLP
Roy Schwartz
1 year
Giving #Bluesky a shot. Same handle. Hope to see you there!
0
0
2
@Amit_BenArtzy
Amit Ben-Artzy
1 year
In which layers does information flow from previous tokens to the current token? Presenting our new @BlackboxNLP paper: “Attend First, Consolidate Later: On the Importance of Attention in Different LLM Layers” https://t.co/aNO7fKxXix 1/n
1
20
69
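The question in this tweet can be probed crudely, and very differently from the paper's actual analysis, by checking layer by layer how much attention the final token pays to earlier tokens in an off-the-shelf model:

```python
# Requires `transformers` and `torch`; GPT-2 here is an arbitrary small model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tok("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

# out.attentions: one tensor per layer, shaped (batch, heads, query, key).
for layer, attn in enumerate(out.attentions):
    last = attn[0, :, -1, :]        # attention pattern of the final token, per head
    to_prev = last[:, :-1].sum(-1)  # probability mass placed on earlier tokens
    print(f"layer {layer:2d}: mean attention to previous tokens = {to_prev.mean().item():.2f}")
```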
@TamarKolodny
Tamar Kolodny
1 year
It's been difficult to share good news from this part of the world. But it's long overdue - I am excited to share that I joined the Psychology Dept at Ben-Gurion University & Azrieli National Centre for Autism and Neurodev. ! Hooray for new endeavors and in hopes of better times.
5
2
59
@annargrs
Anna Rogers
1 year
📢📢 Dear #NLProc people with strong opinions on peer review & ARR in particular: this is the ACL survey you've been waiting for. It covers core design of ARR, incl. the decoupling of acceptance reviews & decisions and length of review cycles. Don't say you were not asked! /1
@aclmeeting
ACL 2025
1 year
What should the ACL peer review process be like in the future? Please cast your views in this survey: https://t.co/fBGWIwXRCo by 4th Nov 2024 #NLProc @ReviewAcl
2
13
53
@aclmeeting
ACL 2025
1 year
What should the ACL peer review process be like in the future? Please cast your views in this survey: https://t.co/fBGWIwXRCo by 4th Nov 2024 #NLProc @ReviewAcl
4
37
56
@GKaplan38844
Guy Kaplan
1 year
📢Paper release📢: 🔍 Ever wondered how LLMs understand words when all they see are tokens? 🧠 Our latest study uncovers how LLMs reconstruct full words from sub-word tokens, even when misspelled or previously unseen. https://t.co/Ur9eBn8yBO (preprint) 👀 👇 [1/7]
5
22
54
@y0b1byte
yobibyte
1 year
Interesting work
3
46
404
@MichaelHassid
Michael Hassid
1 year
"Transformers are Multi-State RNNs", and our KV compression policy "TOVA", got accepted to #EMNLP2024! 🎉 See you in Miami! :) Paper:
arxiv.org
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only...
@MichaelHassid
Michael Hassid
2 years
Transformers outperform RNNs as they operate differently. Do they? Excited to share our new paper: “Transformers are Multi-State RNNs” Paper: https://t.co/vjZ8ba1Iaw Code: https://t.co/TJyVlxmqst 1/n
1
5
21
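TOVA is only named in this thread as a KV compression policy. As a toy illustration of the multi-state view the paper describes (a bounded cache with an attention-based eviction rule), here is a numpy-only sketch; what exactly gets scored and kept follows the paper and its code, not this snippet.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def generate_states(keys, queries, cache_size=4):
    """keys/queries: (T, d) arrays standing in for per-token K and Q vectors.
    Keeps at most `cache_size` cached states, evicting by attention score."""
    cache = []            # list of (position, key_vector)
    kept_positions = []
    for t in range(len(keys)):
        cache.append((t, keys[t]))
        if len(cache) > cache_size:
            # Attention of the *current* query over all cached keys.
            scores = softmax(np.array([queries[t] @ k for _, k in cache]))
            # Evict the cached state the current token attends to least.
            cache.pop(int(scores.argmin()))
        kept_positions.append([p for p, _ in cache])
    return kept_positions

rng = np.random.default_rng(0)
K = rng.normal(size=(8, 16))
Q = rng.normal(size=(8, 16))
for t, kept in enumerate(generate_states(K, Q, cache_size=4)):
    print(f"step {t}: cached positions {kept}")
```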
@MichaelHassid
Michael Hassid
1 year
Which is better, running a 70B model once, or a 7B model 10 times? The answer might be surprising! Presenting our new @COLM_conf paper: "The Larger the Better? Improved LLM Code-Generation via Budget Reallocation" https://t.co/Zayq02RFJJ 1/n
6
43
209
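A rough sketch of the budget-reallocation idea in this tweet, under the assumption that candidates sampled from the smaller model are re-ranked with simple checks such as unit tests; the paper's actual ranking procedure may differ.

```python
def pick_candidate(candidates, tests):
    """candidates: generated code strings; tests: callables over the exec'd namespace.
    Return the first candidate that passes all checks, else fall back to the first."""
    for code in candidates:
        namespace = {}
        try:
            exec(code, namespace)                  # run the candidate solution
            if all(test(namespace) for test in tests):
                return code
        except Exception:
            continue                               # broken candidate: try the next one
    return candidates[0]

# Pretend these are k samples from a small model for "write add(a, b)".
samples = [
    "def add(a, b): return a - b",    # wrong
    "def add(a, b): return a + b",    # right
]
tests = [lambda ns: ns["add"](2, 3) == 5]
print(pick_candidate(samples, tests))
```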
@MichaelHassid
Michael Hassid
1 year
New version for “Transformers are Multi-State RNNs” is now on arxiv: https://t.co/mmPogD56UO What’s new? Efficiency analysis of TOVA (our KV compression policy) Extrapolation with TOVA Details below >> 1/3
arxiv.org
Transformers are considered conceptually different from the previous generation of state-of-the-art NLP models - recurrent neural networks (RNNs). In this work, we demonstrate that decoder-only...
@MichaelHassid
Michael Hassid
2 years
Transformers outperform RNNs as they operate differently. Do they? Excited to share our new paper: “Transformers are Multi-State RNNs” Paper: https://t.co/vjZ8ba1Iaw Code: https://t.co/TJyVlxmqst 1/n
1
4
16
@UKPLab
UKP Lab
2 years
Stop complaining about bad review quality. Join forces and start research on #NLProc for #PeerReview! 🚨 A new white paper by over 20 top AI and NLP researchers provides a thorough discussion of AI assistance for scientific quality control. (1/🧵) 📑 https://t.co/KqXcFDY5N6
3
26
96
@MichaelHassid
Michael Hassid
2 years
Transformers outperform RNNs as they operate differently. Do they? Excited to share our new paper: “Transformers are Multi-State RNNs” Paper: https://t.co/vjZ8ba1Iaw Code: https://t.co/TJyVlxmqst 1/n
2
35
119