Shmulik Amar Profile
Shmulik Amar

@pyshmulik

Followers: 30
Following: 1K
Media: 8
Statuses: 174

Building things

Israel
Joined September 2018
@pyshmulik
Shmulik Amar
11 days
RT @ENachshoni: 🚨 New paper out! 📄 What happens when LLMs & RLMs face conflicting answers to a question? 🤔 They often ignore disagreement a…
0
7
0
@pyshmulik
Shmulik Amar
19 days
RT @mosh_levy: Producing reasoning texts boosts the capabilities of AI models, but do we humans correctly understand these texts? Our lates…
0
27
0
@pyshmulik
Shmulik Amar
1 month
RT @AviyaMaimon: 🚨 New paper alert! 🚨 We propose an IQ Test for LLMs — a new way to evaluate models that goes beyond benchmarks and uncover…
0
14
0
@pyshmulik
Shmulik Amar
1 month
RT @AviyaMaimon: We release: ✅ Code ✅ Leaderboard ✅ Skill matrices & tools. Let's shift to skill-based evaluation for LLMs! Full paper here…
arxiv.org
Current evaluations of large language models (LLMs) rely on benchmark scores, but it is difficult to interpret what these individual scores reveal about a model's overall skills. Specifically, as...
0
3
0
@pyshmulik
Shmulik Amar
1 month
0
4
0
@pyshmulik
Shmulik Amar
1 month
@biunlp Check out the paper, demo and code for more details. Collab w/ @obspp18 @lovodkin93 Ido Dagan @biunlp. Paper: # @huggingface Demo: Code:
Tweet card summary image
github.com
Instruction‑Guided Content Selection with LLMs - toolkit and datasets - shmuelamar/igcs
0
1
4
@pyshmulik
Shmulik Amar
1 month
@biunlp We invite researchers to use our benchmark (IGCS-Bench), our generic transfer-learning dataset (GenCS) and our trained SLMs to advance LLM capabilities in extractive content selection! (5/n)
0
0
1
@pyshmulik
Shmulik Amar
1 month
@biunlp Key finding 2️⃣: For tasks requiring longer selections, LLMs consistently perform better when processing one document at a time instead of the entire set at once. This is not so much the case for tasks with short selections. (4/n)
Tweet media one
1
0
1
@pyshmulik
Shmulik Amar
1 month
@biunlp Key finding 1️⃣: Training with a diverse mix of content selection tasks helps boost LLM performance even on new extractive tasks. Generic transfer learning at its best! (3/n)
Tweet media one
1
0
1
@pyshmulik
Shmulik Amar
1 month
@biunlp Motivation: Many NLP tasks require selecting relevant text spans from given source texts. Despite this shared objective, such content selection tasks have traditionally been studied in isolation, each with its own modeling approaches, datasets, and evaluations. (2/n)
Tweet media one
1
0
1
@pyshmulik
Shmulik Amar
1 month
🚨 Introducing IGCS, accepted to #TACL! Instruction Guided Content Selection (IGCS) unifies many tasks such as extractive summarization, evidence retrieval and argument mining under one scheme for selecting extractive spans in given sources. @biunlp (1/n)
Tweet media one
2
7
29
@pyshmulik
Shmulik Amar
3 months
RT @ArieCattan: 🚨 RAG is a popular approach but what happens when the retrieved sources provide conflicting information? 🤔 We're excited to…
0
14
0
@pyshmulik
Shmulik Amar
3 months
RT @oriern1: 🧵 New paper at Findings #ACL2025 @aclmeeting! Not all documents are processed equally well. Some consistently yield poor resul…
0
12
0
@pyshmulik
Shmulik Amar
3 months
RT @lovodkin93: Check out our new paper on highly localized attributions, both in the input and the output!
0
7
0
@pyshmulik
Shmulik Amar
3 months
RT @hirscheran: 🚨 Introducing LAQuer, accepted to #ACL2025 (main conf)! LAQuer provides more granular attribution for LLM generations: use…
0
32
0
@pyshmulik
Shmulik Amar
4 months
RT @AlonEirew: Excited to present our system demonstration paper on EventFull — an Event-Event Relation annotation tool — at #NAACL25. Come…
0
6
0
@pyshmulik
Shmulik Amar
4 months
RT @_akhaliq: RefVNLI. Towards Scalable Evaluation of Subject-driven Text-to-image Generation
Tweet media one
0
52
0
@pyshmulik
Shmulik Amar
5 months
Tweet media one
0
0
0
@pyshmulik
Shmulik Amar
6 months
RT @ShirAshuryTahan: LLMs struggle with tables—but how robust are they really? 🔍 ToRR goes beyond accuracy, testing real-world robustness a…
0
13
0
@pyshmulik
Shmulik Amar
6 months
RT @litalby: 🎉 I'm happy to share that our paper, Make It Count, has been accepted to #CVPR2025! A huge thanks to my amazing collaborators…
0
19
0