
Shmulik Amar
@pyshmulik
Followers
30
Following
1K
Media
8
Statuses
174
RT @ENachshoni: 🚨 New paper out! What happens when LLMs & RLMs face conflicting answers to a question? 🤔 They often ignore disagreement a…
0
7
0
RT @mosh_levy: Producing reasoning texts boosts the capabilities of AI models, but do we humans correctly understand these texts? Our lates…
0
27
0
RT @AviyaMaimon: 🚨 New paper alert! 🚨 We propose an IQ Test for LLMs – a new way to evaluate models that goes beyond benchmarks and uncover…
0
14
0
RT @AviyaMaimon: We release:
✅ Code
✅ Leaderboard
✅ Skill matrices & tools
Let's shift to skill-based evaluation for LLMs! Full paper here…
arxiv.org
Current evaluations of large language models (LLMs) rely on benchmark scores, but it is difficult to interpret what these individual scores reveal about a model's overall skills. Specifically, as...
0
3
0
@biunlp Check out the paper, demo and code for more details. Collab w/ @obspp18 @lovodkin93 Ido Dagan @biunlp.
Paper:
@huggingface Demo:
Code:
github.com
Instruction-Guided Content Selection with LLMs - toolkit and datasets - shmuelamar/igcs
0
1
4
@biunlp We invite researchers to use our benchmark (IGCS-Bench), our generic transfer-learning dataset (GenCS), and our trained SLMs to advance LLM capabilities in extractive content selection! (5/n)
0
0
1
@biunlp Key finding 2️⃣: For tasks requiring longer selections, LLMs consistently perform better when processing one document at a time rather than the entire document set at once; the advantage is much less pronounced for tasks with short selections (sketch below). (4/n)
1
0
1
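The finding above contrasts two ways of running an extractive selection model over a multi-document set. Below is a minimal sketch of the two modes; it is not the paper's code, and the `complete(prompt)` LLM call and the one-span-per-line output format are assumptions made purely for illustration.

```python
# Minimal sketch (not the paper's implementation) of the two processing modes
# compared in the finding above, given any LLM completion function.
from typing import Callable, List

def select_all_at_once(instruction: str, docs: List[str],
                       complete: Callable[[str], str]) -> List[str]:
    """Single call: the model sees the whole document set and selects spans."""
    prompt = instruction + "\n\n" + "\n\n".join(
        f"Document {i + 1}:\n{doc}" for i, doc in enumerate(docs)
    )
    # Assumed output format: one selected span per line of the completion.
    return [line.strip() for line in complete(prompt).splitlines() if line.strip()]

def select_per_document(instruction: str, docs: List[str],
                        complete: Callable[[str], str]) -> List[str]:
    """One call per source document, then merge the selections."""
    selections: List[str] = []
    for doc in docs:
        prompt = f"{instruction}\n\nDocument:\n{doc}"
        selections.extend(
            line.strip() for line in complete(prompt).splitlines() if line.strip()
        )
    return selections
```

Per the finding, the second, per-document mode is the one that pays off when the expected selections are long.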
@biunlp Key finding 1️⃣: Training on a diverse mix of content selection tasks boosts LLM performance even on unseen extractive tasks. Generic transfer learning at its best! (Mixture-building sketch below.) (3/n)
1
0
1
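To illustrate the diverse-mixture training idea in the finding above, here is a small sketch of pooling several content selection tasks into one shared instruction format before fine-tuning. The field names and helper functions are hypothetical and do not describe the released GenCS pipeline.

```python
# Minimal sketch (hypothetical format, not the released GenCS data): pooling
# multiple content selection tasks into one instruction-tuning mixture.
import random
from typing import Dict, List

def to_instruction_example(task: str, instruction: str, source: str,
                           selected_spans: List[str]) -> Dict[str, str]:
    """Cast any span-selection task into a shared instruction/input/output record."""
    return {
        "task": task,
        "instruction": instruction,
        "input": source,
        # Target: the gold extracted spans, one per line.
        "output": "\n".join(selected_spans),
    }

def build_mixture(task_datasets: Dict[str, List[Dict[str, str]]],
                  seed: int = 0) -> List[Dict[str, str]]:
    """Interleave examples from all tasks so fine-tuning sees a diverse mix."""
    mixture = [ex for examples in task_datasets.values() for ex in examples]
    random.Random(seed).shuffle(mixture)
    return mixture
```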
@biunlp Motivation: Many NLP tasks require selecting relevant text spans from given source texts. Despite this shared objective, such content selection tasks have traditionally been studied in isolation, each with its own modeling approaches, datasets, and evaluations. (2/n)
1
0
1
RT @ArieCattan: 🚨 RAG is a popular approach, but what happens when the retrieved sources provide conflicting information? 🤔 We're excited to…
0
14
0
RT @oriern1: 🧵 New paper at Findings #ACL2025 @aclmeeting! Not all documents are processed equally well. Some consistently yield poor resul…
0
12
0
RT @lovodkin93: Check out our new paper on highly localized attributions, both in the input and the output!
0
7
0
RT @hirscheran: 🚨 Introducing LAQuer, accepted to #ACL2025 (main conf)! LAQuer provides more granular attribution for LLM generations: use…
0
32
0
RT @AlonEirew: Excited to present our system demonstration paper on EventFull – an Event-Event Relation annotation tool – at #NAACL25. Come…
0
6
0
RT @_akhaliq: RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation
0
52
0
RT @ShirAshuryTahan: LLMs struggle with tables, but how robust are they really? ToRR goes beyond accuracy, testing real-world robustness a…
0
13
0