
Vyas Raina
@vyasraina_nlp
Followers: 8
Following: 9
Media: 0
Statuses: 6
Joined May 2023
Is computer vision “solved”? Not yet. Current models score 0% on ZeroBench. 🧵1/6
Excited to share that our work on LLM Task Switch has been accepted at #EMNLP2024 @emnlpmeeting Main! Check out the paper: https://t.co/6r3GTGE7VV And the GitHub:
github.com
[EMNLP'24] Evaluating LLM performance and sensitivity when there is a "task-switch". Code for "LLM Task Interference: An Initial Study on the Impact of Task-Switc...
1/N 🧵🚀 Ask an LLM a maths question and it does well. Ask it the same question after a conversation about sentiment and it fails: we have a problem... Check out our work on task-switch: https://t.co/6r3GTGE7VV With @AkashGu30808281, @vyasraina_nlp, Mark Gales, @mariojfritz
Excited to share our new work! We explore how Large Language Models (LLMs) can be used to hypothesize missing causal variables in scientific discovery! Our study systematically evaluates hypothesis generation across different tasks and assumptions. 1/n
As LLMs move from research into real-world deployment, it is more important than ever to ensure they operate as desired in situations not typically assessed by benchmarks. This work shows that LLMs are not yet ready to be all-in-one agents. #LLMs #GPT4 #NLP2024