GenBench

@GenBench

Followers: 442
Following: 104
Media: 72
Statuses: 193

State-of-the-art generalisation testing in NLP. Tag us for a RT of your NLP generalisation paper tweet!

Joined April 2022
@GenBench
GenBench
2 years
The GenBench workshop is back! Do you work on generalisation (benchmarking) in #NLProc? Submit to the 2nd edition (https://t.co/XqMMYRW8vQ) co-located with #EMNLP2024. We have a regular track and a ✨collaborative benchmarking task (CBT)✨ that's fully LLM-focused this year (1/6)
genbench.org
The second workshop on generalisation (benchmarking) in NLP
1
12
22
@GenBench
GenBench
1 year
That's a wrap! We (@glnmario, @christos_c, @_dieuwke_, @vernadankers, @khuyagbaatar_b, @a_kazemnejad & @ryandcotterell) thank all presenters, authors, reviewers and attendees!! The keynotes, the cats 😻, the posters, the talks and the lively panel: it was fantastic 👏 🔥
0
7
48
@najoungkim
Najoung Kim 🫠
1 year
so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!
@HayleyRossLing
Hayley Ross
1 year
New paper with @najoungkim and @TeaAnd_OrCoffee testing if LLMs can draw adjective-noun inferences like humans! Turns out they often can, and even generalize to unseen combinations. But they're more optimistic about "artificial intelligence" than humans. https://t.co/u9RHG54HX7
1
6
60
@kanishkamisra
Kanishka Misra 🌊
1 year
Woohoo go tinlab! Congrats @HayleyRossLing @TeaAnd_OrCoffee @najoungkim!!
@GenBench
GenBench
1 year
Best paper!
0
2
15
@GenBench
GenBench
1 year
Congratulations!
@najoungkim
Najoung Kim 🫠
1 year
so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!
0
0
3
@GenBench
GenBench
1 year
Congrats to all the authors!
0
0
2
@GenBench
GenBench
1 year
And we also have an honourable mention!
0
0
1
@GenBench
GenBench
1 year
Best paper!
2
0
7
@GenBench
GenBench
1 year
Closing remarks and best paper award by @vernadankers
1
2
12
@GenBench
GenBench
1 year
Come listen to the hot takes of our panelists in the Brickell room! Do we still need generalisation evaluation? 🧐 #GenBench2024 #EMNLP2024
0
4
15
@GenBench
GenBench
1 year
Still at the poster session? Come join us for keynote 3 by @sameer_!
0
1
5
@GenBench
GenBench
1 year
Did you miss the GenBench poster session? Don't worry we've got you, here are (nearly all) posters! 😉 #GenBench2024 #EMNLP2024 Next up: keynote by Sameer Singh at 3!
0
2
13
@GenBench
GenBench
1 year
Last spotlight presentation: MMLU-SR: A Benchmark for Stress-Testing Reasoning Capability of Large Language Models https://t.co/4pyv01TbWE Unfortunately the authors couldn't make it, the work is kindly presented by their colleague Hengyi Wang 🙏
0
1
1
@GenBench
GenBench
1 year
Continuing with Bastian Bunzeck, presenting The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns https://t.co/70kDItm3BB
1
1
3
@GenBench
GenBench
1 year
Next presenter is Jiwoo Lee, presenting MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models https://t.co/UW8x37AANT
1
0
0
@GenBench
GenBench
1 year
Second up, Maxim Kurkim presenting OmniDialog: A Multimodal Benchmark for Generalization Across Text, Visual, and Audio Modalities https://t.co/cdanQ7RAnO
1
0
1
@GenBench
GenBench
1 year
Spotlight time! Mirella Bueno on MLissard: Multilingual Long and Simple Sequential Reasoning Benchmarks https://t.co/ARmGeONz2c
1
1
3
@GenBench
GenBench
1 year
@kylelostat Plus more cat pictures! 😻😻
0
0
1
@GenBench
GenBench
1 year
@kylelostat He got all the room snickering already at slide 3! 😁
1
0
2
@GenBench
GenBench
1 year
Join us for our second keynote by Olmo co-lead @kylelostat
1
4
16