GenBench @GenBench X Profile

GenBench

@GenBench

Followers

442

Following

104

Media

72

Statuses

193

State-of-the-art generalisation testing in NLP. Tag us for a RT of your NLP generalisation paper tweet!

https://t.co/DrUBWsGLKR

Joined April 2022

GenBench

@GenBench

2 years

The GenBench workshop is back! Do you work on generalisation (benchmarking) in #NLProc? Submit to the 2nd edition (https://t.co/XqMMYRW8vQ) co-located with #EMNLP2024. We have a regular track and a ✨collaborative benchmarking task (CBT)✨ that's fully LLM-focused this year (1/6)

genbench.org

The second workshop on generalisation (benchmarking) in NLP

1

12

22

GenBench

@GenBench

1 year

That's a wrap! We (@glnmario, @christos_c, @_dieuwke_, @vernadankers, @khuyagbaatar_b, @a_kazemnejad & @ryandcotterell) thank all presenters, authors, reviewers and attendees!! The keynotes, the cats 😻, the posters, the talks and the lively panel: it was fantastic👏 🔥

0

7

48

Najoung Kim 🫠

@najoungkim

1 year

so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!

Hayley Ross

@HayleyRossLing

1 year

New paper with @najoungkim and @TeaAnd_OrCoffee testing if LLMs can draw adjective-noun inferences like humans! Turns out they often can, and even generalize to unseen combinations. But they're more optimistic about "artificial intelligence" than humans. https://t.co/u9RHG54HX7

1

6

60

Kanishka Misra 🌊

@kanishkamisra

1 year

Woohoo go tinlab! Congrats @HayleyRossLing @TeaAnd_OrCoffee @najoungkim!!

GenBench

@GenBench

1 year

Best paper!

0

2

15

GenBench

@GenBench

1 year

Congratulations!

Najoung Kim 🫠

@najoungkim

1 year

so proud of @HayleyRossLing for getting a best paper award at @GenBench this year!! 🎉🪅🎉 I'm sure @TeaAnd_OrCoffee would be too :) check out our paper and share if you think homemade cats are cats!

0

3

GenBench

@GenBench

1 year

Congrats to all the authors!

0

2

GenBench

@GenBench

1 year

And we also have an honourable mention!

0

1

GenBench

@GenBench

1 year

Best paper!

2

0

7

GenBench

@GenBench

1 year

Closing remarks and best paper award by @vernadankers

1

2

12

GenBench

@GenBench

1 year

Come listen to the hot takes of our panelists in the Brickell room! Do we still need generalisation evaluation? 🧐 #GenBench2024 #EMNLP2024

0

4

15

GenBench

@GenBench

1 year

Still at the poster session? Come join us for keynote 3 by @sameer_!

0

1

5

GenBench

@GenBench

1 year

Did you miss the GenBench poster session? Don't worry, we've got you: here are (nearly all) posters! 😉 #GenBench2024 #EMNLP2024 Next up: keynote by Sameer Singh at 3!

0

2

13

GenBench

@GenBench

1 year

Last spotlight presentation: MMLU-SR: A Benchmark for Stress-Testing Reasoning Capability of Large Language Models https://t.co/4pyv01TbWE Unfortunately the authors couldn't make it, so the work is kindly presented by their colleague Hengyi Wang 🙏

0

1

GenBench

@GenBench

1 year

Continuing with Bastian Bunzeck, presenting The SlayQA benchmark of social reasoning: testing gender-inclusive generalization with neopronouns https://t.co/70kDItm3BB

1

3

GenBench

@GenBench

1 year

Next presenter is Jiwoo Lee, presenting MultiPragEval: Multilingual Pragmatic Evaluation of Large Language Models https://t.co/UW8x37AANT

1

0

GenBench

@GenBench

1 year

Second up, Maxim Kurkim presenting OmniDialog: A Multimodal Benchmark for Generalization Across Text, Visual, and Audio Modalities https://t.co/cdanQ7RAnO