
Dieuwke Hupkes
@_dieuwke_
Joined September 2017
Many thanks for this big honour! 🤩
Congratulations to the winners of the 2025 IJCAI-JAIR Prize for their paper "Compositionality Decomposed: How Do Neural Networks Generalise?": Dieuwke Hupkes, Verna Dankers, Mathijs Mul, and Elia Bruni! #IJCAI2025
RT @IJCAIconf: Congratulations to the winners of the 2025 IJCAI-JAIR Prize for their paper "Compositionality Decomposed: How Do Neural Netw…
RT @vernadankers: Proud to accept a 5y outstanding paper award @IJCAIconf from JAIR for the impact Compositionality Decomposed has had,…
RT @WiAIR_podcast: What does it really mean for an LLM to generalize? And are we even measuring it right? In the latest #WiAIR episode, w…
Could not be more thrilled about this partnership, allowing us to keep MultiLoKo's test set truly hidden and have experts at Kaggle independently run the leaderboard.
Exciting collaboration! We've partnered with @AIatMeta to launch the MultiLoKo Benchmark, now live on our platform. Measure model performance across 31 languages with truly private holdout sets, just like Kaggle Competitions, ensuring accurate results. Explore MultiLoKo and
Thrilled about the launch of this platform 🤩, the feature to host secret test sets is a deal breaker in the game against contamination and a gift to both benchmark builders and modellers. Excited to be one of the first to use it for @AIatMeta's MultiLoKo's test set!
Kaggle Benchmarks is here! Get competition-grade rigor for AI model evaluation. Let Kaggle handle infrastructure while you focus on AI breakthroughs. View model performance on 70+ leaderboards, including @AIatMeta's MultiLoKo. Dive in:
RT @kaggle: Kaggle Benchmarks is here! Get competition-grade rigor for AI model evaluation. Let Kaggle handle infrastructure while you f…
RT @WiAIR_podcast: How do we know if a language model really generalizes - or is just repeating patterns it's memorized? Let's talk about c…
RT @ryan_nefdt: My new book on Linguistic Relativity (with Jeff Pelletier) just dropped with @OUPPhilosophy: We di…
academic.oup.com
Abstract. Does your language distinguish between dark and light blues? Do your verbs require a report on where and how you got your information? Can you ea
Want to know more? Have a look at our paper or our GitHub repository. Don't forget to check the related work section for other awesome work on multilingual evaluation for LLMs. @metaai
github.com
A benchmark with locally sourced multilingual questions for 31 languages. - facebookresearch/multiloko
This pretty result confirms earlier results from, among others, Qi et al. (@Jirui_Qi @AriannaBisazza @raquel_dmg) and Ohmer et al. (@xenia_ohmer @eliabruni) that knowledge transfer between languages in LLMs is suboptimal.
direct.mit.edu
Abstract. The staggering pace with which the capabilities of large language models (LLMs) are increasing, as measured by a range of commonly used natural language understanding (NLU) benchmarks,...