Martin Fajčík @martin_fajcik tweet - 🚨 Introducing BenCzechMark (BCM) 🇨🇿—the 1st multitask & multimetric Czech benchmark for large language models! 🧠 🔗 Check out the leaderboard: https://t.co/3nFANXN35i 📖 Read more in our Hugging Face blog: https://t.co/wxnD7BMoKn #NLP #AI #CzechLanguage #LLM https://t.co/OFQD7xTM4u

Martin Fajčík

@martin_fajcik

1 year

🚨 Introducing BenCzechMark (BCM) 🇨🇿—the 1st multitask & multimetric Czech benchmark for large language models! 🧠 🔗 Check out the leaderboard: https://t.co/3nFANXN35i 📖 Read more in our Hugging Face blog: https://t.co/wxnD7BMoKn #NLP #AI #CzechLanguage #LLM

Replies

Martin Fajčík

@martin_fajcik

1 year

✨ 50 tasks 📚 9 categories 📋 - Covering domains from historical Czech to language learner essays & spoken word 🔢 26 submitted systems currently 📊 Unique duel scoring based on statistical significance!

Martin Fajčík

@martin_fajcik

1 year

👑 Llama-405B currently reigns supreme in BenCzechMark! But it's not unbeatable: other models shine in specific categories like Math and Sentiment. 📊

Martin Fajčík

@martin_fajcik

1 year

- Qwen-72B shines in Math and Historical IR but lags behind similarly-sized models in other categories.
- Aya-23-35B excels in Sentiment and Language Modeling but lags behind in other categories.
- Gemma-2 9B delivers excellent results in Czech reading comprehension.

Martin Fajčík

@martin_fajcik

1 year

🚀 Submit your model to BenCzechMark without going public! Our leaderboard currently features over 25 models of varying sizes, and you can test your model's performance privately—publishing is optional! 🔒

Martin Fajčík

@martin_fajcik

1 year

🌟 Interested in more information? Dive into our blog post for all the details on BenCzechMark! 📖 Stay tuned—our paper is coming soon! 📄 🔗

Martin Fajčík

@martin_fajcik

1 year

Acknowledgements: This is joint work of @VUTvBrne, @muni_cz, @CIIRCCTU, and @huggingface!

Martin Fajčík

@martin_fajcik

1 year

Sorry, there is an error in this link. The right link is in the original description!