
Zihao Li
@realzihaolee
Followers
67
Following
4K
Media
48
Statuses
607
Doctoral Researcher @HelsinkiNLP | MSc @UnivHelsinkiCS | Multilingual NLP
Helsinki, Finland
Joined June 2019
Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data. #LLM.
arxiv.org
This paper investigates a critical design decision in the practice of massively multilingual continual pre-training -- the inclusion of parallel data. Specifically, we study the impact of...
0
0
5
Scaling Low-Resource MT via Synthetic Data Generation with LLMs. #LLMs.
arxiv.org
We investigate the potential of LLM-generated synthetic data for improving low-resource machine translation (MT). Focusing on seven diverse target languages, we construct a document-level...
0
0
3
Improvements in multilingual translation capabilities are noticeable. Flores-200 X-Eng 3-shot BLEU score.
Qwen3 models are supporting 119 languages and dialects. This extensive multilingual capability opens up new possibilities for international applications, enabling users worldwide to benefit from the power of these models.
0
0
0
GlotEval: A Test Suite for Massively Multilingual Evaluation of Large Language Models. #LLMs.
arxiv.org
Large language models (LLMs) are advancing at an unprecedented pace globally, with regions increasingly adopting these models for applications in their primary language. Evaluation of these models...
0
0
0
Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources. #LLMs.
arxiv.org
Large Language Models (LLMs) exhibit significant disparities in performance across languages, primarily benefiting high-resource languages while marginalizing underrepresented ones. Continual...
0
0
3
RT @EU_Commission: AI made in 🇪🇺. OpenEuroLLM, the first family of open source Large Language Models covering all EU languages, has earned…
0
892
0
RT @petersarlin: Thrilled to co-lead Europe’s largest open source AI initiative bringing together 20 leading institutions to build open AI…
0
11
0
RT @soumithchintala: i'm comically impressed that people are coping on deepseek by spewing bizarre conspiracy theories -- despite deepseek…
0
414
0
Le Chat doesn't have an iOS/Android client so far! @MistralAI.
0
0
0
I'll present our work "A Comparison of Language Modeling and Translation as Multilingual Pretraining Objectives" at #EMNLP2024 🌴. Nov 12, 14-15:30, Riverfront Hall. cc @linguistickus
0
0
5
RT @HelsinkiNLP: Excited to introduce EMMA-500! ✨ A multilingual model continue-trained on 546 languages, enhancing coverage for low-re…
huggingface.co
0
25
0