ArmelRandy Profile
ArmelRandy

@RandyZebaze

Followers: 195 · Following: 3K · Media: 17 · Statuses: 78

PhD Student @InriaParisNLP | MVA 2022 @ENS_ParisSaclay | X19 @Polytechnique

Joined February 2022
@RandyZebaze
ArmelRandy
3 months
🎉 Grateful and happy to share that two of our papers were accepted to #EMNLP2025 Findings! 🚀 [1] Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation [2] TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation
@rohanpaul_ai
Rohan Paul
1 month
The paper tests whether “thinking tokens” help translation and finds they mostly do not. The team tests big reasoning models that write hidden thoughts before the answer. They compare translations with and without those thoughts. Quality barely changes across many language pairs.
@RandyZebaze
ArmelRandy
3 months
[1] Compositional Translation: A Novel LLM-based Approach for Low-resource Machine Translation Arxiv: https://t.co/Ff72J1esro Github: https://t.co/qTFcU3VNgr [2] TopXGen: Topic-Diverse Parallel Data Generation for Low-Resource Machine Translation Arxiv:
@LydiaNishimwe
Lydia Nishimwe
5 months
🎓 I defended my PhD in Machine Translation last month! Grateful to my colleagues at @inria_paris for the support & collaboration throughout this journey. 🎯 Open to Work - AI/NLP Research Scientist or Engineer roles, starting September 2025, on-site in the Paris area or remote.
@slatornews
Slator
8 months
👉 https://t.co/PLHdEfGuh5 Researchers at @Inria 🇫🇷 demonstrate how to improve #AI #translation for low-resource languages by breaking ⛓️‍💥 sentences into simpler phrases, translating each using in-context examples, and using these pairs to guide translation. #xl8 #t9n #LLMs
@rohanpaul_ai
Rohan Paul
9 months
LLMs struggle with machine translation for low-resource languages, even with similar examples. This paper introduces Compositional Translation (CompTra). It decomposes sentences into phrases, translates each using retrieved examples, and recombines these translations to guide the final sentence translation.
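The CompTra recipe described above (decompose, translate phrases, recombine to guide the full translation) can be sketched in miniature. Everything below is a toy stand-in, not the paper's implementation: the real system uses an LLM for decomposition and phrase translation, while here a comma split and a word lexicon play those roles.

```python
def split_into_phrases(sentence):
    # Toy decomposition: split the sentence at commas into simpler phrases.
    return [p.strip() for p in sentence.split(",") if p.strip()]

def translate_phrase(phrase, lexicon):
    # Stand-in for phrase-level MT with retrieved in-context examples:
    # here, a word-by-word lexicon lookup.
    return " ".join(lexicon.get(w.lower(), w) for w in phrase.split())

def compositional_translate(sentence, lexicon):
    phrases = split_into_phrases(sentence)
    pairs = [(p, translate_phrase(p, lexicon)) for p in phrases]
    # In CompTra, these phrase pairs are fed back as in-context examples
    # when translating the full sentence; this sketch simply joins them.
    return ", ".join(t for _, t in pairs)

# Hypothetical English -> French lexicon, purely illustrative.
lexicon = {"the": "la", "cat": "chatte", "sleeps": "dort",
           "and": "et", "dreams": "rêve"}
print(compositional_translate("The cat sleeps, and dreams", lexicon))
```

The key structural point is that the phrase-level pairs, not external bitexts, supply the final in-context signal.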
@AnthropicAI
Anthropic
9 months
Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.
@LydiaNishimwe
Lydia Nishimwe
9 months
🚀 Exciting Challenge Ahead! 🚀 I'm thrilled to be one of 12 finalists in the 3-minute thesis competition (@MT180FR ) at Sorbonne Université. 🗓️March 10th, 6PM Paris 🔗Register to watch (in person or online) & vote: https://t.co/loJI8qIixr Looking forward to seeing you there!
sorbonne-universite.fr
12 candidates will take part in the final of the Ma Thèse en 180 secondes competition on 10 March 2025. Here is an overview of their topics.
@RandyZebaze
ArmelRandy
10 months
TL;DR Everything is in the title. The paper is available on ArXiv https://t.co/jscBxD3fpP The code and outputs are available on Github https://t.co/AYTCIVgQfR Thanks to my co-authors @bensagot and @RABawden, and to @InriaParisNLP. 10/10
github.com
[NAACL 2025 Findings] Example Selection via Similarity Search improves Low-resource Machine Translation - ArmelRandy/ICL-MT
@RandyZebaze
ArmelRandy
10 months
Finally, we demonstrate that similarity-based example selection (in a high-quality sample pool) helps few-shot MT with LLMs ranging from 2 to 70 billion parameters. As the number of in-context examples grows, the gap with random selection remains significant. 9/10
@RandyZebaze
ArmelRandy
10 months
Using FLORES-200 dev set (997 human-written pairs) as our initial selection pool, we study the impact of reducing or expanding it with bitexts from the NLLB dataset. In Swahili, similarity search (notably SONAR) proves more robust to pool composition than random selection. 8/10
@RandyZebaze
ArmelRandy
10 months
SONAR also outperforms example selection based on string-matching metrics like BLEU, BM25, R(rerank)-BM25, and cosine-similarity with RoBERTa's sentence representations. 7/10
@RandyZebaze
ArmelRandy
10 months
Experiments with 5 sentence embeddings on 4 FLORES-200 languages show that similarity-based selection outperforms random selection in LRLs but offers only marginal gains in HRLs (French). Across both cases, sentence embeddings perform similarly, with SONAR slightly leading. 6/10
@RandyZebaze
ArmelRandy
10 months
We tackle these issues by assigning a zero score to problematic generations, making the metrics language-aware. Specifically, we evaluate with Language-aware COMET, based on COMET-22. It preserves COMET's accuracy while improving the assessment of problematic outputs. 5/10
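The zero-scoring idea in this tweet can be sketched as a thin wrapper around any segment-level metric. Both callables below (`base` and `lid`) are hypothetical stand-ins, not COMET-22 or a real language-identification model; the repetition threshold is likewise an illustrative choice.

```python
def is_degenerate(text, min_distinct_ratio=0.3):
    # Empty output, or meaningless repetition (too few distinct tokens).
    tokens = text.split()
    if not tokens:
        return True
    return len(set(tokens)) / len(tokens) < min_distinct_ratio

def language_aware_score(hypothesis, expected_lang, base_score, lang_id):
    # Assign a zero score to problematic generations before calling the
    # base metric, mirroring the Language-aware COMET idea.
    if is_degenerate(hypothesis) or lang_id(hypothesis) != expected_lang:
        return 0.0
    return base_score(hypothesis)

# Toy stubs: a constant base metric and a keyword-based language identifier.
base = lambda h: 0.85
lid = lambda h: "swh" if "paka" in h else "eng"
print(language_aware_score("paka analala", "swh", base, lid))  # passes checks
print(language_aware_score("", "swh", base, lid))              # empty -> 0.0
```

Because the wrapper only intercepts clearly broken outputs, the base metric's behavior on well-formed translations is preserved, which matches the accuracy-preservation claim above.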
@RandyZebaze
ArmelRandy
10 months
Translating into low-resource languages presents two main challenges: • Outputs may be in the wrong language (e.g., repeating the prompt). • They may be empty or contain meaningless repetitions. Current neural metrics are not robust to these issues. 4/10
@RandyZebaze
ArmelRandy
10 months
We examine three aspects: • Evaluating LLM-based MT into LRLs. • Assessing whether similarity-based example selection improves MT, especially with a small pool (typical for LRLs), and at scale. • Testing the strategy’s robustness to selection-pool heterogeneity. 3/10
@RandyZebaze
ArmelRandy
10 months
We explore in-context example selection for MT, focusing on LRLs (Swahili, Wolof, etc.). Given a sentence and a selection pool, we choose the k closest pairs based on a sentence embedding or a string-matching metric, placing the most similar pair closest to the sentence. 2/10
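The selection procedure in this tweet (embed, rank by similarity, place the most similar pair closest to the input) can be sketched as follows. The bag-of-words embedding and the tiny bitext pool are illustrative stand-ins for the paper's sentence encoders (e.g. SONAR) and its FLORES-200 pool; the Swahili pairs are just example data.

```python
import math
from collections import Counter

def embed(text):
    # Toy sentence embedding: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def select_examples(sentence, pool, k):
    # Rank the pool by similarity to the input sentence...
    q = embed(sentence)
    scored = sorted(pool, key=lambda pair: cosine(q, embed(pair[0])),
                    reverse=True)
    # ...keep the top k, and order them so the most similar pair sits
    # last in the prompt, i.e. closest to the sentence to translate.
    return list(reversed(scored[:k]))

pool = [
    ("the cat sleeps", "paka analala"),
    ("the dog barks", "mbwa anabweka"),
    ("rain falls at night", "mvua inanyesha usiku"),
]
print(select_examples("the cat eats", pool, 2))
```

Swapping `embed` for a real multilingual encoder changes only the similarity function; the prompt-ordering logic stays the same.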
@RandyZebaze
ArmelRandy
10 months
I am happy to announce that our paper "In-context Example Selection via Similarity Search Improves Low-resource Machine Translation" was accepted to #NAACL2025 Findings 🤩🔥. What is this about? TAGS: Machine Translation (MT), High/Low-resource languages (H/LRLs). 🧵 1/10
@kyutai_labs
kyutai
11 months
Meet Helium-1 preview, our 2B multi-lingual LLM, targeting edge and mobile devices, released under a CC-BY license. Start building with it today! https://t.co/X4Dbx2T1cJ
@rohanpaul_ai
Rohan Paul
11 months
Tree of Problems (ToP) breaks complex LLM tasks into identical subtasks, solving them like nested Russian dolls. It turns massive problems into bite-sized copies. Original Problem 🤔: LLMs struggle with complex reasoning tasks that require breaking down into simpler subtasks.
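The nested-dolls idea can be illustrated with a generic recursive skeleton: split a problem into identically shaped subproblems, solve the small leaves directly, and merge answers back up the tree. The list-summing instance is a toy chosen for clarity; the paper applies this scheme to LLM reasoning tasks, with the model playing the roles of `solve_leaf` and `merge`.

```python
def tree_of_problems(problem, solve_leaf, split, merge, leaf_size=2):
    # Recursively decompose into identical, simpler subtasks;
    # solve leaves directly, then merge answers upward.
    if len(problem) <= leaf_size:
        return solve_leaf(problem)
    left, right = split(problem)
    return merge(
        tree_of_problems(left, solve_leaf, split, merge, leaf_size),
        tree_of_problems(right, solve_leaf, split, merge, leaf_size),
    )

# Toy instance: summing a list, where every subtask has the same shape.
nums = [3, 1, 4, 1, 5, 9, 2, 6]
result = tree_of_problems(
    nums,
    solve_leaf=sum,
    split=lambda p: (p[: len(p) // 2], p[len(p) // 2:]),
    merge=lambda a, b: a + b,
)
print(result)  # 31
```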