Farid Adilazuarda

@faridlazuarda

Followers: 281 · Following: 41K · Media: 162 · Statuses: 3K

Incoming PhD @EdinburghNLP @InfAtEd • Efficient & Multilingual LLMs • prev: @mbzuai @itbofficial

Joined October 2017
@faridlazuarda
Farid Adilazuarda
3 months
Can English-finetuned LLMs reason in other languages? Short answer: yes, thanks to "quote-and-think" + test-time scaling. You can even force them to reason in a target language! But: 🌐 low-resource langs & non-STEM topics still tough. New paper:
arxiv.org
Reasoning capabilities of large language models are primarily studied for English, even when pretrained models are multilingual. In this work, we investigate to what extent English reasoning...
@yong_zhengxin
Yong Zheng-Xin (Yong)
3 months
📣 New paper! We observe that reasoning language models finetuned only on English data are capable of zero-shot cross-lingual reasoning through a "quote-and-think" pattern. However, this does not mean they reason the same way across all languages or in new domains. [1/N]
1
6
34
@faridlazuarda
Farid Adilazuarda
3 days
RT @DimitrisPapail: GRPO makes reasoning models yap a lot, but there's a simple fix: Sample more responses during training, and train on th….
0
31
0
@faridlazuarda
Farid Adilazuarda
3 days
RT @thefutballboy_1: Admin said fair enough😭😭.
0
2K
0
@faridlazuarda
Farid Adilazuarda
3 days
RT @barcacentre: Rashford: "All of the lads are like young, if you're like 27, 28, I'd say there's more players under 21 than past 28. In m….
0
58
0
@faridlazuarda
Farid Adilazuarda
4 days
RT @khoomeik: my gpt-oss MFUmaxxer PR is here! ✅ cat/splice sink -> flexattn ✅ sin/cos pos embs -> complex freqs_cis ✅ moe for-loop -> gro….
0
13
0
@faridlazuarda
Farid Adilazuarda
6 days
RT @ilyasut: if you value intelligence above all other human qualities, you’re gonna have a bad time.
0
2K
0
@faridlazuarda
Farid Adilazuarda
6 days
RT @shiraeis: … bro just discovered RAG.
0
67
0
@faridlazuarda
Farid Adilazuarda
6 days
RT @JackFrostHeeHo9: @TheNewRhyme If they are your friend they wouldn’t let a disagreement ruin a friendship.
0
778
0
@faridlazuarda
Farid Adilazuarda
7 days
RT @jxmnop: curious about the training data of OpenAI's new gpt-oss models? i was too. so i generated 10M examples from gpt-oss-20b, ran….
0
522
0
@faridlazuarda
Farid Adilazuarda
8 days
RT @giffmana: Oh wow, this VLM benchmark is pure evil, and I love it! "Vision Language Models are Biased" by @an_vo12, @taesiri, @anh_ng8,….
0
75
0
@faridlazuarda
Farid Adilazuarda
8 days
I will always cherish "ada indonesia coy!" ("Indonesia's here, dude!") moments like these. So proud!🇮🇩🙌
@AlhamFikri
Alham Fikri Aji
8 days
And congratulations to 🇮🇩Indonesia for winning 3 Silvers (Faiz, Matthew, Luvidi) and 1 Bronze (Jayden)! A strong debut for Indonesia's first participation, hope to see even more in the future!
0
0
10
@faridlazuarda
Farid Adilazuarda
8 days
RT @AlhamFikri: And congratulations to 🇮🇩Indonesia for winning 3 Silvers (Faiz, Matthew, Luvidi) and 1 Bronze (Jayden)! A strong debut for….
0
88
0
@faridlazuarda
Farid Adilazuarda
9 days
RT @Guangxuan_Xiao: I've written the full story of Attention Sinks — a technical deep-dive into how the mechanism was developed and how our….
0
264
0
@faridlazuarda
Farid Adilazuarda
10 days
RT @sporadicalia: just remembered that time Noam Shazeer dropped the hardest line ever written in an ML paper
0
623
0
@faridlazuarda
Farid Adilazuarda
10 days
[image]
0
24K
0
@faridlazuarda
Farid Adilazuarda
11 days
RT @wenhaocha1: Deep dive into Sink Value in GPT-OSS models! Analyzed 20B (24 layers) and 120B (36 layers) models and found (correct me if….
0
17
0
@faridlazuarda
Farid Adilazuarda
11 days
RT @gu_xiangming: I noticed that @OpenAI added learnable bias to attention logits before softmax. After softmax, they deleted the bias. Thi….
0
174
0
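The mechanism described in the retweet above (a learnable bias added to the attention logits before softmax, then discarded afterward) can be sketched roughly as follows. This is a minimal illustration only, not the gpt-oss implementation; the function and argument names are hypothetical:

```python
import math

def softmax_with_sink(logits, sink_bias):
    """Attention-sink softmax sketch.

    A learnable scalar `sink_bias` is appended to the attention logits
    before softmax, competing for probability mass like an extra token.
    Its probability is dropped after softmax, so the remaining weights
    can sum to less than 1 -- the head can attend to "nothing".
    """
    all_logits = list(logits) + [sink_bias]
    m = max(all_logits)                          # subtract max for stability
    exps = [math.exp(x - m) for x in all_logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return probs[:-1]                            # discard the sink's weight
```

For example, with two zero logits and a zero sink bias, each real token gets weight 1/3 and the remaining 1/3 is absorbed by the sink; with a very negative sink bias the result reduces to ordinary softmax.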
@faridlazuarda
Farid Adilazuarda
11 days
RT @dvruette: gpt-oss is probably the most standard MoE transformer that ever was. Couple of details worth noting: - Uses attention sinks (….
0
77
0
@faridlazuarda
Farid Adilazuarda
12 days
RT @mprlyn: @junoaggy pls normalize lugging "sound horeg" (booming mobile speaker rigs) everywhere.
0
437
0
@faridlazuarda
Farid Adilazuarda
13 days
RT @dnystedt: Four TSMC 2nm fabs will be in mass production next year and monthly capacity over 60,000 wafers-per-month (wpm), media report….
0
52
0