
Srini Iyer
@sriniiyer88
Followers: 1K
Following: 636
Media: 18
Statuses: 139
Research Scientist at Facebook AI Research
Seattle, WA
Joined February 2012
Turns out, if you teach llamas how to self-reflect and backtrack from wrong reasoning paths, they do extra well on math reasoning!
- MATH 500: 65.8% ➡️ 81.8%
- AMC 23: 37.5% ➡️ 64.4%
- AIME 24: 10% ➡️ 30%
Amazing work by @danieljwkim; can be a nice long weekend read!
Can we improve Llama 3’s reasoning abilities through post-training only? Introducing ASTRO, our new framework that teaches LLMs to perform in-context search and generate long CoT to solve math problems, via SFT and RL. Work done at @aiatmeta. 📄 Paper:
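For readers wondering what "self-reflection and backtracking" might look like as training data, here is a purely illustrative Python sketch. The record format, field names, and the tiny worked problem are my own assumptions, not ASTRO's actual data schema; the point is only that a supervised trace can interleave a wrong attempt, an explicit check, and a backtrack before the final answer.

```python
# Illustrative sketch only: a hypothetical SFT record showing the kind of
# long chain-of-thought with self-reflection and backtracking that ASTRO
# trains on. The schema and trace format are assumptions, not the paper's.

def make_astro_style_example(problem: str, trace_steps: list[str], answer: str) -> dict:
    """Assemble one supervised fine-tuning record: problem -> long CoT -> answer."""
    cot = "\n".join(trace_steps)
    return {
        "prompt": problem,
        "completion": f"{cot}\nFinal answer: {answer}",
    }

example = make_astro_style_example(
    problem="What is the sum of the roots of x^2 - 5x + 6 = 0?",
    trace_steps=[
        "Try factoring: x^2 - 5x + 6 = (x - 1)(x - 6)?",
        "Check: (x - 1)(x - 6) = x^2 - 7x + 6, which does not match.",   # self-reflection
        "That path is wrong; backtrack and try (x - 2)(x - 3).",          # explicit backtrack
        "Check: (x - 2)(x - 3) = x^2 - 5x + 6. Correct, so the roots are 2 and 3.",
        "Sum of roots = 2 + 3 = 5 (consistent with -b/a = 5).",
    ],
    answer="5",
)
print(example["completion"])
```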
This is exciting! Check out our new step-by-step playbook that shows how to build MoT on top of your existing transformer implementation! Also, MoT is now in TMLR! Huge congrats to @liang_weixin, @VictoriaLinML and others!
🎉 Excited to share: "Mixture-of-Transformers (MoT)" has been officially accepted to TMLR (March 2025) and the code is now open-sourced! 📌 GitHub repo: 📄 Paper: How can we reduce pretraining costs for…
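As a rough mental model of the MoT idea (modality-decoupled parameters, global attention), here is a hedged PyTorch sketch, not the official implementation from the repo: every token still flows through shared global self-attention over the mixed-modality sequence, but its feed-forward computation is routed to a per-modality FFN. The class and argument names are mine, and decoupling only the FFN is a simplification (the paper also decouples attention projections and norms).

```python
# Minimal sketch (not the official implementation) of a Mixture-of-Transformers-style
# block: shared global self-attention, modality-specific feed-forward weights.
import torch
import torch.nn as nn

class MoTBlockSketch(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_modalities: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        # One feed-forward network per modality (e.g., text / image / speech).
        self.ffns = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_modalities)
        ])

    def forward(self, x: torch.Tensor, modality_ids: torch.Tensor) -> torch.Tensor:
        # Global self-attention over the whole interleaved sequence.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        # Route each token through the FFN of its own modality.
        h = self.norm2(x)
        out = torch.zeros_like(x)
        for m, ffn in enumerate(self.ffns):
            mask = modality_ids == m              # (batch, seq) boolean mask
            if mask.any():
                out[mask] = ffn(h[mask])
        return x + out

# Toy usage: a batch of 2 sequences of length 6 mixing two modalities.
block = MoTBlockSketch(d_model=32, n_heads=4, n_modalities=2)
tokens = torch.randn(2, 6, 32)
modality_ids = torch.tensor([[0, 0, 1, 1, 0, 1], [1, 1, 1, 0, 0, 0]])
print(block(tokens, modality_ids).shape)  # torch.Size([2, 6, 32])
```

The boolean-mask routing here is chosen for clarity; an efficient implementation would batch tokens by modality or use grouped weights instead of a Python loop.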
RT @jffwng: We just released model weights for our 1B & 8B-parameter BLT: Byte Latent Transformer, token-less model with sig. improvements…
RT @EntilZhaPR: By popular demand (see our GH issues 😅), we're releasing 1B and 8B weights for our BLT models! We're also hard at work at a…
RT @AIatMeta: 🚀 Meta FAIR is releasing several new research artifacts on our road to advanced machine intelligence (AMI). These latest adva…
RT @gargighosh: Excited to share that we are open-sourcing BLT model weights by popular demand (code was open-sourced already): https://t.co…
We're hiring PhD interns for Summer 2025 in Seattle to work with us on improving BLT even more! If this is something that excites you, reach out to me via DM or email ASAP!
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️
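BLT's "patches" are variable-length groups of bytes, and one idea highlighted in the paper thread (the "entropy steering" mentioned below) is placing patch boundaries where a small byte-level model is most uncertain about the next byte. The sketch below is only a toy illustration of that boundary rule: the entropy function is a crude stand-in (empirical entropy of the prefix) rather than a trained byte LM, and the threshold and names are mine.

```python
# Toy sketch of entropy-based patching: start a new patch wherever the
# next-byte entropy estimate crosses a threshold, so predictable runs end up
# in long patches and "hard" regions in short ones. Not BLT's actual patcher.
import math

def next_byte_entropy_stub(prefix: bytes) -> float:
    """Stand-in for a learned byte LM: entropy (bits) of the empirical
    byte distribution of the prefix."""
    if not prefix:
        return 8.0  # maximum uncertainty over 256 byte values
    counts = {}
    for b in prefix:
        counts[b] = counts.get(b, 0) + 1
    total = len(prefix)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_patches(data: bytes, threshold: float = 1.5) -> list[bytes]:
    """Split a byte string into patches at high-entropy positions."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy_stub(data[:i]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

# The repetitive prefix becomes one long patch; the varied tail is split up.
print(entropy_patches(b"aaaaabbbcdefg"))  # [b'aaaaabbbcd', b'e', b'f', b'g']
```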
BLT-related post by Meta AI: eliminate all tokenization once and for all!
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️
RT @dimitrizho: Meta's Byte Latent Transformer (BLT) paper looks like the real deal. Outperforming tokenization models even up to their tes…
RT @edkesuma: Gm. Woke up to a new paper on Byte Latent Transformers (BLT). Now you can increase model size without increasing inference…
RT @PowerSystemAuto: Meta AI's Byte Latent Transformer (BLT) is revolutionizing the tokenization process, enhancing scalability and efficie…
RT @ZainHasan6: Pretty cool work on a tokenization-less transformer from Meta! > Byte Latent Transformer (BLT), a byte-level LLM architecture,…
RT @AkshatS07: Been waiting for this one, a strong step in removing tokenization from LLMs. Congrats to the team!
RT @jmbollenbacher_: This could be one of the biggest AI papers of the year, if it really works as well as they report in this paper. It's…
RT @_xjdr: Llamas. Tokenizer free?! USING ENTROPY STEERING?!?!! Sometimes the universe conspires to make a paper just for you and it f…
RT @liliyu_lili: We scaled up Megabyte and ended up with a BLT! A pure byte-level model has a steeper scaling law than the BPE-based mod…
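A quick note on the "steeper scaling law" phrasing in the retweet above, stated loosely and not taken from the paper's actual fits: if loss is modeled as a power law in training compute, a steeper law means a larger exponent, so the byte-level model's loss falls faster as compute grows and can cross below the BPE baseline even from a worse starting point.

```latex
% Loose reading of "steeper scaling law"; symbols are generic, not the paper's fitted values.
L_{\mathrm{BLT}}(C) \approx a_{\mathrm{BLT}}\, C^{-b_{\mathrm{BLT}}}, \qquad
L_{\mathrm{BPE}}(C) \approx a_{\mathrm{BPE}}\, C^{-b_{\mathrm{BPE}}}, \qquad
b_{\mathrm{BLT}} > b_{\mathrm{BPE}}
\;\Rightarrow\;
\frac{L_{\mathrm{BLT}}(C)}{L_{\mathrm{BPE}}(C)}
= \frac{a_{\mathrm{BLT}}}{a_{\mathrm{BPE}}}\, C^{-(b_{\mathrm{BLT}} - b_{\mathrm{BPE}})}
\longrightarrow 0 \quad \text{as } C \to \infty.
```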