Srini Iyer

@sriniiyer88

1K Followers · 636 Following · 18 Media · 139 Statuses

Research Scientist at Facebook AI Research

Seattle, WA
Joined February 2012
@sriniiyer88
Srini Iyer
7 months
New paper! Byte-level models are finally competitive with tokenizer-based models, with better inference efficiency and robustness! Dynamic patching is the answer! Read all about it here: (1/n)
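For readers wondering what "dynamic patching" means here: BLT segments the byte stream into variable-length patches using the entropy of a small byte-level language model, spending compute where bytes are hard to predict. A minimal sketch of the idea, with a stand-in interface and an illustrative threshold, not the paper's actual code:

```python
import numpy as np

def entropy_patches(byte_probs: np.ndarray, threshold: float = 2.0):
    """Group a byte sequence into variable-length patches.

    byte_probs: (seq_len, 256) next-byte distributions from a small
    byte-level LM (BLT trains a separate entropy model for this; the
    interface here is a stand-in). A new patch starts wherever the
    next-byte entropy crosses `threshold`, so unpredictable regions get
    short patches (more compute per byte) and predictable regions get
    long ones.
    """
    entropy = -(byte_probs * np.log(byte_probs + 1e-9)).sum(axis=-1)
    boundaries = [0]
    for i in range(1, len(entropy)):
        if entropy[i] > threshold:
            boundaries.append(i)
    boundaries.append(len(entropy))
    return list(zip(boundaries[:-1], boundaries[1:]))

# Toy usage: uniform distributions (maximum entropy) force 1-byte patches.
probs = np.full((6, 256), 1 / 256)
print(entropy_patches(probs))  # [(0, 1), (1, 2), ..., (5, 6)]
```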
@sriniiyer88
Srini Iyer
7 days
Turns out, if you teach llamas how to self-reflect and backtrack from wrong reasoning paths, they do extra well on math reasoning!
- MATH 500: 65.8% ➡️ 81.8%
- AMC 23: 37.5% ➡️ 64.4%
- AIME 24: 10% ➡️ 30%
Amazing work by @danieljwkim; can be a nice long weekend read!
@danieljwkim
Joongwon Kim
7 days
Can we improve Llama 3’s reasoning abilities through post-training only? Introducing ASTRO, our new framework that teaches LLMs to perform in-context search and generate long CoT to solve math problems, via SFT and RL. Work done at @aiatmeta. 📄 Paper:
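The recipe, as the thread describes it: generate search traces that include wrong turns, linearize them into long chains of thought in which the model explicitly notices a mistake and backtracks, then train on those traces with SFT and RL. A toy sketch of such a linearization step; the trace format and marker phrases are assumptions for illustration, not ASTRO's actual pipeline:

```python
def linearize_trace(steps: list[dict]) -> str:
    """steps: [{'thought': str, 'ok': bool}, ...] from a tree search,
    where ok=False marks a dead-end branch the search abandoned."""
    cot = []
    for step in steps:
        cot.append(step["thought"])
        if not step["ok"]:
            # Surface the failure and backtrack in natural language, so the
            # model learns to recover from wrong paths, not just avoid them.
            cot.append("Wait, this approach leads to a contradiction. "
                       "Let me backtrack and try a different path.")
    return "\n".join(cot)

# Toy usage: a two-step trace with one abandoned branch.
print(linearize_trace([
    {"thought": "Try factoring the quadratic directly.", "ok": False},
    {"thought": "Use the quadratic formula instead.", "ok": True},
]))
```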
@sriniiyer88
Srini Iyer
2 months
This is exciting! Check out our new step-by-step playbook that shows how to do MoT on top of your existing transformer implementation! Also, MoT is now in TMLR! Huge congrats to @liang_weixin, @VictoriaLinML and others!
@liang_weixin
Weixin Liang
2 months
🎉 Excited to share: "𝐌𝐢𝐱𝐭𝐮𝐫𝐞-𝐨𝐟-𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐞𝐫𝐬 (𝐌𝐨𝐓)" has been officially accepted to TMLR (March 2025) and the code is now open-sourced! 📌 GitHub repo: 📄 Paper: How can we reduce pretraining costs for…
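For context on what MoT adds on top of a standard transformer: tokens from different modalities share global self-attention over the full sequence but are routed to modality-specific feed-forward weights. A simplified PyTorch sketch; the dimensions, the `modality` id interface, and the choice to share attention weights are illustrative (the paper also decouples attention projections and norms per modality):

```python
import torch
import torch.nn as nn

class MoTBlock(nn.Module):
    """Minimal Mixture-of-Transformers-style block: self-attention runs
    globally over the whole sequence, while each token's FFN is selected
    by a per-token modality id (e.g. 0 = text, 1 = image)."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, n_modalities: int = 2):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # One FFN per modality; only these weights are modality-specific here.
        self.ffns = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_modalities)
        ])
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, modality: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); modality: (batch, seq) integer ids.
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # shared global attention
        h = self.norm2(x)
        out = torch.zeros_like(h)
        for m, ffn in enumerate(self.ffns):
            mask = modality == m  # route each token to its modality's FFN
            out[mask] = ffn(h[mask])
        return x + out
```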
@sriniiyer88
Srini Iyer
3 months
RT @jffwng: We just released model weights for our 1B & 8B-parameter BLT: Byte Latent Transformer, a token-less model with sig. improvements…
@sriniiyer88
Srini Iyer
3 months
RT @EntilZhaPR: By popular demand (see our GH issues 😅), we're releasing 1B and 8B weights for our BLT models! We're also hard at work at a…
@sriniiyer88
Srini Iyer
3 months
RT @AIatMeta: 🚀 Meta FAIR is releasing several new research artifacts on our road to advanced machine intelligence (AMI). These latest adva…
@sriniiyer88
Srini Iyer
3 months
RT @gargighosh: Excited to share that we are open-sourcing BLT model weights by popular demand (code was open-sourced already): https://t.co…
@sriniiyer88
Srini Iyer
3 months
Huge thanks to @EntilZhaPR, @gargighosh, @ArtidoroPagnoni, @LukeZettlemoyer for this release!
@sriniiyer88
Srini Iyer
3 months
BLT model weights are out! Responding to popular demand, we just open-sourced model weights for our 1B and 8B BLT models for the research community to play with! Hoping to see many new and improved BLT-based architectures this year!
@sriniiyer88
Srini Iyer
6 months
We're hiring PhD interns for Summer 2025 in Seattle to work with us on improving BLT even more! If this is something that excites you, reach out to me via DM or email ASAP!
@AIatMeta
AI at Meta
6 months
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️
@sriniiyer88
Srini Iyer
6 months
BLT-related post by Meta AI: eliminate all tokenization once and for all!
@AIatMeta
AI at Meta
6 months
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️
@sriniiyer88
Srini Iyer
7 months
RT @dimitrizho: Meta's Byte Latent Transformer (BLT) paper looks like the real deal. Outperforming tokenization models even up to their tes…
@sriniiyer88
Srini Iyer
7 months
RT @edkesuma: Gm. Woke up to a new paper on Byte Latent Transformers (BLT). Now you can increase model size without increasing inference…
@sriniiyer88
Srini Iyer
7 months
RT @PowerSystemAuto: Meta AI's Byte Latent Transformer (BLT) is revolutionizing the tokenization process, enhancing scalability and efficie…
@sriniiyer88
Srini Iyer
7 months
RT @Smol_AI: [13 Dec 2024] Meta BLT: Tokenizer-free, Byte-level LLM. A few months ago @karpathy noted that tokeni…
@sriniiyer88
Srini Iyer
7 months
RT @ZainHasan6: Pretty cool work on a tokenization-less transformer from Meta! > Byte Latent Transformer (BLT), a byte-level LLM architecture,…
@sriniiyer88
Srini Iyer
7 months
RT @AkshatS07: Been waiting for this one, a strong step in removing tokenization from LLMs. Congrats to the team!
@sriniiyer88
Srini Iyer
7 months
RT @jmbollenbacher_: This could be one of the biggest AI papers of the year, if it really works as well as they report in this paper. It's…
@sriniiyer88
Srini Iyer
7 months
RT @_xjdr: Llamas. Tokenizer free?! USING ENTROPY STEERING?!?!! Sometimes the universe conspires to make a paper just for you and it f…
@sriniiyer88
Srini Iyer
7 months
RT @scaling01: I can rest now 🥲 I have gathered all the infinity stones. Thanks @karpathy
@sriniiyer88
Srini Iyer
7 months
RT @liliyu_lili: We scaled up Megabyte and ended up with a BLT! A pure byte-level model has a steeper scaling law than the BPE-based mod…