
Srini Iyer
@sriniiyer88
Followers: 1K
Following: 636
Media: 18
Statuses: 139
Research Scientist at Facebook AI Research
Seattle, WA
Joined February 2012
Turns out, if you teach llamas how to self-reflect and backtrack from wrong reasoning paths, they do extra well on math reasoning!
- MATH 500: 65.8% ➡️ 81.8%
- AMC 23: 37.5% ➡️ 64.4%
- AIME 24: 10% ➡️ 30%
Amazing work by @danieljwkim; can be a nice long weekend read!
Can we improve Llama 3’s reasoning abilities through post-training only? Introducing ASTRO, our new framework that teaches LLMs to perform in-context search and generate long CoT to solve math problems, via SFT and RL. Work done at @aiatmeta. 📄 Paper:
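For readers wondering what "self-reflection and backtracking" might look like as training data, here is a purely illustrative Python sketch. The record format, field names, and the tiny worked problem are my own assumptions, not ASTRO's actual data schema; the point is only that a supervised trace can interleave a wrong attempt, an explicit check, and a backtrack before the final answer.

```python
# Illustrative sketch only: a hypothetical SFT record showing the kind of
# long chain-of-thought with self-reflection and backtracking that ASTRO
# trains on. The schema and trace format are assumptions, not the paper's.

def make_astro_style_example(problem: str, trace_steps: list[str], answer: str) -> dict:
    """Assemble one supervised fine-tuning record: problem -> long CoT -> answer."""
    cot = "\n".join(trace_steps)
    return {
        "prompt": problem,
        "completion": f"{cot}\nFinal answer: {answer}",
    }

example = make_astro_style_example(
    problem="What is the sum of the roots of x^2 - 5x + 6 = 0?",
    trace_steps=[
        "Try factoring: x^2 - 5x + 6 = (x - 1)(x - 6)?",
        "Check: (x - 1)(x - 6) = x^2 - 7x + 6, which does not match.",   # self-reflection
        "That path is wrong; backtrack and try (x - 2)(x - 3).",          # explicit backtrack
        "Check: (x - 2)(x - 3) = x^2 - 5x + 6. Correct, so the roots are 2 and 3.",
        "Sum of roots = 2 + 3 = 5 (consistent with -b/a = 5).",
    ],
    answer="5",
)
print(example["completion"])
```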
This is exciting! Check out our new step-by-step playbook that shows how to build MoT on top of your existing transformer implementation! Also, MoT is now in TMLR! Huge congrats to @liang_weixin, @VictoriaLinML and others!
🎉 Excited to share: "Mixture-of-Transformers (MoT)" has been officially accepted to TMLR (March 2025) and the code is now open-sourced! 📌 GitHub repo: 📄 Paper: How can we reduce pretraining costs for…
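As a rough mental model of the MoT idea (modality-decoupled parameters, global attention), here is a hedged PyTorch sketch, not the official implementation from the repo: every token still flows through shared global self-attention over the mixed-modality sequence, but its feed-forward computation is routed to a per-modality FFN. The class and argument names are mine, and decoupling only the FFN is a simplification (the paper also decouples attention projections and norms).

```python
# Minimal sketch (not the official implementation) of a Mixture-of-Transformers-style
# block: shared global self-attention, modality-specific feed-forward weights.
import torch
import torch.nn as nn

class MoTBlockSketch(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_modalities: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        # One feed-forward network per modality (e.g., text / image / speech).
        self.ffns = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_modalities)
        ])

    def forward(self, x: torch.Tensor, modality_ids: torch.Tensor) -> torch.Tensor:
        # Global self-attention over the whole interleaved sequence.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        # Route each token through the FFN of its own modality.
        h = self.norm2(x)
        out = torch.zeros_like(x)
        for m, ffn in enumerate(self.ffns):
            mask = modality_ids == m              # (batch, seq) boolean mask
            if mask.any():
                out[mask] = ffn(h[mask])
        return x + out

# Toy usage: a batch of 2 sequences of length 6 mixing two modalities.
block = MoTBlockSketch(d_model=32, n_heads=4, n_modalities=2)
tokens = torch.randn(2, 6, 32)
modality_ids = torch.tensor([[0, 0, 1, 1, 0, 1], [1, 1, 1, 0, 0, 0]])
print(block(tokens, modality_ids).shape)  # torch.Size([2, 6, 32])
```

The boolean-mask routing here is chosen for clarity; an efficient implementation would batch tokens by modality or use grouped weights instead of a Python loop.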
RT @jffwng: We just released model weights for our 1B & 8B-parameter BLT: Byte Latent Transformer, token-less model with sig. improvements…
RT @EntilZhaPR: By popular demand (see our GH issues 😅), we're releasing 1B and 8B weights for our BLT models! We're also hard at work at a…
RT @AIatMeta: 🚀 Meta FAIR is releasing several new research artifacts on our road to advanced machine intelligence (AMI). These latest adva…
RT @gargighosh: Excited to share that we are open-sourcing BLT model weights by popular demand (code was open-sourced already): https://t.co…
We're hiring PhD interns for Summer 2025 in Seattle to work with us on improving BLT even more! If this is something that excites you, reach out to me via DM or email ASAP!
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️
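BLT's "patches" are variable-length groups of bytes, and one idea highlighted in the paper thread (the "entropy steering" mentioned below) is placing patch boundaries where a small byte-level model is most uncertain about the next byte. The sketch below is only a toy illustration of that boundary rule: the entropy function is a crude stand-in (empirical entropy of the prefix) rather than a trained byte LM, and the threshold and names are mine.

```python
# Toy sketch of entropy-based patching: start a new patch wherever the
# next-byte entropy estimate crosses a threshold, so predictable runs end up
# in long patches and "hard" regions in short ones. Not BLT's actual patcher.
import math

def next_byte_entropy_stub(prefix: bytes) -> float:
    """Stand-in for a learned byte LM: entropy (bits) of the empirical
    byte distribution of the prefix."""
    if not prefix:
        return 8.0  # maximum uncertainty over 256 byte values
    counts = {}
    for b in prefix:
        counts[b] = counts.get(b, 0) + 1
    total = len(prefix)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_patches(data: bytes, threshold: float = 1.5) -> list[bytes]:
    """Split a byte string into patches at high-entropy positions."""
    patches, start = [], 0
    for i in range(1, len(data)):
        if next_byte_entropy_stub(data[:i]) > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches

# The repetitive prefix becomes one long patch; the varied tail is split up.
print(entropy_patches(b"aaaaabbbcdefg"))  # [b'aaaaabbbcd', b'e', b'f', b'g']
```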
BLT-related post by Meta AI: eliminate all tokenization once and for all!
New from Meta FAIR — Byte Latent Transformer: Patches Scale Better Than Tokens introduces BLT, which, for the first time, matches tokenization-based LLM performance at scale with significant improvements in inference efficiency & robustness. Paper ➡️
RT @dimitrizho: Meta's Byte Latent Transformer (BLT) paper looks like the real deal. Outperforming tokenization models even up to their tes…
RT @edkesuma: Gm. Woke up to a new paper on Byte Latent Transformers (BLT). Now you can increase model size without increasing inference…
RT @PowerSystemAuto: Meta AI's Byte Latent Transformer (BLT) is revolutionizing the tokenization process, enhancing scalability and efficie…
RT @ZainHasan6: Pretty cool work on a tokenization-less transformer from Meta! > Byte Latent Transformer (BLT), a byte-level LLM architecture,…
RT @AkshatS07: Been waiting for this one, a strong step in removing tokenization from LLMs. Congrats to the team!
RT @jmbollenbacher_: This could be one of the biggest AI papers of the year, if it really works as well as they report in this paper. It's…
RT @_xjdr: Llamas. Tokenizer free?! USING ENTROPY STEERING?!?!! Sometimes the universe conspires to make a paper just for you and it f…
RT @liliyu_lili: We scaled up Megabyte and ended up with a BLT! A pure byte-level model has a steeper scaling law than the BPE-based mod…
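A quick note on the "steeper scaling law" phrasing in the retweet above, stated loosely and not taken from the paper's actual fits: if loss is modeled as a power law in training compute, a steeper law means a larger exponent, so the byte-level model's loss falls faster as compute grows and can cross below the BPE baseline even from a worse starting point.

```latex
% Loose reading of "steeper scaling law"; symbols are generic, not the paper's fitted values.
L_{\mathrm{BLT}}(C) \approx a_{\mathrm{BLT}}\, C^{-b_{\mathrm{BLT}}}, \qquad
L_{\mathrm{BPE}}(C) \approx a_{\mathrm{BPE}}\, C^{-b_{\mathrm{BPE}}}, \qquad
b_{\mathrm{BLT}} > b_{\mathrm{BPE}}
\;\Rightarrow\;
\frac{L_{\mathrm{BLT}}(C)}{L_{\mathrm{BPE}}(C)}
= \frac{a_{\mathrm{BLT}}}{a_{\mathrm{BPE}}}\, C^{-(b_{\mathrm{BLT}} - b_{\mathrm{BPE}})}
\longrightarrow 0 \quad \text{as } C \to \infty.
```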