Infini-AI-Lab
@InfiniAILab
Followers: 1K · Following: 55 · Media: 46 · Statuses: 88
Pittsburgh, PA
Joined September 2024
See you at the tutorial! 🎉 Scale Test-Time Compute on Modern Hardware ⚙️💻 with @BeidiChen @Azaliamirh, 1:30 - 4pm, Upper Level Ballroom 6CDEF. Excited to chat about the latest updates in models, algorithms, and systems for TTS! 🔊🤖✨ 🔗
0 replies · 2 reposts · 11 likes
The whole @InfiniAILab is at #NeurIPS this week! Our group is currently working on diverse directions in GenAI, e.g., Scalable and Efficient RL, VideoGen, Modeling, and Model Arch & Sys Co-Design (many new releases coming!!). Come and talk to us @RJ_Sadhukhan @IronSteveZhou
0 replies · 11 reposts · 108 likes
🚨 Our new paper is out! What if your code agent fixes a bug, passes all tests, and still introduces a vulnerability? Even benign users can unknowingly trigger vulnerabilities in code agents. FCV-Attack shows that “functionally correct” doesn’t always mean “secure.”
🚀If your code agent generates a patch that passes all tests, should you trust it and merge it automatically? ⚠️You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
0 replies · 1 repost · 5 likes
📣 We study a threat model in which users intend to leverage an LLM agent to fix problems in the code base, but the agent can insert vulnerabilities while still passing all the tests. I think security will become a more and more important problem as agents' abilities grow. So much fun
🚀If your code agent generates a patch that passes all tests, should you trust it and merge it automatically? ⚠️You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
0 replies · 3 reposts · 30 likes
Joint work with @BrynnPeng , @shxjames , Lei Li, @Xinyu2ML , @christodorescu , Ravi Mangal, Corina Pasareanu, @haizhong_zheng , @BeidiChen
0 replies · 0 reposts · 2 likes
Digging deeper, we found the attack works by contaminating the model's internal state. Even if the agent's actions look correct, the malicious instruction from the initial prompt poisons the final generated patch. This means behavior-level defenses are not enough to stop this
1 reply · 0 reposts · 2 likes
Motivated by this, we designed FCV-Attack: attackers can implicitly or explicitly induce LLM agents to generate FCV patches in a black-box, single-query setting. Here is a summary of our results: ✅ Successfully compromises 12/12 tested agent-model combos. ✅ Most
1 reply · 0 reposts · 1 like
What does a 'functionally correct yet vulnerable' (FCV) patch look like? Imagine a patch that fixes a login bug (✅ functional correctness) but also adds a new logging line that writes the user's password to a public file (❌ security vulnerability). Those FCV patches even
1 reply · 0 reposts · 1 like
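To make the FCV example in the tweet above concrete, here is a hypothetical sketch (file and function names are invented for illustration, not code from the paper): the login fix is what the test suite checks, while the added logging call quietly leaks the credentials.

```python
# Hypothetical FCV ("functionally correct yet vulnerable") patch, for illustration only.
import hashlib

def login(username: str, password: str, user_db: dict) -> bool:
    # The "fix" the tests exercise: compare hashed passwords instead of plaintext.
    stored_hash = user_db.get(username)
    given_hash = hashlib.sha256(password.encode()).hexdigest()
    ok = stored_hash is not None and stored_hash == given_hash

    # The vulnerability no test covers: credentials written to a world-readable log
    # (CWE-532, insertion of sensitive information into a log file).
    with open("/tmp/auth.log", "a") as log_file:
        log_file.write(f"login attempt: {username}:{password}\n")

    return ok
```

A test suite that only checks login()'s return value accepts this patch, which is exactly the gap between "functionally correct" and "secure" that the thread points at.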
🚀If your code agent generates a patch that passes all tests, should you trust it and merge it automatically? ⚠️You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
2 replies · 10 reposts · 23 likes
🚀 Super excited to share our recent research about RL on stale data. 💪Meet M2PO: a powerful algorithm that turns stale rollouts into gold. Stable training, no performance drop, even with 256-update-stale data.
🤔Can we train RL on LLMs with extremely stale data? 🚀Our latest study says YES! Stale data can be as informative as on-policy data, unlocking more scalable, efficient asynchronous RL for LLMs. We introduce M2PO, an off-policy RL algorithm that keeps training stable and
1 reply · 2 reposts · 18 likes
📢🔥 New off-policy RL for LLMs — now training a 32B model with 200+ stale steps for the first time, while still matching on-policy accuracy 💪 A big step toward scalable & decentralized agent training 😉
🤔Can we train RL on LLMs with extremely stale data? 🚀Our latest study says YES! Stale data can be as informative as on-policy data, unlocking more scalable, efficient asynchronous RL for LLMs. We introduce M2PO, an off-policy RL algorithm that keeps training stable and
4 replies · 19 reposts · 212 likes
Motivated by this, we propose M2PO (Second-Moment Trust Policy Optimization), a training algorithm that combines a batch-level constraint with token-level masking to stabilize off-policy RL on stale data. ✅Uses M₂, a robust and variance-sensitive metric, to constrain distribution shift; ✅
1 reply · 0 reposts · 6 likes
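A minimal sketch of the batch-level constraint plus token-level masking described above, under the assumption (an editorial guess, not the paper's definition) that M₂ is the second moment of per-token log importance ratios and that masking drops the largest contributors until the batch-level constraint holds:

```python
# Sketch of a batch-level second-moment constraint with token-level masking,
# in the spirit of M2PO as described in the tweet above. The exact metric,
# threshold, and masking rule in the paper may differ.
import numpy as np

def m2_mask(logp_new: np.ndarray, logp_old: np.ndarray, tau: float = 0.02) -> np.ndarray:
    """Return a 0/1 mask over tokens keeping the second moment of log-ratios below tau."""
    log_ratio = logp_new - logp_old          # per-token log importance ratio
    contrib = log_ratio ** 2                 # each token's contribution to M2
    mask = np.ones_like(contrib)
    for idx in np.argsort(-contrib):         # most off-policy tokens first
        m2 = (contrib * mask).sum() / max(mask.sum(), 1.0)
        if m2 <= tau:                        # batch-level constraint satisfied
            break
        mask[idx] = 0.0                      # drop the worst offender and re-check
    return mask

# Toy usage: stale rollouts where a handful of tokens are badly off-policy.
rng = np.random.default_rng(0)
logp_old = rng.normal(-2.0, 0.5, size=1024)
logp_new = logp_old + rng.normal(0.0, 0.1, size=1024)
logp_new[:8] += 2.0                          # a few tokens with large distribution shift
mask = m2_mask(logp_new, logp_old)
print(f"kept {int(mask.sum())}/1024 tokens") # most tokens survive; the outliers are masked
```

The appeal of a batch-level constraint like this is that it discards only the tokens that would blow up the variance of the importance-weighted update, instead of clipping or dropping stale data wholesale.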
Our further analysis reveals the dual nature of high-entropy tokens: while high-entropy tokens are crucial for learning progress, they also introduce instability in the off-policy setting. More high-entropy tokens utilized → Better performance, but less stable training. 🧵 3/4
1 reply · 0 reposts · 3 likes
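For the entropy measurement the tweet above refers to, a small sketch of how per-token predictive entropy can be computed from the policy's logits (an illustration only; the mask below is a placeholder for whatever off-policy masking is applied, not the paper's procedure):

```python
# Per-token predictive entropy from logits, plus a check of how many of the
# highest-entropy tokens a given token mask keeps. Illustration only.
import numpy as np

def token_entropy(logits: np.ndarray) -> np.ndarray:
    """Entropy of the softmax distribution at each token position. logits: [tokens, vocab]."""
    z = logits - logits.max(axis=-1, keepdims=True)     # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

rng = np.random.default_rng(1)
logits = rng.normal(size=(1024, 32))                    # toy vocabulary of 32
ent = token_entropy(logits)
high = ent > np.quantile(ent, 0.8)                      # top-20% highest-entropy tokens
mask = rng.random(1024) > 0.1                           # placeholder off-policy token mask
print(f"high-entropy tokens kept: {int((high & mask).sum())}/{int(high.sum())}")
```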
In our study, we observe an interesting “Prosperity before Collapse” phenomenon: although training without a trust region eventually collapses, it achieves substantially better performance prior to collapse (even matches on-policy training). This indicates that the stale data
1 reply · 0 reposts · 3 likes
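For readers less familiar with the jargon: the "trust region" in the tweet above is the mechanism that keeps the per-token importance ratio close to 1 during updates. A generic sketch of a clipped (trust-region) surrogate versus the unclipped objective that "training without a trust region" implies, in the standard PPO-style form rather than the paper's exact loss:

```python
# Clipped (trust-region) vs. unclipped policy-gradient surrogates, generic PPO-style form.
import numpy as np

def clipped_objective(ratio: np.ndarray, adv: np.ndarray, eps: float = 0.2) -> float:
    # Contributions are bounded once the importance ratio leaves [1 - eps, 1 + eps].
    return float(np.minimum(ratio * adv, np.clip(ratio, 1 - eps, 1 + eps) * adv).mean())

def unclipped_objective(ratio: np.ndarray, adv: np.ndarray) -> float:
    # No trust region: large ratios from stale rollouts pass through unchecked,
    # which can learn faster at first but eventually destabilizes training.
    return float((ratio * adv).mean())
```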
🤔Can we train RL on LLMs with extremely stale data? 🚀Our latest study says YES! Stale data can be as informative as on-policy data, unlocking more scalable, efficient asynchronous RL for LLMs. We introduce M2PO, an off-policy RL algorithm that keeps training stable and
4 replies · 41 reposts · 229 likes
1/🧵 🎉Introducing Bridge🌉, our parallel LLM inference scaling method that shares info between all responses to an input prompt throughout the generation process! Bridge greatly improves the quality of individual responses and the entire response set! 📜 https://t.co/qL39PrzJL5
1 reply · 4 reposts · 18 likes
🤖 GPT-5 supports 128K output / 400K input tokens. 📜 Wiles’s Fermat proof took ~88K tokens — the final output only. 🧩 Add years of exploration, likely >880K tokens of reasoning. 🧠 Real intelligence isn’t about making it short — it’s about exploring the sparsity in the logic.
0 replies · 2 reposts · 8 likes
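Back-of-the-envelope reading of the numbers above: if the exploration behind the proof is taken to be roughly 10× the final write-up, that is 88K × 10 = 880K tokens, more than 2× GPT-5's 400K input window and roughly 7× its 128K output cap.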
Huge thanks to @tinytitans_icml for an amazing workshop — see you next year! Honored to receive a Best Paper Award 🏆 Let’s unlock the potential of sparsity! Next up: scaling to hundreds/thousands of rollouts? Or making powerful R1/K2-level LLMs (not just 8B 4-bit models) run
1 reply · 9 reposts · 45 likes