Cheng Luo Profile
Cheng Luo

@ChengLuo_lc

Followers
79
Following
80
Media
5
Statuses
28

Independent Researcher

San Diego
Joined January 2020
@BeidiChen
Beidi Chen
19 days
The whole @InfiniAILab is at #NeurIPS this week! Our group is currently working on diverse directions of GenAI, e.g., Scalable and Efficient RL, VideoGen, Modeling, Model Arch & Sys Co-Design (Many new releases coming!!). Come and talk to us @RJ_Sadhukhan @IronSteveZhou
0
11
107
@ChengLuo_lc
Cheng Luo
1 month
Great work! We are looking forward to seeing this work at our NeurIPS 2025 Efficient Reasoning workshop.
openreview.net
Reinforcement Learning with Verifiable Rewards (RLVR) reliably improves the reasoning performance of large language models, yet it appears to modify only a small fraction of parameters. We revisit...
@zhu_hanqin41424
Hanqing Zhu
1 month
🚨 New Work! 🤔 Is RL black-box weight tinkering? 😉 No. We provably show RLVR follows a 🧭 — always updating the same off-principal regions while preserving the model's core spectra. ⚠️ Different optimization regime than SFT — SFT-era PEFT tricks can misfire (like PiSSA, the
1
0
5
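To make the spectral claim above concrete, here is a minimal numpy sketch of the kind of diagnostic it suggests, with random matrices standing in for real checkpoints; the rank cutoff `k` and the toy update are assumptions, not the paper's code.

```python
# Sketch: measure where a weight update lands relative to the base weight's
# principal (top-singular-vector) subspace, and whether the core spectrum moves.
# Toy stand-ins for real checkpoints; with real models you would load the
# base and RLVR-tuned weight matrices instead (an assumption, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, k = 512, 512, 64            # k = assumed "principal" rank cutoff

W_base = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
W_rlvr = W_base + 1e-2 * rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
dW = W_rlvr - W_base

# Principal subspaces of the *base* weights.
U, S, Vt = np.linalg.svd(W_base, full_matrices=False)
U_k, V_k = U[:, :k], Vt[:k, :].T

# Fraction of the update's energy inside the top-k row/column subspaces.
dW_principal = U_k @ (U_k.T @ dW @ V_k) @ V_k.T
frac = np.linalg.norm(dW_principal) ** 2 / np.linalg.norm(dW) ** 2
print(f"update energy in principal subspace: {frac:.3f}")
print(f"update energy off-principal:         {1 - frac:.3f}")

# "Preserving the core spectra": compare top singular values before and after.
S_rlvr = np.linalg.svd(W_rlvr, compute_uv=False)
print("relative change of top-10 singular values:",
      np.abs(S_rlvr[:10] - S[:10]) / S[:10])
```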
@xxunhuang
Xun Huang
3 months
Glad that we are taking the opposite approach: while OpenAI is adding compute-intensive offerings with extra fees, we're making video generation less compute-intensive so everyone can interact with it in real-time. Algorithmic breakthroughs > throwing more compute.
@sama
Sam Altman
3 months
Over the next few weeks, we are launching some new compute-intensive offerings. Because of the associated costs, some features will initially only be available to Pro subscribers, and some new products will have additional fees. Our intention remains to drive the cost of
30
23
761
@ChengLuo_lc
Cheng Luo
3 months
We need more reviewers for the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf. If you are interested, please fill out the nomination form.
docs.google.com
We strive to expand our reviewing pool by welcoming newer members of the community. We encourage nominations from senior community members as well as self-nominations from individuals who have either...
@ChengLuo_lc
Cheng Luo
5 months
🌟 Announcing the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf 2025 — Dec 6 or 7, San Diego ! 📣 We welcome submissions! Submit your work here: https://t.co/13TumRabVh 🗓️ Deadline: September 1, 2025 (AoE) 🔗 Website: https://t.co/tcTfZ6r6lS 💬 Topics
0
5
16
@ChengLuo_lc
Cheng Luo
4 months
🌟 Reminder: Submission Deadline Approaching! 🌟 The 1st Workshop on Efficient Reasoning (ER) @ NeurIPS 2025 — happening Dec 6 or 7 in San Diego — is fast approaching, and we’d love to see your work there! 📌 Submission Deadline: September 1, 2025 (AoE) 🔗 Submit here:
openreview.net
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop ER
@ChengLuo_lc
Cheng Luo
5 months
🌟 Announcing the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf 2025 — Dec 6 or 7, San Diego ! 📣 We welcome submissions! Submit your work here: https://t.co/13TumRabVh 🗓️ Deadline: September 1, 2025 (AoE) 🔗 Website: https://t.co/tcTfZ6r6lS 💬 Topics
0
4
27
@ChengLuo_lc
Cheng Luo
4 months
@NeurIPSConf 🌟 Reminder: Submission Deadline Approaching! 🌟 The 1st Workshop on Efficient Reasoning (ER) @ NeurIPS 2025 — happening Dec 6 or 7 in San Diego — is fast approaching, and we’d love to see your work there! 📌 Submission Deadline: September 1, 2025 (AoE) 🔗 Submit here:
openreview.net
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop ER
0
1
1
@ChengLuo_lc
Cheng Luo
5 months
🌟 Announcing the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf 2025 — Dec 6 or 7, San Diego ! 📣 We welcome submissions! Submit your work here: https://t.co/13TumRabVh 🗓️ Deadline: September 1, 2025 (AoE) 🔗 Website: https://t.co/tcTfZ6r6lS 💬 Topics
1
2
2
@InfiniAILab
Infini-AI-Lab
6 months
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: https://t.co/J9osByhWUf 🧵 1/n
6
85
220
@InfiniAILab
Infini-AI-Lab
7 months
🥳 Happy to share our new work – Kinetics: Rethinking Test-Time Scaling Laws 🤔 How to effectively build a powerful reasoning agent? Existing compute-optimal scaling laws suggest 64K thinking tokens + 1.7B model > 32B model. But it only shows half of the picture! 🚨 The O(N²)
7
72
248
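As a rough illustration of the quadratic term hinted at above, here is a back-of-the-envelope decode-FLOP comparison. The layer counts and hidden sizes are assumptions (roughly matching public ~1.7B and ~32B model shapes), the FLOP model is the usual simplification, and KV-cache memory traffic, which the Kinetics argument also stresses, is ignored here.

```python
# Back-of-envelope decode cost: 2*params FLOPs per token for linear layers,
# plus roughly 4*n_layers*d_model*position FLOPs per token for attention
# scores and values, which sums to ~2*n_layers*d_model*N^2 over N tokens.
# Model shapes below are assumptions, not taken from the Kinetics paper.
def decode_flops(params, n_layers, d_model, n_tokens):
    linear = 2 * params * n_tokens
    attention = 2 * n_layers * d_model * n_tokens ** 2
    return linear, attention

small = decode_flops(params=1.7e9, n_layers=28, d_model=2048, n_tokens=64_000)
large = decode_flops(params=32e9,  n_layers=64, d_model=5120, n_tokens=4_000)

for name, (lin, attn) in [("1.7B model, 64K thinking tokens", small),
                          ("32B model, 4K tokens", large)]:
    print(f"{name}: linear {lin:.2e} FLOPs, attention {attn:.2e} FLOPs, "
          f"attention share {attn / (lin + attn):.1%}")
# At 64K tokens the quadratic attention term dominates the small model's cost,
# which is the part a parameter-count-only comparison leaves out.
```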
@AnimaAnandkumar
Prof. Anima Anandkumar
10 months
HeadInfer: Unlocking Long-Context LLM Inference on Consumer GPUs (Million-level Tokens) *long-context inputs require large GPU memory. *A standard LLM like Llama-3-8B requires 207GB of GPU memory for 1 million tokens — far beyond the capabilities of consumer GPUs like the RTX
arxiv.org
Transformer-based large language models (LLMs) demonstrate impressive performance in long context generation. Extending the context length has disproportionately shifted the memory footprint of...
2
13
88
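The 207 GB figure above is dominated by the KV cache, which can be roughly reconstructed from the cache-size formula. A back-of-the-envelope sketch, assuming the public Llama-3-8B shape (32 layers, 8 KV heads via GQA, head dim 128) and fp16 storage:

```python
# Rough KV-cache sizing for a Llama-3-8B-like model at ~1M context,
# illustrating why long-context inference exceeds consumer-GPU memory.
# Config values and fp16 storage are assumptions based on the public
# Llama-3-8B architecture, not numbers taken from the HeadInfer paper.
n_layers, n_kv_heads, head_dim = 32, 8, 128
seq_len = 1_048_576            # ~1M tokens
bytes_per_elem = 2             # fp16 / bf16
kv = 2                         # one K and one V tensor per layer

kv_cache_bytes = kv * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem
print(f"KV cache alone: {kv_cache_bytes / 2**30:.0f} GiB")   # ~128 GiB
# Adding ~16 GB of fp16 weights plus per-layer activations for a 1M-token
# forward pass lands in the ~200 GB range, while a consumer card like the
# RTX 4090 has 24 GB, hence the need for offloading.
```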
@ChengLuo_lc
Cheng Luo
1 year
Dec. 10-Dec. 15 at NeurIPS’24. Our poster will be presented Friday at 11 am. See u in Vancouver!
0
0
1
@ChengLuo_lc
Cheng Luo
1 year
🤩🤩 We introduce MST, a memory-efficient transformer, reducing intermediate memory usage and enabling longer sequence training without compromising performance. 🚀🚀 How does it work? MST partitions the sequence into mini-sequences and applies activation recomputation for optimal
1
1
2
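A toy PyTorch sketch of the idea described in the tweet above, not the actual wdlctc/mini-s implementation: split the MLP's sequence dimension into mini-sequences and recompute each chunk's activations in the backward pass, so the large (seq_len x d_ff) intermediate is never held for the full sequence at once.

```python
# Toy mini-sequence MLP: process the sequence in chunks with activation
# recomputation, trading a little extra compute for much less peak memory.
# A conceptual sketch only; layer sizes here are small placeholders.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class MiniSeqMLP(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_chunks=8):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)
        self.n_chunks = n_chunks

    def _block(self, x_chunk):
        # The (chunk_len x d_ff) intermediate exists only inside this call.
        return self.down(torch.nn.functional.silu(self.up(x_chunk)))

    def forward(self, x):                      # x: (batch, seq, d_model)
        outs = []
        for x_chunk in torch.chunk(x, self.n_chunks, dim=1):
            # Recompute this chunk's activations during backward instead of
            # caching them for the whole sequence.
            outs.append(checkpoint(self._block, x_chunk, use_reentrant=False))
        return torch.cat(outs, dim=1)

mlp = MiniSeqMLP()
x = torch.randn(2, 2048, 1024, requires_grad=True)
mlp(x).sum().backward()    # runs with the chunked, recompute-based memory profile
```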
@ChengLuo_lc
Cheng Luo
1 year
Our MST got accepted at NeurIPS 2024.
0
0
0
@ChengLuo_lc
Cheng Luo
1 year
🤩🤩 We introduce MST, a memory-efficient transformer, reducing intermediate memory usage and enabling longer sequence training without compromising performance. 🚀🚀 How does it work? MST partitions the sequence into mini-sequences and applies activation recomputation for optimal
1
1
2
@rohanpaul_ai
Rohan Paul
1 year
Really 👀 new paper: MINI-SEQUENCE TRANSFORMER claims to extend the maximum context length of Qwen, Mistral, and Gemma-2 by 12-24x. MST enables efficient long-sequence training by reducing intermediate memory overhead. It achieves a 2.7x improvement in perplexity with 30k context
2
4
10
@papers_anon
PapersAnon
1 year
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer. Saw a 16x increase in training sequence length with 55% MFU. Agnostic to ZeRO and FSDP memory optimization techniques. Links below
1
7
58
@ChengLuo_lc
Cheng Luo
1 year
Curious about boosting context length in Llama 3.1 by 16x? 🦙 Our Mini-sequence Transformer (MST) offers insights! 🚀 MST extends context length with no performance drop. 📈 Our paper: https://t.co/rOOl4VVYEw and GitHub: https://t.co/hen76AwNVV. #Llama #NLP #AI #llama31
github.com
Contribute to wdlctc/mini-s development by creating an account on GitHub.
0
1
1
@AnimaAnandkumar
Prof. Anima Anandkumar
1 year
Introducing a long-context transformer using mini-sequences. It is a simple and effective method for highly efficient and accurate LLM training with extremely long sequences. Our research demonstrates that the Llama3-8B model can be trained with context lengths up to 60k tokens on
1
8
47