Cheng Luo Profile
Cheng Luo

@ChengLuo_lc

Followers
79
Following
80
Media
5
Statuses
28

Independent Researcher

San Diego
Joined January 2020
@BeidiChen
Beidi Chen
19 days
The whole @InfiniAILab is at #NeurIPS this week! Our group is currently working on diverse directions of GenAI, e.g., Scalable and Efficient RL, VideoGen, Modeling, Model Arch & Sys Co-Design (Many new releases coming!!). Come and talk to us @RJ_Sadhukhan @IronSteveZhou
0
11
107
@ChengLuo_lc
Cheng Luo
1 month
Great work! We are looking forward to seeing this work at our NeurIPS 2025 Efficient Reasoning workshop.
openreview.net
Reinforcement Learning with Verifiable Rewards (RLVR) reliably improves the reasoning performance of large language models, yet it appears to modify only a small fraction of parameters. We revisit...
@zhu_hanqin41424
Hanqing Zhu
1 month
🚨 New Work! 🤔 Is RL black-box weight tinkering? 😉 No. We provably show RLVR follows a 🧭 — always updating the same off-principal regions while preserving the model's core spectra. ⚠️ Different optimization regime than SFT — SFT-era PEFT tricks can misfire (like PiSSA, the
1
0
5
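To make the spectral claim above concrete, here is a minimal numpy sketch of the kind of diagnostic it suggests, with random matrices standing in for real checkpoints; the rank cutoff `k` and the toy update are assumptions, not the paper's code.

```python
# Sketch: measure where a weight update lands relative to the base weight's
# principal (top-singular-vector) subspace, and whether the core spectrum moves.
# Toy stand-ins for real checkpoints; with real models you would load the
# base and RLVR-tuned weight matrices instead (an assumption, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, k = 512, 512, 64            # k = assumed "principal" rank cutoff

W_base = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
W_rlvr = W_base + 1e-2 * rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
dW = W_rlvr - W_base

# Principal subspaces of the *base* weights.
U, S, Vt = np.linalg.svd(W_base, full_matrices=False)
U_k, V_k = U[:, :k], Vt[:k, :].T

# Fraction of the update's energy inside the top-k row/column subspaces.
dW_principal = U_k @ (U_k.T @ dW @ V_k) @ V_k.T
frac = np.linalg.norm(dW_principal) ** 2 / np.linalg.norm(dW) ** 2
print(f"update energy in principal subspace: {frac:.3f}")
print(f"update energy off-principal:         {1 - frac:.3f}")

# "Preserving the core spectra": compare top singular values before and after.
S_rlvr = np.linalg.svd(W_rlvr, compute_uv=False)
print("relative change of top-10 singular values:",
      np.abs(S_rlvr[:10] - S[:10]) / S[:10])
```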
@xxunhuang
Xun Huang
3 months
Glad that we are taking the opposite approach: while OpenAI is adding compute-intensive offerings with extra fees, we're making video generation less compute-intensive so everyone can interact with it in real-time. Algorithmic breakthroughs > throwing more compute.
@sama
Sam Altman
3 months
Over the next few weeks, we are launching some new compute-intensive offerings. Because of the associated costs, some features will initially only be available to Pro subscribers, and some new products will have additional fees. Our intention remains to drive the cost of
30
23
761
@ChengLuo_lc
Cheng Luo
3 months
We need more reviewers for the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf. If you are interested, please fill out the nomination form.
docs.google.com
We strive to expand our reviewing pool by welcoming newer members of the community. We encourage nominations from senior community members as well as self-nominations from individuals who have either...
@ChengLuo_lc
Cheng Luo
5 months
🌟 Announcing the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf 2025 — Dec 6 or 7, San Diego ! 📣 We welcome submissions! Submit your work here: https://t.co/13TumRabVh 🗓️ Deadline: September 1, 2025 (AoE) 🔗 Website: https://t.co/tcTfZ6r6lS 💬 Topics
0
5
16
@ChengLuo_lc
Cheng Luo
4 months
🌟 Reminder: Submission Deadline Approaching! 🌟 The 1st Workshop on Efficient Reasoning (ER) @ NeurIPS 2025 — happening Dec 6 or 7 in San Diego — is fast approaching, and we’d love to see your work there! 📌 Submission Deadline: September 1, 2025 (AoE) 🔗 Submit here:
openreview.net
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop ER
@ChengLuo_lc
Cheng Luo
5 months
🌟 Announcing the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf 2025 — Dec 6 or 7, San Diego ! 📣 We welcome submissions! Submit your work here: https://t.co/13TumRabVh 🗓️ Deadline: September 1, 2025 (AoE) 🔗 Website: https://t.co/tcTfZ6r6lS 💬 Topics
0
4
27
@ChengLuo_lc
Cheng Luo
4 months
@NeurIPSConf 🌟 Reminder: Submission Deadline Approaching! 🌟 The 1st Workshop on Efficient Reasoning (ER) @ NeurIPS 2025 — happening Dec 6 or 7 in San Diego — is fast approaching, and we’d love to see your work there! 📌 Submission Deadline: September 1, 2025 (AoE) 🔗 Submit here:
openreview.net
Welcome to the OpenReview homepage for NeurIPS 2025 Workshop ER
0
1
1
@ChengLuo_lc
Cheng Luo
5 months
🌟 Announcing the 1st Workshop on Efficient Reasoning (ER) at @NeurIPSConf 2025 — Dec 6 or 7, San Diego ! 📣 We welcome submissions! Submit your work here: https://t.co/13TumRabVh 🗓️ Deadline: September 1, 2025 (AoE) 🔗 Website: https://t.co/tcTfZ6r6lS 💬 Topics
1
2
2
@InfiniAILab
Infini-AI-Lab
6 months
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: https://t.co/J9osByhWUf 🧵 1/n
6
85
220
@InfiniAILab
Infini-AI-Lab
7 months
🥳 Happy to share our new work – Kinetics: Rethinking Test-Time Scaling Laws 🤔 How to effectively build a powerful reasoning agent? Existing compute-optimal scaling laws suggest 64K thinking tokens + 1.7B model > 32B model. But it only shows half of the picture! 🚨 The O(N²)
7
72
248
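As a rough illustration of the quadratic term hinted at above, here is a back-of-the-envelope decode-FLOP comparison. The layer counts and hidden sizes are assumptions (roughly matching public ~1.7B and ~32B model shapes), the FLOP model is the usual simplification, and KV-cache memory traffic, which the Kinetics argument also stresses, is ignored here.

```python
# Back-of-envelope decode cost: 2*params FLOPs per token for linear layers,
# plus roughly 4*n_layers*d_model*position FLOPs per token for attention
# scores and values, which sums to ~2*n_layers*d_model*N^2 over N tokens.
# Model shapes below are assumptions, not taken from the Kinetics paper.
def decode_flops(params, n_layers, d_model, n_tokens):
    linear = 2 * params * n_tokens
    attention = 2 * n_layers * d_model * n_tokens ** 2
    return linear, attention

small = decode_flops(params=1.7e9, n_layers=28, d_model=2048, n_tokens=64_000)
large = decode_flops(params=32e9,  n_layers=64, d_model=5120, n_tokens=4_000)

for name, (lin, attn) in [("1.7B model, 64K thinking tokens", small),
                          ("32B model, 4K tokens", large)]:
    print(f"{name}: linear {lin:.2e} FLOPs, attention {attn:.2e} FLOPs, "
          f"attention share {attn / (lin + attn):.1%}")
# At 64K tokens the quadratic attention term dominates the small model's cost,
# which is the part a parameter-count-only comparison leaves out.
```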
@AnimaAnandkumar
Prof. Anima Anandkumar
10 months
HeadInfer: Unlocking Long-Context LLM Inference on Consumer GPUs (Million-level Tokens) *long-context inputs require large GPU memory. *A standard LLM like Llama-3-8B requires 207GB of GPU memory for 1 million tokens — far beyond the capabilities of consumer GPUs like the RTX
arxiv.org
Transformer-based large language models (LLMs) demonstrate impressive performance in long context generation. Extending the context length has disproportionately shifted the memory footprint of...
2
13
88
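The 207 GB figure above is dominated by the KV cache, which can be roughly reconstructed from the cache-size formula. A back-of-the-envelope sketch, assuming the public Llama-3-8B shape (32 layers, 8 KV heads via GQA, head dim 128) and fp16 storage:

```python
# Rough KV-cache sizing for a Llama-3-8B-like model at ~1M context,
# illustrating why long-context inference exceeds consumer-GPU memory.
# Config values and fp16 storage are assumptions based on the public
# Llama-3-8B architecture, not numbers taken from the HeadInfer paper.
n_layers, n_kv_heads, head_dim = 32, 8, 128
seq_len = 1_048_576            # ~1M tokens
bytes_per_elem = 2             # fp16 / bf16
kv = 2                         # one K and one V tensor per layer

kv_cache_bytes = kv * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem
print(f"KV cache alone: {kv_cache_bytes / 2**30:.0f} GiB")   # ~128 GiB
# Adding ~16 GB of fp16 weights plus per-layer activations for a 1M-token
# forward pass lands in the ~200 GB range, while a consumer card like the
# RTX 4090 has 24 GB, hence the need for offloading.
```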
@ChengLuo_lc
Cheng Luo
1 year
Dec. 10-Dec. 15 at NeurIPS’24. Our poster will be presented Friday at 11 am. See u in Vancouver!
0
0
1
@ChengLuo_lc
Cheng Luo
1 year
🤩🤩 We introduce MST, a memory-efficient transformer, reducing intermediate memory usage and enabling longer sequence training without compromising performance. 🚀🚀 How does it work? MST partitions the sequence into mini-sequences and applies activation recomputation for optimal
1
1
2
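A toy PyTorch sketch of the idea described in the tweet above, not the actual wdlctc/mini-s implementation: split the MLP's sequence dimension into mini-sequences and recompute each chunk's activations in the backward pass, so the large (seq_len x d_ff) intermediate is never held for the full sequence at once.

```python
# Toy mini-sequence MLP: process the sequence in chunks with activation
# recomputation, trading a little extra compute for much less peak memory.
# A conceptual sketch only; layer sizes here are small placeholders.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class MiniSeqMLP(nn.Module):
    def __init__(self, d_model=1024, d_ff=4096, n_chunks=8):
        super().__init__()
        self.up = nn.Linear(d_model, d_ff)
        self.down = nn.Linear(d_ff, d_model)
        self.n_chunks = n_chunks

    def _block(self, x_chunk):
        # The (chunk_len x d_ff) intermediate exists only inside this call.
        return self.down(torch.nn.functional.silu(self.up(x_chunk)))

    def forward(self, x):                      # x: (batch, seq, d_model)
        outs = []
        for x_chunk in torch.chunk(x, self.n_chunks, dim=1):
            # Recompute this chunk's activations during backward instead of
            # caching them for the whole sequence.
            outs.append(checkpoint(self._block, x_chunk, use_reentrant=False))
        return torch.cat(outs, dim=1)

mlp = MiniSeqMLP()
x = torch.randn(2, 2048, 1024, requires_grad=True)
mlp(x).sum().backward()    # runs with the chunked, recompute-based memory profile
```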
@ChengLuo_lc
Cheng Luo
1 year
Our MST got accepted at NeurIPS 2024.
0
0
0
@ChengLuo_lc
Cheng Luo
1 year
🤩🤩 We introduce MST, a memory-efficient transformer, reducing intermediate memory usage and enabling longer sequence training without compromising performance. 🚀🚀 How does it work? MST partitions the sequence into mini-sequences and applies activation recomputation for optimal
1
1
2
@rohanpaul_ai
Rohan Paul
1 year
Really 👀 new paper: MINI-SEQUENCE TRANSFORMER claims to extend the maximum context length of Qwen, Mistral, and Gemma-2 by 12-24x. MST enables efficient long-sequence training by reducing intermediate memory overhead. It achieves a 2.7x improvement in perplexity with 30k context
2
4
10
@papers_anon
PapersAnon
1 year
Training Ultra Long Context Language Model with Fully Pipelined Distributed Transformer. Saw a 16x increase in training sequence length with 55% MFU. Agnostic to ZeRO and FSDP memory optimization techniques. Links below
1
7
58
@ChengLuo_lc
Cheng Luo
1 year
Curious about boosting context length in Llama 3.1 by 16x? 🦙 Our Mini-sequence Transformer (MST) offers insights! 🚀 MST extends context length with no performance drop. 📈 Our paper: https://t.co/rOOl4VVYEw and GitHub: https://t.co/hen76AwNVV. #Llama #NLP #AI #llama31
github.com
Contribute to wdlctc/mini-s development by creating an account on GitHub.
0
1
1
@AnimaAnandkumar
Prof. Anima Anandkumar
1 year
Introducing a long-context transformer using mini-sequences. It is a simple and effective method for highly efficient and accurate LLM training with extremely long sequences. Our research demonstrates that the Llama3-8B model can be trained with context lengths up to 60k tokens on
1
8
47