Zhengyang Tang

@zhengyang_42

77 Followers · 220 Following · 4 Media · 18 Statuses

PhD candidate @cuhksz, Intern @Alibaba_Qwen. Prev: @MSFTResearch, @TencentGlobal, @AlibabaGroup.

Shanghai, China
Joined July 2016
@rohanpaul_ai
Rohan Paul
1 month
The paper shows a small model can get very good at optimization by fixing its own reasoning with tiny hints. It shows that careful data curation, not just large amounts of data, can make a big difference in how well the model learns to reason and code accurately. A 4B model rivals a 671B model.
7
42
211
@zhengyang_42
Zhengyang Tang
2 months
So proud of what the Qwen team is shipping! This demo perfectly showcases the efficient tool-use we focused on in our CoRT paper. And the timing couldn't be better: thrilled to announce that CoRT has been accepted to NeurIPS 2025! It's a privilege to be part of this journey from…
arxiv.org
Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language reasoning with long chain-of-thought (CoT), yet they remain inefficient or inaccurate when...
@Alibaba_Qwen
Qwen
2 months
🚀With Code Interpreter + Web Search, Qwen Chat can now fetch data AND visualize it in charts — instantly. Need a 7-day weather trend? Done. 🌡️📊 Try it now: https://t.co/FBpr7zfQY6
0
0
2
@zhengyang_42
Zhengyang Tang
4 months
🚀 Thrilled to announce that our paper "SCRIT: Self-Evolving LLM Critique without Human or Stronger Models" was accepted to #COLM2025! We enable LLMs to self-improve critique abilities: zero human annotations, zero stronger models needed! 🔄✨ Looking forward to meeting…
@ZiniuLi
Ziniu Li
10 months
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data
1
3
8
@zhengyang_42
Zhengyang Tang
5 months
Happy to share that our paper "Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion" has been accepted to #ACL2025 as an oral & panel presentation (25 out of 3000 accepted papers = top 0.8%)! 🎉 🚀 We introduce AceGPT with Progressive Vocabulary Expansion…
arxiv.org
This paper addresses the critical need for democratizing large language models (LLM) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art...
0
0
1
@iScienceLuvr
Tanishq Mathew Abraham, Ph.D.
5 months
CoRT: Code-integrated Reasoning within Thinking
"This paper introduces CoRT, a post-training framework for teaching LRMs to leverage Code Interpreter effectively and efficiently."
"We manually create 30 high-quality samples, upon which we post-train models ranging from 1.5B to…"
3
18
126
@zhengyang_42
Zhengyang Tang
5 months
We’re excited to share our new paper “CoRT: Code-integrated Reasoning within Thinking”! 🤖 A post-training framework that teaches Large Reasoning Models (LRMs) to better leverage Code Interpreters for enhanced mathematical reasoning. 🔍 Key Highlights: Strategic hint…
1
3
23
@_akhaliq
AK
6 months
Learning from Peers in Reasoning Models
Large Reasoning Models often get stuck when they start reasoning incorrectly (the "Prefix Dominance Trap"). Proposes LeaP (Learning from Peers), a method where parallel reasoning paths share intermediate summaries so they can learn from each other.
5
26
106
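For context on the LeaP mechanism summarized in the tweet above, here is a minimal Python sketch of the idea as I read it: several reasoning paths run in parallel and periodically exchange short summaries, so a path stuck on a bad prefix can borrow its peers' progress. The generate() and summarize() callables are hypothetical stand-ins for LLM calls, not the paper's actual implementation.

```python
from typing import Callable, List

def leap_reasoning(
    question: str,
    generate: Callable[[str], str],   # hypothetical LLM call: context -> next reasoning chunk
    summarize: Callable[[str], str],  # hypothetical LLM call: path so far -> short summary
    num_paths: int = 4,
    num_rounds: int = 3,
) -> List[str]:
    """Sketch of peer learning across parallel reasoning paths (an assumption, not the paper's code)."""
    paths = [question for _ in range(num_paths)]
    for _ in range(num_rounds):
        # Each path extends its own chain of thought independently.
        paths = [p + "\n" + generate(p) for p in paths]
        # Each path writes a short intermediate summary for its peers.
        summaries = [summarize(p) for p in paths]
        # Peers' summaries are appended to every path's context, so a path that
        # started from a bad prefix can see how the others are progressing.
        for i in range(num_paths):
            peer_notes = "\n".join(s for j, s in enumerate(summaries) if j != i)
            paths[i] += "\nPeer summaries:\n" + peer_notes
    return paths
```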
@zhengyang_42
Zhengyang Tang
6 months
Super excited to have been part of the Qwen3 team! We just dropped our technical report - check it out if you're interested in what's under the hood. Hope it helps with your projects and research. Let us know what you think! #Qwen3 #AI
@Alibaba_Qwen
Qwen
6 months
Please check out our Qwen3 Technical Report. 👇🏻 https://t.co/gOkLBCAce6
0
1
3
@zhengyang_42
Zhengyang Tang
6 months
Thrilled to share our paper "ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling" has been accepted by Operations Research! 🎉 This is the FIRST LLM paper in the 70+ year history of this prestigious journal. Our framework improves modeling…
0
5
10
@ZiniuLi
Ziniu Li
10 months
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data
2
9
67
@zhengyang_42
Zhengyang Tang
10 months
📢 Introducing SCRIT: A framework enabling LLMs to self-evolve their critique abilities without human annotations or stronger models. 💡 Key features:
• Contrastive self-critic
• Mathematical validity check
• Zero external supervision
🔗 Paper: https://t.co/0kFnmFu74h
0
7
18
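As a rough illustration of the pipeline described in the bullet points above, here is a minimal sketch of how a SCRIT-style data loop could be wired together. The critique_with_reference() and check_validity() helpers and the data shapes are hypothetical placeholders, not the released code; the point is only that critiques are produced contrastively against the model's own reference solution and filtered by a validity check, with no external supervision.

```python
from typing import Callable, Dict, List

def build_critique_dataset(
    problems: List[Dict[str, str]],  # assumed shape: {"question", "reference", "candidate"}
    critique_with_reference: Callable[[str, str, str], str],  # hypothetical contrastive self-critic call
    check_validity: Callable[[str, str], bool],               # hypothetical mathematical validity check
) -> List[Dict[str, str]]:
    """Sketch of self-evolving critique data generation with zero external supervision (assumed API)."""
    dataset = []
    for item in problems:
        # Contrastive self-critic: critique the candidate solution while consulting a
        # reference solution produced by the same model, not a human or stronger model.
        critique = critique_with_reference(item["question"], item["reference"], item["candidate"])
        # Mathematical validity check: keep only critiques that survive the filter.
        if check_validity(item["question"], critique):
            dataset.append({"question": item["question"],
                            "candidate": item["candidate"],
                            "critique": critique})
    return dataset
```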
@qx_dong
Qingxiu Dong
1 year
OpenAI o1 scores 94.8% on the MATH dataset 😲 Then... how should we track and evaluate the next-gen LLMs' math skills? 👉 Omni-Math: a new, challenging benchmark with 4k competition-level problems, on which OpenAI o1-mini achieves only 60.54% accuracy. Paper: https://t.co/Qggc7paGwe
10
23
134
@zhengyang_42
Zhengyang Tang
1 year
🚀 Launching ORLM: the first open-source Operations Research LLM, powered by our OR-Instruct process! 🛠️ 🏆 ORLMs achieve SOTA on NL4OPT, MAMO, & the new IndustryOR benchmarks with different 7B backbones! 📄 Paper: https://t.co/Us5cPGbG5n 💻 Code: https://t.co/T1stsB2dAR
0
4
9
@_akhaliq
AK
2 years
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Large language models (LLMs) have demonstrated remarkable capabilities in problem-solving. However, their proficiency in solving mathematical problems remains inadequate.
3
30
114
@arankomatsuzaki
Aran Komatsuzaki
2 years
Microsoft presents GLAN (Generalized Instruction Tuning)
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
GLAN excels without using task-specific training data. https://t.co/RZBho2n5i5
8
56
317