Zhengyang Tang
@zhengyang_42
Followers 77 · Following 220 · Media 4 · Statuses 18
PhD candidate @cuhksz, Intern @Alibaba_Qwen. Prev: @MSFTResearch, @TencentGlobal, @AlibabaGroup.
Shanghai, China
Joined July 2016
The paper shows that a small model can get great at optimization by fixing its own reasoning with tiny hints. It shows that careful data curation, not just large amounts of data, can make a big difference in how well the model learns to reason and code accurately. A 4B model rivals a 671B
7
42
211
So proud of what the Qwen team is shipping! This demo perfectly showcases the efficient tool-use we focused on in our CoRT paper. And the timing couldn't be better: thrilled to announce that CoRT has been accepted to NeurIPS 2025! It's a privilege to be part of this journey from
arxiv.org
Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language reasoning with long chain-of-thought (CoT), yet they remain inefficient or inaccurate when...
🚀With Code Interpreter + Web Search, Qwen Chat can now fetch data AND visualize it in charts — instantly. Need a 7-day weather trend? Done. 🌡️📊 Try it now: https://t.co/FBpr7zfQY6
0
0
2
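For readers curious what that Code Interpreter step looks like in practice, here is a minimal sketch of the kind of script such a tool call might run once the web-search step has returned a forecast. The temperature values, labels, and output file name are placeholders, not output from Qwen Chat.

```python
# Minimal sketch: chart a 7-day temperature trend the way a code-interpreter
# tool call might after a web search returns forecast data. The numbers below
# are placeholder values standing in for fetched results.
import matplotlib.pyplot as plt

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
highs_c = [31, 32, 30, 29, 28, 30, 33]  # placeholder forecast highs in °C

plt.plot(days, highs_c, marker="o")
plt.title("7-day high temperature trend (placeholder data)")
plt.xlabel("Day")
plt.ylabel("High (°C)")
plt.tight_layout()
plt.savefig("weather_trend.png")
```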
🚀 Thrilled to announce that our paper "SCRIT: Self-Evolving LLM Critique without Human or Stronger Models" was accepted to #COLM2025! We enable LLMs to self-improve critique abilities — zero human annotations, zero stronger models needed! 🔄✨ Looking forward to meeting
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data
1
3
8
Happy to share that our paper "Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion" has been accepted to #ACL2025 as oral & panel presentation (25 out of 3000 accepted papers = top 0.8%)! 🎉 🚀 We introduce AceGPT with Progressive Vocabulary
arxiv.org
This paper addresses the critical need for democratizing large language models (LLM) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art...
0
0
1
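A hedged sketch of the idea named in the title, progressive vocabulary expansion: introduce new Arabic tokens in stages across training rather than all at once, so embeddings for new tokens are learned gradually. The staging rule and the token lists below are illustrative assumptions, not the paper's actual schedule.

```python
# Sketch of a staged vocabulary schedule (assumed staging rule, toy tokens).
def vocabulary_schedule(base_vocab, new_tokens, n_stages):
    """Yield the vocabulary to use at each training stage."""
    step = max(1, len(new_tokens) // n_stages)
    vocab = list(base_vocab)
    for stage in range(n_stages):
        vocab = vocab + new_tokens[stage * step:(stage + 1) * step]
        yield stage, vocab

base = ["hello", "world"]                  # stand-in for an existing vocabulary
arabic = ["مرحبا", "عالم", "كتاب", "قلم"]    # stand-in for newly added Arabic tokens
for stage, vocab in vocabulary_schedule(base, arabic, n_stages=2):
    print(stage, len(vocab))
```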
CoRT: Code-integrated Reasoning within Thinking
"This paper introduces CoRT, a post-training framework for teaching LRMs to leverage Code Interpreter effectively and efficiently."
"We manually create 30 high-quality samples, upon which we post-train models ranging from 1.5B to
3
18
126
We’re excited to share our new paper “CoRT: Code-integrated Reasoning within Thinking”! 🤖 A post-training framework that teaches Large Reasoning Models (LRMs) to better leverage Code Interpreters for enhanced mathematical reasoning. 🔍 Key Highlights: Strategic hint
1
3
23
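Both CoRT tweets above mention manually curated samples and strategic hints that steer a model toward the Code Interpreter. The sketch below shows one way such hint injection could look; the hint wording and the trigger heuristic are assumptions made for illustration, not the paper's recipe.

```python
# Rough sketch of hint injection for code-integrated reasoning: insert a short
# hint into a reasoning trace at the point where the model starts a manual
# calculation, nudging it to call the code interpreter instead. The hint text
# and the trigger heuristic are illustrative assumptions.

HINT = " Wait, this is error-prone by hand; I should verify it with the code interpreter."

def inject_code_hint(reasoning_trace: str, trigger: str = "Let me compute") -> str:
    """Insert HINT right after the sentence where manual computation begins."""
    idx = reasoning_trace.find(trigger)
    if idx == -1:
        return reasoning_trace            # no manual-computation step found
    end = reasoning_trace.find(".", idx)  # end of the triggering sentence
    end = end + 1 if end != -1 else len(reasoning_trace)
    return reasoning_trace[:end] + HINT + reasoning_trace[end:]

trace = "We need the sum of the first 100 squares. Let me compute each term and add them by hand."
print(inject_code_hint(trace))
```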
Learning from Peers in Reasoning Models
Large Reasoning Models often get stuck when they start reasoning incorrectly (the "Prefix Dominance Trap"). The authors propose LeaP (Learning from Peers), a method where parallel reasoning paths share intermediate summaries to learn from each other
5
26
106
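A minimal sketch of the peer-sharing loop the tweet describes: parallel paths reason independently, summarize where they stand, and receive each other's summaries before continuing. `generate` and `summarize` are stand-ins for model calls, and the broadcast-to-all routing is an assumption rather than the paper's exact scheme.

```python
# Sketch of one round of peer summary sharing across parallel reasoning paths.
from typing import Callable, List

def leap_round(prefixes: List[str],
               generate: Callable[[str], str],
               summarize: Callable[[str], str]) -> List[str]:
    # 1. Each path extends its own reasoning independently.
    extended = [p + generate(p) for p in prefixes]
    # 2. Each path writes a short summary of its current state.
    summaries = [summarize(p) for p in extended]
    # 3. Each path receives its peers' summaries before the next round,
    #    giving a path stuck on a bad prefix a chance to borrow a better framing.
    return [
        p + "\n[Peer summaries]\n" + "\n".join(s for j, s in enumerate(summaries) if j != i)
        for i, p in enumerate(extended)
    ]

# Toy stand-ins so the sketch runs end to end.
paths = ["Path A: try algebraic manipulation.", "Path B: try a counting argument."]
paths = leap_round(paths,
                   generate=lambda p: " ...next step...",
                   summarize=lambda p: p[:40])
print(paths[0])
```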
Thrilled to share our paper "ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling" has been accepted by Operations Research! 🎉 This is the FIRST LLM paper in the 70+ year history of this prestigious journal. Our framework improves modeling
0
5
10
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data
2
9
67
📢 Introducing SCRIT: A framework enabling LLMs to self-evolve their critique abilities without human annotations or stronger models. 💡 Key features: • Contrastive self-critic • Mathematical validity check • Zero external supervision 🔗 Paper: https://t.co/0kFnmFu74h
0
7
18
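A hedged sketch of a SCRIT-style pipeline based only on the features listed in the tweet: a contrastive self-critic that sees a reference solution while judging a candidate, plus a validity check that keeps only critiques whose verdict matches ground truth. The prompt wording, verdict parsing, and filtering rule are assumptions, not the paper's exact recipe.

```python
# Sketch of one synthetic-critique generation step with a validity filter.
from typing import Callable, Optional

def make_critique_sample(problem: str,
                         reference_solution: str,
                         candidate_solution: str,
                         candidate_is_correct: bool,
                         critic: Callable[[str], str]) -> Optional[dict]:
    prompt = (
        f"Problem:\n{problem}\n\n"
        f"Reference solution (for the critic's eyes only):\n{reference_solution}\n\n"
        f"Candidate solution to critique:\n{candidate_solution}\n\n"
        "Point out any errors, then end with VERDICT: correct or VERDICT: incorrect."
    )
    critique = critic(prompt)
    predicted_correct = "VERDICT: correct" in critique   # simplistic verdict parse
    if predicted_correct != candidate_is_correct:
        return None  # validity check failed: discard the critique
    # The kept training sample omits the reference, so the trained critic must
    # judge candidates on its own.
    return {"problem": problem, "candidate": candidate_solution, "critique": critique}

# Toy usage with a dummy critic so the sketch runs end to end.
sample = make_critique_sample(
    "What is 2 + 2?", "2 + 2 = 4.", "2 + 2 = 5.", candidate_is_correct=False,
    critic=lambda prompt: "The candidate adds incorrectly. VERDICT: incorrect")
print(sample is not None)  # True: the verdict agrees with ground truth
```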
OpenAI o1 scores 94.8% on the MATH dataset 😲 Then... how should we proceed to track and evaluate the next-gen LLMs' math skills? 👉 Omni-Math: a new, challenging benchmark with 4k competition-level problems, where OpenAI o1-mini achieves only 60.54% accuracy. Paper: https://t.co/Qggc7paGwe
10
23
134
🚀 Launching ORLM: the first open-source Operations Research LLM, powered by our OR-Instruct process! 🛠️ 🏆 ORLMs achieve SOTA on NL4OPT, MAMO, & the new IndustryOR benchmarks based on different 7B backbones! 📄 Paper: https://t.co/Us5cPGbG5n 💻 Code: https://t.co/T1stsB2dAR
0
4
9
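To make the target task concrete: ORLM translates a natural-language operations-research request into a solvable optimization model. The toy instance below is made up for illustration and simply shows the kind of formulation such a model would need to produce, solved here with SciPy.

```python
# Toy example: "maximize profit 3x + 2y subject to x + y <= 4 and x + 3y <= 6".
from scipy.optimize import linprog

# linprog minimizes, so negate the profit coefficients to maximize 3x + 2y.
c = [-3, -2]
A_ub = [[1, 1],   # x +  y <= 4
        [1, 3]]   # x + 3y <= 6
b_ub = [4, 6]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print("optimal x, y:", res.x, "max profit:", -res.fun)
```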
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Large language models (LLMs) have demonstrated remarkable capabilities in problem-solving. However, their proficiency in solving mathematical problems remains inadequate.
3
30
114
Microsoft presents GLAN (Generalized Instruction Tuning)
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
GLAN excels without using task-specific training data
https://t.co/RZBho2n5i5
8
56
317
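A very rough sketch of taxonomy-driven synthetic instruction generation in the spirit of GLAN's "from scratch" recipe: enumerate a subject taxonomy and query a teacher model for exercises at each leaf. The taxonomy, prompt wording, and `teacher` callable are illustrative assumptions, not the paper's pipeline.

```python
# Sketch: walk a small subject taxonomy and generate one instruction per leaf.
from typing import Callable, Dict, List

taxonomy: Dict[str, List[str]] = {
    "Mathematics": ["Linear algebra", "Probability"],
    "Computer Science": ["Data structures", "Operating systems"],
}

def generate_instructions(teacher: Callable[[str], str]) -> List[str]:
    samples = []
    for field, subfields in taxonomy.items():
        for sub in subfields:
            prompt = f"Write one exam-style question for a course on {sub} ({field})."
            samples.append(teacher(prompt))
    return samples

# Dummy teacher so the sketch runs without a model behind it.
print(generate_instructions(lambda p: f"[generated for] {p}")[:2])
```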
DPTDR: Deep Prompt Tuning for Dense Passage Retrieval https://t.co/v42EJEeUvP by @zhengyang_42 et al. including @wabyking
#NaturalLanguageProcessing #Computation
deepai.org
08/24/22 - Deep prompt tuning (DPT) has gained great success in most natural language processing (NLP) tasks. However, it is not well-invest...
0
2
8
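A simplified sketch of the prompt-tuning idea behind DPTDR: keep the retrieval encoder frozen and train only a small set of prompt vectors prepended to its input, scoring query-passage pairs by dot product. Note that deep prompt tuning injects prompts at every layer; this sketch prepends them only at the input, and the generic encoder, dimensions, and pooling choice are assumptions rather than the paper's configuration.

```python
# Sketch: frozen encoder + trainable prompt vectors for dense retrieval.
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    def __init__(self, d_model=256, n_prompts=16, n_layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        for p in self.encoder.parameters():
            p.requires_grad = False          # backbone stays frozen
        self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)  # trained

    def forward(self, token_embeddings):     # (batch, seq, d_model)
        batch = token_embeddings.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        hidden = self.encoder(torch.cat([prompts, token_embeddings], dim=1))
        return hidden[:, 0]                  # pool at the first prompt position

# Relevance between a query and a passage is a simple dot product of embeddings.
enc = PromptTunedEncoder()
q, p = torch.randn(1, 12, 256), torch.randn(1, 40, 256)
score = (enc(q) * enc(p)).sum(-1)
print(score.shape)
```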