Zhengyang Tang
@zhengyang_42
Followers 77 · Following 220 · Media 4 · Statuses 18
PhD candidate @cuhksz, Intern @Alibaba_Qwen. Prev: @MSFTResearch, @TencentGlobal, @AlibabaGroup.
Shanghai, China
Joined July 2016
The paper shows that a small model can get great at optimization by fixing its own reasoning with tiny hints. It shows that careful data curation, not just large amounts of data, can make a big difference in how well the model learns to reason and code accurately. A 4B model rivals a 671B
7
42
211
So proud of what the Qwen team is shipping! This demo perfectly showcases the efficient tool-use we focused on in our CoRT paper. And the timing couldn't be better: thrilled to announce that CoRT has been accepted to NeurIPS 2025! It's a privilege to be part of this journey from
arxiv.org
Large Reasoning Models (LRMs) like o1 and DeepSeek-R1 have shown remarkable progress in natural language reasoning with long chain-of-thought (CoT), yet they remain inefficient or inaccurate when...
🚀With Code Interpreter + Web Search, Qwen Chat can now fetch data AND visualize it in charts — instantly. Need a 7-day weather trend? Done. 🌡️📊 Try it now: https://t.co/FBpr7zfQY6
0
0
2
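For readers curious what that Code Interpreter step looks like in practice, here is a minimal sketch of the kind of script such a tool call might run once the web-search step has returned a forecast. The temperature values, labels, and output file name are placeholders, not output from Qwen Chat.

```python
# Minimal sketch: chart a 7-day temperature trend the way a code-interpreter
# tool call might after a web search returns forecast data. The numbers below
# are placeholder values standing in for fetched results.
import matplotlib.pyplot as plt

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
highs_c = [31, 32, 30, 29, 28, 30, 33]  # placeholder forecast highs in °C

plt.plot(days, highs_c, marker="o")
plt.title("7-day high temperature trend (placeholder data)")
plt.xlabel("Day")
plt.ylabel("High (°C)")
plt.tight_layout()
plt.savefig("weather_trend.png")
```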
🚀 Thrilled to announce that our paper "SCRIT: Self-Evolving LLM Critique without Human or Stronger Models" was accepted to #COLM2025! We enable LLMs to self-improve critique abilities — zero human annotations, zero stronger models needed! 🔄✨ Looking forward to meeting
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data
1
3
8
Happy to share that our paper "Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion" has been accepted to #ACL2025 as oral & panel presentation (25 out of 3000 accepted papers = top 0.8%)! 🎉 🚀 We introduce AceGPT with Progressive Vocabulary
arxiv.org
This paper addresses the critical need for democratizing large language models (LLM) in the Arab world, a region that has seen slower progress in developing models comparable to state-of-the-art...
0
0
1
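A hedged sketch of the idea named in the title, progressive vocabulary expansion: introduce new Arabic tokens in stages across training rather than all at once, so embeddings for new tokens are learned gradually. The staging rule and the token lists below are illustrative assumptions, not the paper's actual schedule.

```python
# Sketch of a staged vocabulary schedule (assumed staging rule, toy tokens).
def vocabulary_schedule(base_vocab, new_tokens, n_stages):
    """Yield the vocabulary to use at each training stage."""
    step = max(1, len(new_tokens) // n_stages)
    vocab = list(base_vocab)
    for stage in range(n_stages):
        vocab = vocab + new_tokens[stage * step:(stage + 1) * step]
        yield stage, vocab

base = ["hello", "world"]                  # stand-in for an existing vocabulary
arabic = ["مرحبا", "عالم", "كتاب", "قلم"]    # stand-in for newly added Arabic tokens
for stage, vocab in vocabulary_schedule(base, arabic, n_stages=2):
    print(stage, len(vocab))
```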
CoRT: Code-integrated Reasoning within Thinking
"This paper introduces CoRT, a post-training framework for teaching LRMs to leverage Code Interpreter effectively and efficiently."
"We manually create 30 high-quality samples, upon which we post-train models ranging from 1.5B to
3
18
126
We’re excited to share our new paper “CoRT: Code-integrated Reasoning within Thinking”! 🤖 A post-training framework that teaches Large Reasoning Models (LRMs) to better leverage Code Interpreters for enhanced mathematical reasoning. 🔍 Key Highlights: Strategic hint
1
3
23
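Both CoRT tweets above mention manually curated samples and strategic hints that steer a model toward the Code Interpreter. The sketch below shows one way such hint injection could look; the hint wording and the trigger heuristic are assumptions made for illustration, not the paper's recipe.

```python
# Rough sketch of hint injection for code-integrated reasoning: insert a short
# hint into a reasoning trace at the point where the model starts a manual
# calculation, nudging it to call the code interpreter instead. The hint text
# and the trigger heuristic are illustrative assumptions.

HINT = " Wait, this is error-prone by hand; I should verify it with the code interpreter."

def inject_code_hint(reasoning_trace: str, trigger: str = "Let me compute") -> str:
    """Insert HINT right after the sentence where manual computation begins."""
    idx = reasoning_trace.find(trigger)
    if idx == -1:
        return reasoning_trace            # no manual-computation step found
    end = reasoning_trace.find(".", idx)  # end of the triggering sentence
    end = end + 1 if end != -1 else len(reasoning_trace)
    return reasoning_trace[:end] + HINT + reasoning_trace[end:]

trace = "We need the sum of the first 100 squares. Let me compute each term and add them by hand."
print(inject_code_hint(trace))
```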
Learning from Peers in Reasoning Models
Large Reasoning Models often get stuck when they start reasoning incorrectly (the "Prefix Dominance Trap"). The authors propose LeaP (Learning from Peers), a method where parallel reasoning paths share intermediate summaries to learn from each other
5
26
106
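A minimal sketch of the peer-sharing loop the tweet describes: parallel paths reason independently, summarize where they stand, and receive each other's summaries before continuing. `generate` and `summarize` are stand-ins for model calls, and the broadcast-to-all routing is an assumption rather than the paper's exact scheme.

```python
# Sketch of one round of peer summary sharing across parallel reasoning paths.
from typing import Callable, List

def leap_round(prefixes: List[str],
               generate: Callable[[str], str],
               summarize: Callable[[str], str]) -> List[str]:
    # 1. Each path extends its own reasoning independently.
    extended = [p + generate(p) for p in prefixes]
    # 2. Each path writes a short summary of its current state.
    summaries = [summarize(p) for p in extended]
    # 3. Each path receives its peers' summaries before the next round,
    #    giving a path stuck on a bad prefix a chance to borrow a better framing.
    return [
        p + "\n[Peer summaries]\n" + "\n".join(s for j, s in enumerate(summaries) if j != i)
        for i, p in enumerate(extended)
    ]

# Toy stand-ins so the sketch runs end to end.
paths = ["Path A: try algebraic manipulation.", "Path B: try a counting argument."]
paths = leap_round(paths,
                   generate=lambda p: " ...next step...",
                   summarize=lambda p: p[:40])
print(paths[0])
```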
Thrilled to share our paper "ORLM: A Customizable Framework in Training Large Models for Automated Optimization Modeling" has been accepted by Operations Research! 🎉 This is the FIRST LLM paper in the 70+ year history of this prestigious journal. Our framework improves modeling
0
5
10
🚀 Critique abilities are key for scaling LLMs, but current open-source models fall short. We introduce SCRIT: a framework with scalable oversight that enables LLMs to self-improve their critique skills✨ We’ve built a pipeline to generate high-quality synthetic critique data
2
9
67
📢 Introducing SCRIT: A framework enabling LLMs to self-evolve their critique abilities without human annotations or stronger models. 💡 Key features: • Contrastive self-critic • Mathematical validity check • Zero external supervision 🔗 Paper: https://t.co/0kFnmFu74h
0
7
18
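A hedged sketch of a SCRIT-style pipeline based only on the features listed in the tweet: a contrastive self-critic that sees a reference solution while judging a candidate, plus a validity check that keeps only critiques whose verdict matches ground truth. The prompt wording, verdict parsing, and filtering rule are assumptions, not the paper's exact recipe.

```python
# Sketch of one synthetic-critique generation step with a validity filter.
from typing import Callable, Optional

def make_critique_sample(problem: str,
                         reference_solution: str,
                         candidate_solution: str,
                         candidate_is_correct: bool,
                         critic: Callable[[str], str]) -> Optional[dict]:
    prompt = (
        f"Problem:\n{problem}\n\n"
        f"Reference solution (for the critic's eyes only):\n{reference_solution}\n\n"
        f"Candidate solution to critique:\n{candidate_solution}\n\n"
        "Point out any errors, then end with VERDICT: correct or VERDICT: incorrect."
    )
    critique = critic(prompt)
    predicted_correct = "VERDICT: correct" in critique   # simplistic verdict parse
    if predicted_correct != candidate_is_correct:
        return None  # validity check failed: discard the critique
    # The kept training sample omits the reference, so the trained critic must
    # judge candidates on its own.
    return {"problem": problem, "candidate": candidate_solution, "critique": critique}

# Toy usage with a dummy critic so the sketch runs end to end.
sample = make_critique_sample(
    "What is 2 + 2?", "2 + 2 = 4.", "2 + 2 = 5.", candidate_is_correct=False,
    critic=lambda prompt: "The candidate adds incorrectly. VERDICT: incorrect")
print(sample is not None)  # True: the verdict agrees with ground truth
```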
OpenAI o1 scores 94.8% on the MATH dataset 😲 Then... how should we proceed to track and evaluate the next-gen LLMs' math skills? 👉 Omni-Math: a new, challenging benchmark with 4k competition-level problems, where OpenAI o1-mini achieves only 60.54% accuracy. Paper: https://t.co/Qggc7paGwe
10
23
134
🚀 Launching ORLM: the first open-source Operations Research LLM, powered by our OR-Instruct process! 🛠️ 🏆 ORLMs achieve SOTA on NL4OPT, MAMO, & the new IndustryOR benchmarks based on different 7B backbones! 📄 Paper: https://t.co/Us5cPGbG5n 💻 Code: https://t.co/T1stsB2dAR
0
4
9
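To make the target task concrete: ORLM translates a natural-language operations-research request into a solvable optimization model. The toy instance below is made up for illustration and simply shows the kind of formulation such a model would need to produce, solved here with SciPy.

```python
# Toy example: "maximize profit 3x + 2y subject to x + y <= 4 and x + 3y <= 6".
from scipy.optimize import linprog

# linprog minimizes, so negate the profit coefficients to maximize 3x + 2y.
c = [-3, -2]
A_ub = [[1, 1],   # x +  y <= 4
        [1, 3]]   # x + 3y <= 6
b_ub = [4, 6]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)], method="highs")
print("optimal x, y:", res.x, "max profit:", -res.fun)
```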
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Large language models (LLMs) have demonstrated remarkable capabilities in problem-solving. However, their proficiency in solving mathematical problems remains inadequate.
3
30
114
Microsoft presents GLAN (Generalized Instruction Tuning)
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
GLAN excels without using task-specific training data
https://t.co/RZBho2n5i5
8
56
317
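A very rough sketch of taxonomy-driven synthetic instruction generation in the spirit of GLAN's "from scratch" recipe: enumerate a subject taxonomy and query a teacher model for exercises at each leaf. The taxonomy, prompt wording, and `teacher` callable are illustrative assumptions, not the paper's pipeline.

```python
# Sketch: walk a small subject taxonomy and generate one instruction per leaf.
from typing import Callable, Dict, List

taxonomy: Dict[str, List[str]] = {
    "Mathematics": ["Linear algebra", "Probability"],
    "Computer Science": ["Data structures", "Operating systems"],
}

def generate_instructions(teacher: Callable[[str], str]) -> List[str]:
    samples = []
    for field, subfields in taxonomy.items():
        for sub in subfields:
            prompt = f"Write one exam-style question for a course on {sub} ({field})."
            samples.append(teacher(prompt))
    return samples

# Dummy teacher so the sketch runs without a model behind it.
print(generate_instructions(lambda p: f"[generated for] {p}")[:2])
```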
DPTDR: Deep Prompt Tuning for Dense Passage Retrieval https://t.co/v42EJEeUvP by @zhengyang_42 et al. including @wabyking
#NaturalLanguageProcessing #Computation
deepai.org
08/24/22 - Deep prompt tuning (DPT) has gained great success in most natural language processing (NLP) tasks. However, it is not well-invest...
0
2
8
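A simplified sketch of the prompt-tuning idea behind DPTDR: keep the retrieval encoder frozen and train only a small set of prompt vectors prepended to its input, scoring query-passage pairs by dot product. Note that deep prompt tuning injects prompts at every layer; this sketch prepends them only at the input, and the generic encoder, dimensions, and pooling choice are assumptions rather than the paper's configuration.

```python
# Sketch: frozen encoder + trainable prompt vectors for dense retrieval.
import torch
import torch.nn as nn

class PromptTunedEncoder(nn.Module):
    def __init__(self, d_model=256, n_prompts=16, n_layers=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        for p in self.encoder.parameters():
            p.requires_grad = False          # backbone stays frozen
        self.prompts = nn.Parameter(torch.randn(n_prompts, d_model) * 0.02)  # trained

    def forward(self, token_embeddings):     # (batch, seq, d_model)
        batch = token_embeddings.size(0)
        prompts = self.prompts.unsqueeze(0).expand(batch, -1, -1)
        hidden = self.encoder(torch.cat([prompts, token_embeddings], dim=1))
        return hidden[:, 0]                  # pool at the first prompt position

# Relevance between a query and a passage is a simple dot product of embeddings.
enc = PromptTunedEncoder()
q, p = torch.randn(1, 12, 256), torch.randn(1, 40, 256)
score = (enc(q) * enc(p)).sum(-1)
print(score.shape)
```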