fly51fly

@fly51fly

Followers
8K
Following
77K
Media
11K
Statuses
25K

BUPT prof | Sharing latest AI papers & insights | Join me in embracing the AI revolution! #MachineLearning #AI #Innovation

Joined February 2009
@fly51fly
fly51fly
16 minutes
[LG] Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning. A Golubev, M Trofimova, S Polezhaev, I Badertdinov. [Nebius AI] (2025).
1
0
1
@fly51fly
fly51fly
31 minutes
[LG] Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction. Y Lin, S Tang, B Lyu, Z Yang. [Princeton University & Tsinghua University] (2025).
0
0
1
@fly51fly
fly51fly
41 minutes
[CL] CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward. S Liu, H Liu, J Liu, L Xiao. [Shanghai AI Laboratory] (2025).
0
1
2
@fly51fly
fly51fly
57 minutes
[LG] Perch 2.0: The Bittern Lesson for Bioacoustics. B v Merriënboer, V Dumoulin, J Hamer, L Harrell. [Google DeepMind] (2025).
0
2
1
@fly51fly
fly51fly
1 hour
[CL] A comprehensive taxonomy of hallucinations in Large Language Models. M Cossio [Universitat de Barcelona] (2025).
0
1
3
@fly51fly
fly51fly
1 day
[LG] Decomposing Representation Space into Interpretable Subspaces with Unsupervised Learning. X Huang, M Hahn [Saarland University] (2025).
1
6
19
@fly51fly
fly51fly
1 day
[CL] R-Zero: Self-Evolving Reasoning LLM from Zero Data. C Huang, W Yu, X Wang, H Zhang. [Tencent AI Seattle Lab] (2025).
0
3
9
@fly51fly
fly51fly
1 day
[CL] Multi-module GRPO: Composing Policy Gradients and Prompt Optimization for Language Model Programs. N Ziems, D Soylu, L A Agrawal, I Miller. [University of Notre Dame & Stanford University & UC Berkeley] (2025).
0
5
14
@fly51fly
fly51fly
1 day
[CL] Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference. Y Song, Z Zhang, C Luo, P Gao. [ByteDance Seed & Tsinghua University] (2025).
0
6
19
@fly51fly
fly51fly
1 day
[LG] Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens. C Zhao, Z Tan, P Ma, D Li. [Arizona State University] (2025).
0
3
27
@fly51fly
fly51fly
2 days
[CL] Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression. J Huang, B Lin, G Feng, J Chen. [Peking University & The Hong Kong University of Science and Technology] (2025).
2
2
8
@fly51fly
fly51fly
2 days
[LG] GRAIL: Learning to Interact with Large Knowledge Graphs for Retrieval Augmented Reasoning. G Chang, J Su, J Liu, P Yang. [Tsinghua University] (2025).
0
3
10
@fly51fly
fly51fly
2 days
[LG] On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification. Y Wu, Y Zhou, Z Ziheng, Y Peng. [Southeast University & University of California, Los Angeles] (2025).
1
15
85
@fly51fly
fly51fly
2 days
[CL] MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy. S Zhan, Y Lai, Z Lu, D Lin. [Tsinghua University] (2025).
0
2
13
@fly51fly
fly51fly
2 days
[CL] Learning to Reason for Factuality. X Chen, I Kulikov, V Berges, B Oğuz. [FAIR at Meta] (2025).
0
4
27
@fly51fly
fly51fly
3 days
[LG] OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use.
0
5
16
@fly51fly
fly51fly
3 days
[CL] Sotopia-RL: Reward Design for Social Intelligence. H Yu, Z Qi, Y Zhao, K Nottingham. [University of Illinois Urbana-Champaign & CMU] (2025).
0
1
11
@fly51fly
fly51fly
3 days
[CL] CoAct-1: Computer-using Agents with Coding as Actions. L Song, Y Dai, V Prabhu, J Zhang. [University of Southern California & Salesforce Research] (2025).
0
1
5
@fly51fly
fly51fly
3 days
[AS] Live Music Models. L T A Caillon, B McWilliams, C Tarakajian, I Simon. [Google DeepMind] (2025).
0
4
8
@fly51fly
fly51fly
3 days
[CL] Sculptor: Empowering LLMs with Cognitive Agency via Active Context Management. M Li, L H Xu, Q Tan, T Cao. [Tsinghua University] (2025).
0
4
10