
Yun Qu
@quyun52425662
Followers: 15
Following: 41
Media: 7
Statuses: 16
Ph.D. student at Tsinghua University
Joined March 2022
RT @dongxi_nlp: 「Prompt Difficulty Estimation, Tsinghua」. Can Prompt Difficulty be Online Predicted for Accelerating RL Finetuning of Reas….
RT @AlbertW24045555: #LargeReasoningModel Can prompt difficulty be predicted online to accelerate RL finetuning of Reasoning Models? YES!….
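RT @AlbertW24045555: @dongxi_nlp Brother Ma, our Tsinghua research group's earlier work, model predictive task sampling, already proposed this idea. Feel free to follow it; we have also applied the idea to reinforcement learning, see ICML202….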
arxiv.org
Task robust adaptation is a long-standing pursuit in sequential decision-making. Some risk-averse strategies, e.g., the conditional value-at-risk principle, are incorporated in domain...
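RT @dongxi_nlp: Work from the Tsinghua intelligent decision-making research group: Model Predictive Task Sampling for Efficient and Robust Adaptation. A lightweight framework that applies “risk modeling + active sampling” to cross-task adaptation. Real-world systems (robots, ….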
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments. arXiv:
arxiv.org
🥳 Just accepted at #ICML2025: PDTS for fast and robust decision-making! 💪 🔓 Unlocks the potential of robust active task sampling. 🎯 Boosts zero-shot & few-shot adaptation robustness. ⚡️ Plug-and-play, low-cost. 🚀 Accelerates learning. Project website 👉
RT @AlbertW24045555: Not Limited to #DeepSeek: Two years after finishing my PhD at VAE's birthplace, AMLab, I'm thrilled to share a VAE-ins….
RT @AlbertW24045555: #AAAI2025 #LLM4RL Sparse feedback and reward design are lasting challenges in the RL field. Will large models help add….
RT @AlbertW24045555: #MultitaskLearning. Feel free to access the latest SOTA method in Multi-task Optimization. In this work, "GO4Align: gro….