Xinyu Yang Profile
Xinyu Yang

@Xinyu2ML

Followers: 1K · Following: 1K · Media: 32 · Statuses: 412

Ph.D. @CarnegieMellon. Working on agentic foundation model systems. Founder of the FM-Wild workshop series and the ASAP seminar series. They/Them

Pittsburgh, US
Joined December 2022
@Xinyu2ML
Xinyu Yang
5 months
🚀 Super excited to share Multiverse! 🏃 It’s been a long journey exploring the space between model design and hardware efficiency. What excites me most is realizing that, beyond optimizing existing models, we can discover better model architectures by embracing system-level
@InfiniAILab
Infini-AI-Lab
5 months
🔥 We introduce Multiverse, a new generative modeling framework for adaptive and lossless parallel generation. 🚀 Multiverse is the first open-source non-AR model to achieve AIME24 and AIME25 scores of 54% and 46% 🌐 Website: https://t.co/J9osByhWUf 🧵 1/n
3
19
72
@InfiniAILab
Infini-AI-Lab
20 days
🚀 If your code agent generates a patch that passes all tests, should you trust it to merge automatically? ⚠️ You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
2
9
21
@rohanpaul_ai
Rohan Paul
7 days
🏗️ Hardware memory bandwidth is becoming the choke point slowing down GenAI. During 2018–2022, transformer model size grew ~410× every 2 years, while memory per accelerator grew only ~2× every 2 years. That mismatch shoves us into a “memory wall.” The "memory wall" is
31
108
495
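The mismatch in the tweet compounds fast. A back-of-the-envelope calculation using only the growth rates quoted above (two 2-year periods from 2018 to 2022):

```python
# Back-of-the-envelope: the growth-rate mismatch behind the "memory wall".
# Rates are the ones quoted in the tweet: model size ~410x per 2 years,
# accelerator memory ~2x per 2 years, over 2018-2022 (two 2-year periods).

model_growth_per_period = 410
memory_growth_per_period = 2
periods = 2  # 2018 -> 2022

model_factor = model_growth_per_period ** periods    # 168,100x total
memory_factor = memory_growth_per_period ** periods  # 4x total
gap = model_factor / memory_factor                   # 42,025x shortfall

print(f"Model size grew ~{model_factor:,}x, memory only ~{memory_factor}x")
print(f"Per-accelerator memory fell behind by ~{gap:,.0f}x over 4 years")
```

Even if the 410× figure is rough, the point survives: the gap is multiplicative per period, so capacity and bandwidth per accelerator cannot keep up without system-level changes.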
@BeidiChen
Beidi Chen
5 days
😲
@Kimi_Moonshot
Kimi.ai
6 days
🚀 Hello, Kimi K2 Thinking! The Open-Source Thinking Agent Model is here. 🔹 SOTA on HLE (44.9%) and BrowseComp (60.2%) 🔹 Executes up to 200–300 sequential tool calls without human interference 🔹 Excels in reasoning, agentic search, and coding 🔹 256K context window Built
0
2
48
@Xinyu2ML
Xinyu Yang
6 days
A rare chance for a tech person to learn art, haha. Honored to join this panel at #AMIF2025. 🎶🤖
@EvelyneXSLi
Xiaosha Evelyne Li
6 days
Excited to announce our upcoming panel — “AI + Music: Empowering, Not Overpowering” at the Asian Music Industry Festival (AMIF) in Boston, Nov 15–16. AI academia <-+-> music industry - how AI can elevate creativity without replacing human artistry. We’re honored to feature: -
2
0
6
@xxunhuang
Xun Huang
7 days
We present MotionStream — real-time, long-duration video generation that you can interactively control just by dragging your mouse. All videos here are raw, real-time screen captures without any post-processing. Model runs on a single H100 at 29 FPS and 0.4s latency.
37
149
1K
@chenxiao_yang_
Chenxiao Yang
8 days
How powerful are Diffusion LLMs? Can they solve problems that Auto-Regressive (AR) LLMs can’t solve? Check our new paper "On Powerful Ways to Generate: Autoregression, Diffusion, and Beyond" 🔗 https://t.co/aiGTbXMWFE In this work, we show that while Diffusion LLMs are indeed more
16
77
371
@RidgerZhu
Rui-Jie (Ridger) Zhu
13 days
Thrilled to release our new paper: “Scaling Latent Reasoning via Looped Language Models.” TL;DR: We scale looped language models to 2.6 billion parameters, pretrained on >7 trillion tokens. The resulting model is on par with SOTA language models of 2–3× the size.
20
137
627
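The paper's architecture details aren't in the tweet; as background, the core idea of a "looped" model is to reuse one shared block for multiple iterations, so effective compute/depth scales with the loop count while the parameter count stays fixed. A toy numpy sketch (the block here is just an affine map plus tanh, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

# One SHARED "block": a fixed affine map + nonlinearity stands in for a
# transformer block. The weights below are the only parameters.
W = rng.normal(scale=0.3, size=(d, d))
b = rng.normal(scale=0.1, size=d)

def block(h):
    return np.tanh(h @ W + b)

def looped_forward(h, n_loops):
    # Apply the SAME weights n_loops times: more iterations buy more
    # sequential computation without adding any parameters.
    for _ in range(n_loops):
        h = block(h)
    return h

h0 = rng.normal(size=d)
shallow = looped_forward(h0, 1)  # one pass
deep = looped_forward(h0, 8)     # eight passes, same parameter count
print(shallow.shape, deep.shape)
```

A real looped LM would loop transformer layers over token sequences; this only illustrates the weight-tying-across-depth idea.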
@yueqi_song
Yueqi Song @ EMNLP2025
14 days
We just built and released the largest dataset for supervised fine-tuning of agentic LMs: 1.27M trajectories (~36B tokens)! Up until now, large-scale SFT for agents has been rare - not for lack of data, but because of fragmentation across heterogeneous formats, tools, and interfaces.
arxiv.org
Public research results on large-scale supervised finetuning of AI agents remain relatively rare, since the collection of agent training data presents unique challenges. In this work, we argue...
27
172
1K
@SonglinYang4
Songlin Yang
12 days
Many people are confused by Minimax’s recent return to full attention - especially since it was the first large-scale pivot toward hybrid linear attention - and by Kimi’s later adoption of hybrid linear variants (as well as earlier attempts by Qwen3-Next or Qwen3.5). I actually
12
64
506
@SonglinYang4
Songlin Yang
15 days
it’s an improved version of Gated DeltaNet. enjoy ^^
@eliebakouch
elie
15 days
Kimi Delta Attention PR in FLA, very nice @yzhang_cs and team, i'm sooo excited for this model
5
15
202
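The tweet doesn't spell out Kimi Delta Attention itself; as background, a minimal numpy sketch of the gated delta rule it reportedly builds on (a Gated DeltaNet-style recurrence, roughly S_t = α_t S_{t−1}(I − β_t k_t k_tᵀ) + β_t v_t k_tᵀ, o_t = S_t q_t). All shapes and gate values here are toy assumptions:

```python
import numpy as np

def gated_delta_step(S, q, k, v, alpha, beta):
    """One recurrent step of a gated delta rule (Gated DeltaNet-style).

    S:     (d_v, d_k) fast-weight state matrix
    alpha: scalar decay gate in (0, 1), forgets old state
    beta:  scalar write strength in (0, 1)
    """
    k = k / np.linalg.norm(k)  # normalize the key direction
    # Decay, then delta-rule update: erase the old value stored along k,
    # then write the new value v along k.
    S = alpha * (S - beta * np.outer(S @ k, k)) + beta * np.outer(v, k)
    o = S @ q  # read out with the query
    return S, o

# Toy rollout over a short random sequence (d_k=3, d_v=4).
rng = np.random.default_rng(0)
S = np.zeros((4, 3))
for t in range(6):
    k, q = rng.normal(size=3), rng.normal(size=3)
    v = rng.normal(size=4)
    S, o = gated_delta_step(S, q, k, v, alpha=0.9, beta=0.5)
print(S.shape, o.shape)
```

The recurrence is what makes it a linear-attention-family method: state is a fixed-size matrix updated per token, so decoding cost is constant in sequence length.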
@LeYangco
Yang Li
16 days
🚀 Happy to present our new work on LLM reasoning! We show that: (1) Attention is a structured map of the model's reasoning logic, uncovering a preplan-and-anchor reasoning rhythm. (2) Aligning RL objectives with the model's intrinsic attention rhythm yields more transparent,
9
45
229
@SimonXinDong
X. Dong
17 days
We at NVIDIA present "Length Penalty Done Right": cut CoT length by 3/4 without sacrificing accuracy, using only RL. This makes DeepSeek-R1-7B run ~8 times faster on AIME-24 while maintaining the same accuracy.
8
29
248
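The paper's exact objective isn't in the tweet; a generic hedged sketch of what a length-penalized RL reward for reasoning traces can look like. The function name, `lam`, and `max_tokens` are illustrative assumptions, not the paper's formulation:

```python
def length_penalized_reward(is_correct, n_tokens, max_tokens=8192, lam=0.5):
    """Toy length-penalized reward for RL on chain-of-thought traces.

    Sketch only: reward correctness, and (only for correct answers)
    subtract a penalty proportional to CoT length. Penalizing length
    solely on correct answers avoids pushing the policy toward
    wrong-but-short generations.
    """
    if not is_correct:
        return 0.0
    return 1.0 - lam * min(n_tokens / max_tokens, 1.0)

print(length_penalized_reward(True, 2048))   # short correct answer: 0.875
print(length_penalized_reward(True, 8192))   # long correct answer: 0.5
print(length_penalized_reward(False, 100))   # wrong answer: 0.0
```

With any reward of this shape, the RL optimum among equally-correct policies is the shortest chain, which is the behavior the tweet describes.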
@JustinLin610
Junyang Lin
19 days
today i gave a talk at hkust gz. a friend asked how come we could make the bet on scaling linear attention. my answer is more about the culture that i have been trying to build. admittedly it is too hard to change the mechanism which always rewards visible contribution and
27
28
552
@simonguozirui
Simon Guo
18 days
Wrote a 1-year retrospective with @a1zhang on KernelBench and the journey toward automated GPU/CUDA kernel generation! Since my labmates (@anneouyang, @simran_s_arora, @_williamhu) and I first started working towards this vision around last year’s @GPU_mode hackathon, we have
10
61
288
@tydsh
Yuandong Tian
20 days
@daemonzhang6 That's the problem. The people responsible for the issues are not the people who got laid off 😅 In January, our team put down all the research we were doing and was (forced?) to move to GenAI <2 months before the Llama 4 release deadline to help with all the
9
39
827
@Xinyu2ML
Xinyu Yang
20 days
⚠️ Humans and AIs may write the same code that passes the same unit tests, but "safety" isn’t symmetric. 🧑‍🦱 For humans, "Correct" ≈ "Safe": with accountability, they tend to avoid writing warnable but passing code. 🤖 For agents, "Correct" ≠ "Safe": without responsibility,
@InfiniAILab
Infini-AI-Lab
20 days
🚀 If your code agent generates a patch that passes all tests, should you trust it to merge automatically? ⚠️ You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
0
0
11
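A hypothetical illustration of the "correct ≠ safe" point above (not from the study itself, and the function names are made up): two patches pass the identical unit test, yet one contains a construct any security review would flag.

```python
# Hypothetical illustration: both patches pass the SAME unit test,
# but only one is safe to merge.

def calc_unsafe(expr: str) -> int:
    # Passes the test below, but eval() on untrusted input is a code
    # injection risk that a linter or security review would warn about.
    return eval(expr)

def calc_safe(expr: str) -> int:
    # Same observable behavior for the tested case, without eval():
    # walk a restricted AST that only allows + and * on constants.
    import ast, operator
    ops = {ast.Add: operator.add, ast.Mult: operator.mul}
    def ev(node):
        if isinstance(node, ast.BinOp):
            return ops[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)

# The shared unit test: a test-only gate cannot tell these apart.
assert calc_unsafe("2 + 3 * 4") == 14
assert calc_safe("2 + 3 * 4") == 14
```

Test suites check input-output behavior on the cases they cover; they say nothing about the attack surface of the implementation, which is exactly the asymmetry the thread argues for.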
@BeidiChen
Beidi Chen
20 days
📣 we study a threat model where users intend to leverage an llm agent to fix problems in the codebase, but the agent could insert vulnerabilities while passing all the tests — I think security will become a more and more important problem as agents' abilities grow. So much fun
@InfiniAILab
Infini-AI-Lab
20 days
🚀 If your code agent generates a patch that passes all tests, should you trust it to merge automatically? ⚠️ You probably shouldn’t! “Correct” ≠ “Safe.” In our study we show that a single normal-looking issue description, whether from a benign user or not, can lead code agents
0
3
30
@tydsh
Yuandong Tian
20 days
Several of my team members and I are impacted by this layoff today. Feel free to connect :)
474
282
7K
@stochasticchasm
stochasm
22 days
You can just train things
@pli_cachete
Rota
22 days
Pack it in boys
5
20
247