Tim Xiao

@TimZXiao

Followers 253 · Following 1K · Media 36 · Statuses 307

PhD student in Machine Learning @ University of Tübingen · IMPRS-IS scholar

Joined June 2012
@TimZXiao
Tim Xiao
5 months
✨ New paper: Flipping Against All Odds We found that large language models (LLMs) can describe probabilities—but fail to sample from them faithfully. Yes, even flipping a fair coin is hard. 🪙 🧵 Here’s what we learned—and how we fixed it. 🔗 https://t.co/Auw7agOws3 1/
4 replies · 8 reposts · 16 likes
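For context on the tweet above, a minimal sketch of how the described failure can be measured: compare a stated probability against the empirical frequency of the model's samples. `ask_llm` is a hypothetical stand-in (here a deliberately biased simulator so the script runs), not the paper's code.

```python
import random

def ask_llm(prompt: str) -> str:
    # Stand-in for a real chat-model call; deliberately biased so the
    # script runs end to end. Replace with your API client of choice.
    return "heads" if random.random() < 0.8 else "tails"

def empirical_heads_rate(p: float, n: int = 1000) -> float:
    """Ask the model to flip a p-biased coin n times; return observed heads rate."""
    prompt = (f"Flip a coin that lands heads with probability {p}. "
              "Answer with exactly one word: heads or tails.")
    flips = [ask_llm(prompt).strip().lower() for _ in range(n)]
    return sum(f == "heads" for f in flips) / n

if __name__ == "__main__":
    stated, observed = 0.5, empirical_heads_rate(0.5)
    print(f"stated P(heads)={stated:.2f}, observed={observed:.3f}")
    # The tweet's claim: observed often drifts far from the stated
    # probability, even for a fair coin.
```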
@Besteuler
Weiyang Liu
14 days
🤯 Merging many finetuned LLMs into one model, effectively? Introducing Functional Dual Anchor (FDA), a new framework for model merging. 🚀 Current merging works poorly due to underlying parameter conflicts. FDA shifts knowledge integration to the input-representation space…
10 replies · 96 reposts · 613 likes
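The tweet doesn't spell out FDA's mechanics; for reference, here is the naive parameter-space baseline it argues suffers from conflicts: task-vector averaging, where each finetuned model's weight delta is averaged onto the base. A toy sketch, not FDA itself.

```python
import torch

def merge_task_vectors(base, finetuned_models, alpha=1.0):
    """Average weight deltas of several finetuned models onto one base model."""
    merged = {k: v.clone() for k, v in base.items()}
    for key in merged:
        # Conflicting per-task deltas get blended here, which is exactly
        # where parameter-space merging can degrade.
        delta = torch.stack([m[key] - base[key] for m in finetuned_models]).mean(0)
        merged[key] += alpha * delta
    return merged

# Toy usage: three "finetuned" variants of a one-tensor base model.
base = {"w": torch.zeros(4, 4)}
finetuned = [{"w": torch.randn(4, 4)} for _ in range(3)]
merged = merge_task_vectors(base, finetuned)
print(merged["w"].shape)
```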
@Besteuler
Weiyang Liu
18 days
The physics prior matters in molecular structures. We model potential energy between molecules for drug design. This happens to have a coincidental yet interesting connection to my past work, hyperspherical energy (https://t.co/aRJSgn3gaE), which considers potential energy between…
0 replies · 2 reposts · 17 likes
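The hyperspherical energy referenced above has a compact common form: unit-normalize the neuron weight vectors and sum inverse pairwise distances, so minimizing it spreads neurons uniformly over the sphere. A sketch of that form (see the linked paper for the exact variants used there):

```python
import torch

def hyperspherical_energy(W: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """W: (num_neurons, dim). Returns sum over i < j of 1 / ||w_i_hat - w_j_hat||."""
    W_hat = W / (W.norm(dim=1, keepdim=True) + eps)   # project onto unit sphere
    dists = torch.cdist(W_hat, W_hat)                  # pairwise distances
    iu = torch.triu_indices(len(W), len(W), offset=1)  # keep i < j pairs only
    return (1.0 / (dists[iu[0], iu[1]] + eps)).sum()

print(hyperspherical_energy(torch.randn(8, 16)))
```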
@Maze_s_Center
Center of The Maze
22 days
@kenneth0stanley @ai_bread This prompt baking reminds me of verbalized machine learning. VML doesn't modify the weights, though; it updates the parameters instead. https://t.co/xqHDuWkY7f
@TimZXiao
Tim Xiao
1 year
Verbalized Machine Learning (VML) moves machine learning into natural language space, where one learns a model parameterized by natural language using LLMs. How does VML connect LLMs with: Universal function approximator? von Neumann architecture? Interpretable learning? How…
0 replies · 1 repost · 2 likes
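A minimal sketch of the VML loop as the tweet describes it, with a hypothetical `ask_llm(prompt) -> str` wrapper: the model's "parameters" are a natural-language rule, and a second LLM call acts as the optimizer that rewrites the rule from observed mistakes.

```python
def ask_llm(prompt: str) -> str:
    # Stand-in so the sketch runs end to end; replace with a real chat client.
    return ""

def vml_step(theta_text: str, batch):
    """One VML update: evaluate the verbalized model, then rewrite its 'parameters'."""
    # Forward pass: the learner LLM applies the natural-language rule.
    preds = [ask_llm(f"Rule: {theta_text}\nInput: {x}\nOutput:") for x, _ in batch]
    mistakes = [(x, y, p) for (x, y), p in zip(batch, preds) if p.strip() != y]
    # "Backward" pass: an optimizer LLM rewrites the rule from the mistakes.
    feedback = "\n".join(f"input={x} expected={y} got={p}" for x, y, p in mistakes)
    return ask_llm(
        f"Current rule: {theta_text}\nMistakes:\n{feedback}\n"
        "Rewrite the rule so these inputs are handled correctly. "
        "Reply with the new rule only."
    )

theta = "Answer 'yes' if the number is even, else 'no'."
print(vml_step(theta, [("2", "yes"), ("3", "no")]))
```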
@a_kzna
Anna Kuzina
23 days
Polymer simulations, but make them Vivace ⚡ It was a pleasure to work on the Vivace architecture during my time at @MSFTResearch together with Lixin Sun and @gncsimm.
@gncsimm
Gregor Simm
24 days
MLFFs 🤝 Polymers — SimPoly works! Our team at @MSFTResearch AI for Science is proud to present SimPoly (SIM-puh-lee) — a deep learning solution for polymer simulation. Polymeric materials are foundational to modern life—found in everything from the clothes we wear and the food…
0 replies · 1 repost · 11 likes
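SimPoly's interface isn't shown in the thread; as background, this is the generic way a machine-learned force field slots into a polymer (or any molecular) simulation: the model predicts forces, and a standard integrator such as velocity Verlet advances the dynamics. `mlff_forces` is a stand-in stub.

```python
import numpy as np

def mlff_forces(positions: np.ndarray) -> np.ndarray:
    # Stand-in for a learned model predicting forces = -dE/dx.
    # Here: a harmonic pull toward the origin so the script runs.
    return -positions

def velocity_verlet(x, v, dt=1e-3, steps=100, mass=1.0):
    """Advance (positions, velocities) with the standard velocity Verlet scheme."""
    f = mlff_forces(x)
    for _ in range(steps):
        x = x + v * dt + 0.5 * (f / mass) * dt**2
        f_new = mlff_forces(x)
        v = v + 0.5 * (f + f_new) / mass * dt
        f = f_new
    return x, v

x, v = velocity_verlet(np.random.randn(10, 3), np.zeros((10, 3)))
print(x.shape)
```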
@Besteuler
Weiyang Liu
23 days
This is almost a year-long project, led by @ItsTheZhen. My biggest takeaway is that physical simulation is very effective as a reward signal, and this efficient verification is crucial for unlocking LLMs’ design novelty. This conclusion is actually aligned with our previous…
@ItsTheZhen
Zhen Liu
23 days
Can LLMs design real machines — from 🚗 cars to 🏹 catapults? Can they engineer through both 🧠 agentic workflows and 🌀 reinforcement learning (RL) — learning from physical simulation instead of text alone? We treat machine design as “machine code writing”, where LLMs assemble…
0 replies · 5 reposts · 33 likes
@TimZXiao
Tim Xiao
23 days
Sharing a fascinating work: BesiegeField. It explores how LLMs can think and design directly in the space of natural language — a meaningful and fitting challenge for LLMs. A great example of verbalized computing, where design goals are defined in words rather than formal specs.
@ItsTheZhen
Zhen Liu
23 days
Can LLMs design real machines — from 🚗 cars to 🏹 catapults? Can they engineer through both 🧠 agentic workflows and 🌀 reinforcement learning (RL) — learning from physical simulation instead of text alone? We treat machine design as “machine code writing”, where LLMs assemble…
0 replies · 0 reposts · 4 likes
@Besteuler
Weiyang Liu
23 days
🤖 Can LLMs learn to create? Introducing "Agentic Design of Compositional Machines" — a new frontier where AI builds functional machines from standardized parts. We present BesiegeField, a simulation testbed to benchmark LLMs on tasks like building cars & catapults. Key…
1 reply · 3 reposts · 14 likes
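The BesiegeField API isn't given in the tweet; below is only a generic propose-simulate-refine loop of the kind the agentic workflow implies, with hypothetical `ask_llm` and `simulate` stubs.

```python
import random

def ask_llm(prompt: str) -> str:
    return "wheel;block;spring"  # stand-in design proposal

def simulate(design: str) -> float:
    return random.random()  # stand-in physics score, e.g. distance driven

def design_loop(task: str, rounds: int = 5) -> str:
    """Propose a machine, score it in simulation, feed the result back, repeat."""
    best_design, best_score = None, float("-inf")
    feedback = ""
    for _ in range(rounds):
        design = ask_llm(f"Task: {task}\nPrevious feedback: {feedback}\n"
                         "Propose a machine as a part list.")
        score = simulate(design)
        feedback = f"design={design} scored {score:.2f}"
        if score > best_score:
            best_design, best_score = design, score
    return best_design

print(design_loop("build a car that drives as far as possible"))
```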
@ItsTheZhen
Zhen Liu
24 days
TL;DR: Meet BesiegeField—a playground where LLMs build, test, and refine machines from standard parts in real time. We tested agentic workflows and RLVR with top LLMs: even the strongest still show limits in compositional machine design. 🔗 https://t.co/V4GQx2KB8q 🧵 below
@ItsTheZhen
Zhen Liu
24 days
Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?
0 replies · 3 reposts · 7 likes
@ItsTheZhen
Zhen Liu
24 days
Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?
2 replies · 3 reposts · 12 likes
@Besteuler
Weiyang Liu
24 days
This is a wonderful collaboration with @ItsTheZhen and Wenqian. I’ve long been curious whether large language models truly possess creativity -- the ability to build something genuinely novel. This project represents our first step toward answering that question. It also aligns…
@ItsTheZhen
Zhen Liu
24 days
Human history is marked by the machines we created: from the Antikythera mechanism of ancient Greece, to the imaginations of the Renaissance, to the engines of the steam era. We wonder: can LLMs, like humans, build sophisticated machines to achieve purposeful functionality?
0 replies · 2 reposts · 9 likes
@Besteuler
Weiyang Liu
24 days
🚀 Glad to introduce SimKO (Simple Pass@K Optimization) Current GRPO-based methods overfit to safe responses -- great Pass@1, poor Pass@K. 🔍 We find this stems from probability over-concentration: the model collapses onto its top-1 token, losing exploration. This appears to be…
5 replies · 20 reposts · 159 likes
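For readers unfamiliar with the Pass@K metric SimKO optimizes: the standard unbiased estimator (Chen et al., 2021) from n samples of which c are correct. A quick sanity check shows pass@1 equals c/n, while pass@k grows with k only when correct samples are spread across attempts.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of P(at least 1 of k random samples is correct)."""
    if n - c < k:
        return 1.0
    # 1 - C(n-c, k)/C(n, k), computed as a stable product.
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

print(pass_at_k(n=16, c=4, k=1))  # 0.25 = c/n
print(pass_at_k(n=16, c=4, k=8))
```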
@EurIPSConf
EurIPS Conference
27 days
The #EurIPS Salon des Refusés poster session is now accepting submissions 📜 We welcome self-nomination of rejected submissions from all @NeurIPSConf tracks (Main, D&B, Position). The call for self-nominations will stay open until the session is filled. https://t.co/w8cu8EAaGs
2 replies · 5 reposts · 8 likes
@karpathy
Andrej Karpathy
28 days
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,…
665 replies · 3K reposts · 24K likes
@Besteuler
Weiyang Liu
1 month
I enjoy reading this blog. This is exactly what I am trying to pursue throughout my research career -- using weight geometry to characterize and improve neural network training. Really excited that it finally got people's attention!! In 2017, we studied the weight…
arxiv.org
Convolution as inner product has been the founding basis of convolutional neural networks (CNNs) and the key to end-to-end visual representation learning. Benefiting from deeper architectures,...
@thinkymachines
Thinking Machines
1 month
Efficient training of neural networks is difficult. Our second Connectionism post introduces Modular Manifolds, a theoretical step toward more stable and performant training by co-designing neural net optimizers with manifold constraints on weight matrices.
0 replies · 8 reposts · 45 likes
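The post's own construction isn't reproduced here; as a concrete instance of "manifold constraints on weight matrices", this sketch retracts a weight matrix onto the Stiefel manifold (orthonormal columns) after each optimizer step, using the polar factor from an SVD.

```python
import torch

@torch.no_grad()
def stiefel_retract(W: torch.Tensor) -> torch.Tensor:
    """Nearest matrix with orthonormal columns (polar factor via SVD)."""
    U, _, Vh = torch.linalg.svd(W, full_matrices=False)
    return U @ Vh

W = torch.randn(64, 32, requires_grad=True)
opt = torch.optim.SGD([W], lr=1e-2)

loss = (W.sum() - 1.0) ** 2        # toy objective
loss.backward()
opt.step()                          # unconstrained optimizer step
W.data = stiefel_retract(W.data)    # then retract: enforce W^T W = I

print(torch.allclose(W.data.T @ W.data, torch.eye(32), atol=1e-5))
```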
@jacobyhsi88
Jacob Si
2 months
VUD’s gonna be at #NeurIPS2025 🎉🥳 Special thanks to my labmates who made this collaboration especially enjoyable!
@jacobyhsi88
Jacob Si
2 months
Wanna understand the sources of uncertainty in LLMs when performing in-context learning 🤔? 🚀 We introduce a variational uncertainty decomposition framework for in-context learning without explicitly sampling from the latent parameter posterior. 📄 Paper:
0 replies · 5 reposts · 32 likes
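Not the paper's variational method (which avoids sampling the latent posterior); for background, the classic entropy-based decomposition it builds on: total predictive uncertainty splits into an expected (aleatoric) part and a mutual-information (epistemic) remainder, estimated here from a handful of predictive distributions.

```python
import numpy as np

def entropy(p, eps=1e-12):
    p = np.asarray(p)
    return -(p * np.log(p + eps)).sum(-1)

def decompose(member_probs):
    """member_probs: (n_members, n_classes) predictive distributions."""
    total = entropy(member_probs.mean(0))      # H[E[p]]: total uncertainty
    aleatoric = entropy(member_probs).mean()   # E[H[p]]: expected (data) part
    return total, aleatoric, total - aleatoric # remainder = mutual information

probs = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5]])
print(decompose(probs))
```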
@Besteuler
Weiyang Liu
2 months
🥳POET is accepted to #NeurIPS2025!
@Besteuler
Weiyang Liu
5 months
📢Glad to introduce our paper: Reparameterized LLM Training via Orthogonal Equivalence Transformation (POET)! POET is a new algorithm for efficiently pretraining / finetuning large language models. Its training consists of three geometric phases. https://t.co/g5TzYGOhSE 1/6
0 replies · 3 reposts · 20 likes
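A minimal sketch of the reparameterization named in the title, under my reading of it: keep a frozen base weight W0 and train two orthogonal factors so the effective weight R @ W0 @ Q keeps W0's singular values. It uses PyTorch's built-in orthogonal parametrization; the "three geometric phases" mentioned in the tweet are not modeled here.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class OrthoEquivLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        # Frozen base weight; only the orthogonal factors below are trained.
        self.W0 = nn.Parameter(torch.randn(d_out, d_in) / d_in**0.5,
                               requires_grad=False)
        self.R = orthogonal(nn.Linear(d_out, d_out, bias=False))
        self.Q = orthogonal(nn.Linear(d_in, d_in, bias=False))

    def forward(self, x):
        # Orthogonal-equivalent weight: same spectrum as W0.
        W = self.R.weight @ self.W0 @ self.Q.weight
        return x @ W.T

layer = OrthoEquivLinear(16, 8)
print(layer(torch.randn(4, 16)).shape)  # torch.Size([4, 8])
```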
@vllm_project
vLLM
2 months
The amazing blog post from @gordic_aleksa is now live on the vLLM blog: https://t.co/dI8NSyd3t3 (after more proofreading and clarifications)! Looking forward to a future series of tech deep-dive blog posts 😍
blog.vllm.ai
Originally posted on Aleksa Gordić’s website.
@gordic_aleksa
Aleksa Gordić (水平问题)
2 months
New in-depth blog post - "Inside vLLM: Anatomy of a High-Throughput LLM Inference System". Probably the most in-depth explanation of how LLM inference engines, and vLLM in particular, work! Took me a while to get this level of understanding of the codebase and then to write up…
11 replies · 80 reposts · 635 likes
@Besteuler
Weiyang Liu
2 months
We have been working on enabling LLMs to generate symbolic graphics programs since IG-LLM (https://t.co/LjD8TdkpOz) and SGP-Bench (https://t.co/O8deijiCen), but SFT didn't really work at the time. We now find that, with a properly designed cross-modal reward (e.g., CLIP), RLVR can…
ig-llm.is.tue.mpg.de
@_akhaliq
AK
2 months
Symbolic Graphics Programming with Large Language Models
0 replies · 6 reposts · 16 likes
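A sketch of the cross-modal reward idea the tweet names (not the paper's released code): render a candidate symbolic graphics program to an image, then use CLIP similarity against the task prompt as the RLVR reward. `render_program` is a hypothetical stub; the CLIP calls use the `transformers` library.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def render_program(program: str) -> Image.Image:
    # Stand-in: a real pipeline would execute the SVG/CAD program here.
    return Image.new("RGB", (224, 224), "white")

@torch.no_grad()
def clip_reward(program: str, prompt: str) -> float:
    image = render_program(program)
    inputs = processor(text=[prompt], images=image, return_tensors="pt")
    out = model(**inputs)
    # Cosine similarity between image and text embeddings as a scalar reward.
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum())

print(clip_reward("<svg>...</svg>", "a red circle above a blue square"))
```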
@liyzhen2
yingzhen
2 months
We show how to make LLM in-context learning approximately Bayesian & decompose uncertainty. IMO this is proper approximate inference 🥰 applied to LLMs. Led by awesome students @shavindra_j @jacobyhsi88 Filippo & Wenlong 👍 Example 👇 by prompting; bandit & NLP examples in the paper.
@StatsPapers
Statistics Papers
2 months
Variational Uncertainty Decomposition for In-Context Learning.
7 replies · 19 reposts · 165 likes
@huanbo_sun
Huanbo Sun
2 months
For tactile sensing: why shear forces are harder to detect accurately than normal forces. Isoline-based theory for tactile sensing explains shear-force detection at superresolution. (Open Access link: https://t.co/BA9vNgFj6n)
0 replies · 1 repost · 5 likes