Ben Walker

@ML_BenWalker

Followers: 95 · Following: 59 · Media: 34 · Statuses: 110

ML PhD @OxUniMaths. Researching Neural DEs and the theory of rough paths. Email: [email protected]

University of Oxford
Joined February 2022
@ML_BenWalker
Ben Walker
19 days
1/ Excited to announce that our paper on Structured Linear CDEs (SLiCEs) is a NeurIPS 2025 spotlight! TLDR: Diagonal state-transition matrices (Mamba) are efficient but not expressive. Dense ones are expressive but costly. Structured matrices give efficient maximal expressivity.
2
1
3
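To make the trade-off above concrete, here is a minimal NumPy sketch (my own illustration, not the SLiCE code) of an input-dependent linear recurrence h_t = A(x_t) h_{t-1} + b(x_t) with dense, diagonal, and block-diagonal choices of A(x_t). The hidden size, block count, and tanh parameterisation are arbitrary; dense mixing costs O(d^2) per step, diagonal costs O(d) but cannot mix coordinates, and block-diagonal sits in between.

```python
# Minimal sketch (not the paper's implementation) of an input-dependent linear
# recurrence h_t = A(x_t) @ h_{t-1} + b(x_t) with three state-transition structures.
import numpy as np

rng = np.random.default_rng(0)
d, blocks = 8, 4                # hidden size and number of diagonal blocks (illustrative)
x = rng.normal(size=(16, d))    # a toy input sequence of length 16

def step_dense(h, xt, W):
    A = np.tanh(W @ xt).reshape(d, d)       # dense A(x_t): O(d^2) per step, full mixing
    return A @ h + xt

def step_diag(h, xt, w):
    a = np.tanh(w * xt)                     # diagonal A(x_t): O(d) per step, no mixing
    return a * h + xt

def step_block_diag(h, xt, Wb):
    b = d // blocks                         # block-diagonal A(x_t): mixes within blocks
    h_new = np.empty_like(h)
    for k in range(blocks):
        sl = slice(k * b, (k + 1) * b)
        A_k = np.tanh(Wb[k] @ xt).reshape(b, b)
        h_new[sl] = A_k @ h[sl]
    return h_new + xt

W_dense = rng.normal(size=(d * d, d)) / d
w_diag = rng.normal(size=d)
W_block = rng.normal(size=(blocks, (d // blocks) ** 2, d)) / d

for step, params in [(step_dense, W_dense), (step_diag, w_diag), (step_block_diag, W_block)]:
    h = np.zeros(d)
    for xt in x:
        h = step(h, xt, params)
    print(step.__name__, np.round(h[:3], 3))
```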
@ML_BenWalker
Ben Walker
4 days
Poster Permutation Equivariant Neural Controlled Differential Equations for Dynamic Graph Representation Learning Wed Dec 3, 2025 • 11:00 AM to 2:00 PM PST Exhibit Hall C,D,E #3919 Torben Berndt, Benjamin Walker, Tiexin Qin, Jan Stühmer, Andrey Kormilitzin
0
0
0
@ML_BenWalker
Ben Walker
4 days
Spotlight Poster Structured Linear CDEs: Maximally Expressive and Parallel in Time Sequence Models Thu Dec 4, 2025 • 4:30 PM to 7:30 PM PST Exhibit Hall C,D,E #3909 Benjamin Walker, Lingyi Yang, Nicola Muca Cirone, Cristopher Salvi, Terry Lyons
1
0
0
@ML_BenWalker
Ben Walker
4 days
I am at NeurIPS 2025! I would love to meet people interested in sequence models, neural differential equations, or dynamic graphs. Feel free to reach out if you want to chat, or come find me at one of my posters, details below!
1
0
1
@ML_BenWalker
Ben Walker
19 days
9/ Huge thanks to my coauthors Lingyi Yang, @MucaCirone, Cris Salvi, and Terry Lyons. I greatly enjoyed working on this paper together.
0
0
0
@ML_BenWalker
Ben Walker
19 days
8/ SLiCEs also set a new state of the art among parallel-in-time models on the regular language tasks from the formal language benchmark. For more details, check out the paper and code. Paper: https://t.co/Il9ZUJFT64 Code:
github.com
Code for "Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models" (NeurIPS 2025, Spotlight) - Benjamin-Walker/structured-linear-cdes
1
0
0
@ML_BenWalker
Ben Walker
19 days
7/ This is more than just theory. On permutation composition, LSTM performs well, Mamba struggles, and Dense LNCDEs generalise strongly. Furthermore, SLiCEs match dense performance. They are the only parallel-in-time models that generalise beyond the validation sequence length.
1
0
0
@ML_BenWalker
Ben Walker
19 days
6/ SLiCEs fix this by replacing diagonal matrices with structured ones that still allow mixing. We prove that block-diagonal, sparse, Walsh-Hadamard, and diagonal-plus-low-rank variants all achieve maximal expressivity while staying parallel-in-time.
1
0
0
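As a toy illustration of one of those structures, here is a hedged sketch (not the paper's code) of a diagonal-plus-low-rank transition A = diag(a) + U V^T applied to the hidden state in O(d·r) rather than O(d^2). The sizes, rank, and tanh parameterisation are placeholders.

```python
# Diagonal-plus-low-rank state transition applied without materialising the dense matrix.
import numpy as np

rng = np.random.default_rng(1)
d, r = 64, 4                                # illustrative hidden size and rank

def dplr_matvec(a, U, V, h):
    """Compute (diag(a) + U @ V.T) @ h in O(d*r) instead of O(d^2)."""
    return a * h + U @ (V.T @ h)

a = np.tanh(rng.normal(size=d))
U = rng.normal(size=(d, r)) / np.sqrt(d)
V = rng.normal(size=(d, r)) / np.sqrt(d)
h = rng.normal(size=d)

fast = dplr_matvec(a, U, V, h)
dense = (np.diag(a) + U @ V.T) @ h          # dense reference computation
print(np.allclose(fast, dense))             # True: same result, lower cost
```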
@ML_BenWalker
Ben Walker
19 days
5/ Our 2024 NeurIPS paper showed that diagonal state-transition matrices are not maximally expressive, while dense matrices are. The challenge is that dense matrices are expensive.
1
0
0
@ML_BenWalker
Ben Walker
19 days
4/ Mamba uses input-dependent state-transition matrices, keeping parallel-in-time computation while adding expressivity. However, the matrices are diagonal, preventing hidden state mixing. It is like trying to understand an orchestra while hearing each instrument in isolation.
1
0
0
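The "instruments in isolation" point can be seen directly in a toy diagonal recurrence (my own example, not Mamba's actual layer): with an elementwise update, perturbing one coordinate of the initial state never reaches any other coordinate.

```python
# Why a diagonal state transition prevents hidden-state mixing:
#   h_t = a(x_t) * h_{t-1} + b(x_t)  is purely elementwise.
import numpy as np

rng = np.random.default_rng(2)
d, T = 6, 50
a = np.tanh(rng.normal(size=(T, d)))    # input-dependent diagonal entries (toy)
b = rng.normal(size=(T, d))

def run(h0):
    h = h0.copy()
    for t in range(T):
        h = a[t] * h + b[t]             # elementwise update, no coordinate interaction
    return h

h0 = np.zeros(d)
h0_perturbed = h0.copy()
h0_perturbed[2] = 1.0                   # poke coordinate 2 only

diff = run(h0_perturbed) - run(h0)
print(np.round(diff, 4))                # only index 2 is nonzero
```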
@ML_BenWalker
Ben Walker
19 days
3/ Classical RNNs solve these tasks easily, but their nonlinear recurrences cannot be computed exactly in parallel. Linear RNNs can be parallelised, but they lack the expressive power needed for state-tracking.
1
0
0
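The reason linear recurrences parallelise is that each step is an affine map, and composing affine maps is associative, so a balanced tree (a parallel scan) can replace the sequential loop. A minimal NumPy sketch with illustrative sizes:

```python
# Linear recurrence h_t = a_t * h_{t-1} + b_t as a composition of affine maps.
import numpy as np

rng = np.random.default_rng(3)
T, d = 32, 4
a = np.tanh(rng.normal(size=(T, d)))
b = rng.normal(size=(T, d))

def combine(f, g):
    """Compose affine maps: apply (a1, b1) first, then (a2, b2)."""
    (a1, b1), (a2, b2) = f, g
    return a2 * a1, a2 * b1 + b2

def tree_reduce(maps):
    """Combine the per-step maps pairwise; the depth is O(log T) when run in parallel."""
    while len(maps) > 1:
        paired = [combine(maps[i], maps[i + 1]) for i in range(0, len(maps) - 1, 2)]
        if len(maps) % 2:
            paired.append(maps[-1])
        maps = paired
    return maps[0]

# Sequential reference.
h = np.zeros(d)
for t in range(T):
    h = a[t] * h + b[t]

A, B = tree_reduce([(a[t], b[t]) for t in range(T)])
print(np.allclose(A * np.zeros(d) + B, h))   # True: tree reduction matches the loop
```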
@ML_BenWalker
Ben Walker
19 days
2/ Parallel-in-time architectures such as Transformers have enabled sequence models to scale to billions of parameters, but empirically they struggle on state-tracking tasks like modular arithmetic and permutation composition.
1
0
0
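For a flavour of what state-tracking means here, a toy version of permutation composition (illustrative sizes, not the benchmark's exact setup): the target at each step is the running composition of all permutations seen so far, so a model must carry a group element in its state.

```python
# Toy permutation-composition task: track the running product of a stream of permutations.
import numpy as np

rng = np.random.default_rng(4)
n, T = 5, 10                              # S_5 permutations, sequence length 10
perms = [rng.permutation(n) for _ in range(T)]

state = np.arange(n)                      # identity permutation
targets = []
for p in perms:
    state = p[state]                      # apply the current composition first, then p
    targets.append(state.copy())

for t, (p, tgt) in enumerate(zip(perms, targets)):
    print(f"t={t}: input={p}, running composition={tgt}")
```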
@ML_BenWalker
Ben Walker
24 days
One nice part of writing up a thesis is getting to step back and see the combined impact of the last four years of work
0
0
0
@finbarrtimbers
finbarr
1 month
I miss PapersWithCode
39
45
735
@ML_BenWalker
Ben Walker
1 month
Ran a simple check. Standard LSTM vs diagonal state-transition LSTM trained on a regular language (cycle navigation) and evaluated on length generalisation. Removing hidden-state mixing dropped validation accuracy from 100% to ~40%.
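For context, a rough sketch of the architectural difference being compared (not the exact experiment or training setup): a standard LSTM cell uses dense recurrent weight matrices, while the diagonal variant replaces them with per-unit vectors, removing all hidden-state mixing in the recurrence. Shapes and initialisation are illustrative.

```python
# Standard vs diagonal-recurrence LSTM cell (architectural sketch only, no training).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, Wx, U, bias, diagonal=False):
    """One LSTM step; U holds the recurrent weights for the i, f, o, g gates."""
    gates = Wx @ x + bias
    if diagonal:
        gates += U * np.tile(h, 4)          # U is a vector of length 4*d: no mixing
    else:
        gates += U @ h                       # U is a (4*d, d) matrix: full mixing
    i, f, o, g = np.split(gates, 4)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(5)
d_in, d_hid = 3, 8
Wx = rng.normal(size=(4 * d_hid, d_in)) * 0.1
bias = np.zeros(4 * d_hid)
U_dense = rng.normal(size=(4 * d_hid, d_hid)) * 0.1
U_diag = rng.normal(size=4 * d_hid) * 0.1

h = c = np.zeros(d_hid)
x = rng.normal(size=d_in)
print(lstm_step(x, h, c, Wx, U_dense, bias)[0][:3])
print(lstm_step(x, h, c, Wx, U_diag, bias, diagonal=True)[0][:3])
```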
@ML_BenWalker
Ben Walker
1 month
Would have been interesting to see some comparisons with a normal LSTM on state-tracking benchmarks to understand the impact of using diagonal matrices. Paper link:
0
0
0
@ML_BenWalker
Ben Walker
1 month
Interesting paper on parallelising non-linear RNNs using a parallel Newton solve. However, to make it feasible they used diagonal state-transition matrices, preventing any hidden state mixing. Feels likely this negates the expressivity gains of using non-linearities.
1
0
1
@ML_BenWalker
Ben Walker
1 month
Thrilled to share that our follow-up paper on permutation equivariant Graph Neural CDEs will be presented at NeurIPS 2025 🎉 Adding permutation equivariance gives strong empirical performance with significantly fewer parameters.
@ML_BenWalker
Ben Walker
2 months
🎉 Excited to share our new TPAMI paper: “Learning Dynamic Graph Embeddings with NCDEs” We introduce Graph NCDEs, a continuous-time model for dynamic graphs. Unlike models that combine a GNN with a time-series model, we directly model evolving graph dynamics. Read the paper 👇
1
0
0
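As a very rough illustration of the continuous-time idea (my own toy, not the paper's model): node states evolve under a CDE driven by the node-feature path, with a GNN-style vector field, integrated here with a plain Euler discretisation. The adjacency, dimensions, and parameterisation are all placeholders.

```python
# Toy graph CDE: node states driven by a node-feature path via a GNN-style vector field.
import numpy as np

rng = np.random.default_rng(6)
n, d_x, d_h, T = 5, 2, 4, 20
A = (rng.random((n, n)) < 0.4).astype(float)                 # fixed toy adjacency
np.fill_diagonal(A, 1.0)
X = np.cumsum(rng.normal(size=(T, n, d_x)), axis=0) * 0.1    # node-feature path X(t)

W_self = rng.normal(size=(d_h, d_h, d_x)) * 0.1   # maps a node state to a (d_h x d_x) matrix
W_neigh = rng.normal(size=(d_h, d_h, d_x)) * 0.1

def vector_field(H):
    """GNN-style field: for each node, a (d_h x d_x) matrix built from its own state
    and the mean of its neighbours' states."""
    H_neigh = (A @ H) / A.sum(axis=1, keepdims=True)
    return np.tanh(np.einsum('nd,hdx->nhx', H, W_self)
                   + np.einsum('nd,hdx->nhx', H_neigh, W_neigh))

H = np.zeros((n, d_h))
for t in range(1, T):
    dX = X[t] - X[t - 1]                                     # increment of the driving path
    H = H + np.einsum('nhx,nx->nh', vector_field(H), dX)     # Euler step of the CDE

print(np.round(H, 3))
```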