Ben Walker
@ML_BenWalker
Followers 95 · Following 59 · Media 34 · Statuses 110
ML PhD @OxUniMaths. Researching Neural DEs and the theory of rough paths. Email: [email protected]
University of Oxford
Joined February 2022
1/ Excited to announce that our paper on Structured Linear CDEs (SLiCEs) is a NeurIPS 2025 spotlight! TL;DR: Diagonal state-transition matrices (Mamba) are efficient but not maximally expressive. Dense ones are expressive but costly. Structured matrices give maximal expressivity while staying efficient.
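For skimmers, the schematic picture (my notation, not the paper's exact formulation) is a linear recurrence with an input-dependent state-transition matrix, where the structure of that matrix is what varies between models:

```latex
% Schematic only: hidden state h_k in R^d, input x_k, input-dependent transition A(x_k).
\[
  h_k = A(x_k)\, h_{k-1} + B\, x_k,
  \qquad
  A(x_k) =
  \begin{cases}
    \text{diagonal}   & \text{(Mamba-style: cheap, no mixing, not maximally expressive)}\\
    \text{dense}      & \text{(maximally expressive, but } O(d^2) \text{ per step)}\\
    \text{structured} & \text{(block-diagonal, sparse, etc.: mixing at near-diagonal cost)}
  \end{cases}
\]
```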
Poster: Permutation Equivariant Neural Controlled Differential Equations for Dynamic Graph Representation Learning
Wed Dec 3, 2025 • 11:00 AM to 2:00 PM PST • Exhibit Hall C,D,E #3919
Torben Berndt, Benjamin Walker, Tiexin Qin, Jan Stühmer, Andrey Kormilitzin
Spotlight Poster: Structured Linear CDEs: Maximally Expressive and Parallel-in-Time Sequence Models
Thu Dec 4, 2025 • 4:30 PM to 7:30 PM PST • Exhibit Hall C,D,E #3909
Benjamin Walker, Lingyi Yang, Nicola Muca Cirone, Cristopher Salvi, Terry Lyons
I am at NeurIPS 2025! I would love to meet people interested in sequence models, neural differential equations, or dynamic graphs. Feel free to reach out if you want to chat, or come find me at one of my posters, details below!
9/ Huge thanks to my coauthors Lingyi Yang, @MucaCirone, Cris Salvi, and Terry Lyons. I greatly enjoyed working on this paper together.
8/ SLiCEs also set a new state of the art among parallel-in-time models on the regular language tasks from the formal language benchmark. For more details, check out the paper and code.
Paper: https://t.co/Il9ZUJFT64
Code: github.com/Benjamin-Walker/structured-linear-cdes
7/ This is more than just theory. On permutation composition, LSTM performs well, Mamba struggles, and dense LNCDEs generalise strongly. SLiCEs match the dense performance and are the only parallel-in-time models that generalise beyond the validation sequence length.
6/ SLiCEs fix this by replacing the diagonal matrices with structured ones that still allow mixing. We prove that block-diagonal, sparse, Walsh-Hadamard, and diagonal-plus-low-rank variants all achieve maximal expressivity while staying parallel-in-time.
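A sketch (not the paper's code) of the idea behind one such variant: an input-dependent linear recurrence whose state-transition matrix is block-diagonal, so hidden channels mix within each block while the per-step transition cost stays close to diagonal (d·b multiplies instead of d·d for dense). The parameterisation below is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_in, d, b = 12, 4, 16, 4            # sequence length, input dim, state dim, block size
n_blocks = d // b

W = rng.normal(scale=0.1, size=(d_in, n_blocks, b, b))   # input -> block entries
U = rng.normal(scale=0.1, size=(d, d_in))                 # input injection

x = rng.normal(size=(T, d_in))
h = np.zeros(d)
for t in range(T):
    blocks = np.einsum("i,ikab->kab", x[t], W)            # (n_blocks, b, b), input-dependent
    h_blocks = h.reshape(n_blocks, b)
    # Apply each block to its slice of the state: mixing happens within blocks only.
    h = np.einsum("kab,kb->ka", np.eye(b) + blocks, h_blocks).reshape(d) + U @ x[t]
print(h.round(3))
```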
5/ Our 2024 NeurIPS paper showed that diagonal state-transition matrices are not maximally expressive, while dense matrices are. The challenge is that dense matrices are expensive.
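To put "expensive" in rough numbers (my illustrative counts, not figures from the paper), here are per-step multiply counts for applying the state-transition matrix to a hidden state of size d; the block size b and rank r are arbitrary example choices.

```python
# Per-step multiplies for applying the state-transition matrix to a size-d hidden state.
d, b, r = 1024, 16, 4
print("dense:                 ", d * d)          # 1,048,576
print("diagonal:              ", d)              # 1,024
print("block-diagonal (b=16): ", d * b)          # 16,384
print("diagonal + rank-4:     ", d + 2 * d * r)  # 9,216
```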
4/ Mamba uses input-dependent state-transition matrices, keeping parallel-in-time computation while adding expressivity. However, the matrices are diagonal, preventing hidden state mixing. It is like trying to understand an orchestra while hearing each instrument in isolation.
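A toy illustration of the "no mixing" point (my example, not Mamba itself): under diagonal state-transition matrices, a perturbation to one hidden channel never reaches the other channels no matter how many steps you run, whereas a dense transition spreads it immediately.

```python
import numpy as np

rng = np.random.default_rng(0)
d, T = 8, 20
diag_steps = [np.diag(rng.uniform(0.5, 1.0, size=d)) for _ in range(T)]
dense_steps = [rng.normal(scale=0.3, size=(d, d)) for _ in range(T)]

def run(transitions, h0):
    h = h0.copy()
    for A in transitions:
        h = A @ h
    return h

e0 = np.zeros(d); e0[0] = 1.0            # information placed in channel 0 only
print(run(diag_steps, e0).round(3))      # still supported on channel 0 only
print(run(dense_steps, e0).round(3))     # spread across every channel
```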
3/ Classical RNNs solve these tasks easily, but their nonlinear recurrences cannot be computed exactly in parallel. Linear RNNs can be parallelised, but they lack the expressive power needed for state-tracking.
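A toy version of why the linear case parallelises (my sketch; numpy runs it serially, but the combine step below is associative, so real implementations evaluate the prefix as an O(log T)-depth parallel scan):

```python
# h_t = a_t * h_{t-1} + b_t is a composition of affine maps, and composition of affine maps
# is associative, so the running composition can be computed with a parallel prefix scan.
import numpy as np

rng = np.random.default_rng(0)
T = 8
a = rng.uniform(0.5, 1.0, size=T)
b = rng.normal(size=T)

def combine(f, g):
    """Compose h -> a1*h + b1 followed by h -> a2*h + b2 (the associative operator)."""
    (a1, b1), (a2, b2) = f, g
    return (a2 * a1, a2 * b1 + b2)

h, reference = 0.0, []
for t in range(T):                      # the nonlinear-RNN-style sequential loop
    h = a[t] * h + b[t]
    reference.append(h)

acc, scanned = (1.0, 0.0), []           # identity affine map
for t in range(T):                      # a tree of combines would give the parallel version
    acc = combine(acc, (a[t], b[t]))
    scanned.append(acc[1])              # composed map applied to h_0 = 0

print(np.allclose(reference, scanned))  # True
```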
2/ Parallel-in-time architectures such as Transformers have enabled sequence models to scale to billions of parameters, but empirically they struggle on state-tracking tasks like modular arithmetic and permutation composition.
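For anyone unfamiliar with the task, here is a toy version of permutation composition as state-tracking (my setup, not any specific benchmark's format): the model sees a stream of permutations and must output their running composition at every step, which means carrying the full group element through time.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 6
stream = [rng.permutation(n) for _ in range(T)]   # input: one permutation per step

state = np.arange(n)                   # identity permutation
targets = []
for p in stream:
    state = p[state]                   # compose the new permutation with the running state
    targets.append(state.copy())       # target at this step: the composition so far

for p, tgt in zip(stream, targets):
    print(p, "->", tgt)
```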
One nice part of writing up a thesis is getting to step back and see the combined impact of the last four years of work
Ran a simple check. Standard LSTM vs diagonal state-transition LSTM trained on a regular language (cycle navigation) and evaluated on length generalisation. Removing hidden-state mixing dropped validation accuracy from 100% to ~40%.
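The tweet doesn't spell out the ablation, so the sketch below is one plausible way to set it up rather than the exact experiment: an LSTM cell whose hidden-to-hidden maps are either dense matrices (standard) or per-channel scalars (no hidden-state mixing), plus a toy cycle-navigation generator. Training loop omitted.

```python
import torch
import torch.nn as nn

class AblatableLSTMCell(nn.Module):
    """LSTM cell with either dense (mixing) or diagonal (non-mixing) recurrent weights."""

    def __init__(self, d_in, d_h, diagonal=False):
        super().__init__()
        self.diagonal = diagonal
        self.W_x = nn.Linear(d_in, 4 * d_h)                       # input-to-gates map
        if diagonal:
            self.w_h = nn.Parameter(0.1 * torch.randn(4 * d_h))   # per-channel recurrence
        else:
            self.W_h = nn.Linear(d_h, 4 * d_h, bias=False)        # dense recurrence

    def forward(self, x, state):
        h, c = state
        rec = self.w_h * h.repeat(1, 4) if self.diagonal else self.W_h(h)
        i, f, g, o = (self.W_x(x) + rec).chunk(4, dim=-1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

def cycle_batch(batch, length, n=5):
    """Cycle navigation: moves in {-1, 0, +1}; label = position on the n-cycle after each move."""
    moves = torch.randint(-1, 2, (batch, length))
    positions = torch.cumsum(moves, dim=1) % n
    return nn.functional.one_hot(moves + 1, num_classes=3).float(), positions
```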
Would have been interesting to see some comparisons with a normal LSTM on state-tracking benchmarks to understand the impact of using diagonal matrices. Paper link:
Interesting paper on parallelising non-linear RNNs using a parallel Newton solve. However, to make it feasible they used diagonal state-transition matrices, preventing any hidden state mixing. Feels likely this negates the expressivity gains of using non-linearities.
Thrilled to share that our follow-up paper on permutation-equivariant Graph Neural CDEs will be presented at NeurIPS 2025 🎉 Adding permutation equivariance gives strong empirical performance with significantly fewer parameters.
🎉 Excited to share our new TPAMI paper: “Learning Dynamic Graph Embeddings with NCDEs”. We introduce Graph NCDEs, a continuous-time model for dynamic graphs. Unlike models that combine a GNN with a time-series model, we directly model the evolving graph dynamics. Read the paper 👇