malkin1729
@FelineAutomaton
Followers 172 · Following 34 · Media 6 · Statuses 38
Mathematician/informatician thinking probabilistically, expecting the same of you. ‘Tis categories in the mind and guns in their hands which keep us enslaved &🦋
Edinburgh, Scotland
Joined September 2024
New preprint: How can we use latent reasoning when initial model performance is low? We introduce LiteReason, a simple and lightweight framework that combines latent reasoning _with RL_ to reason efficiently both during and after training while retaining performance gains! 🧵
One of our three papers in “Frontiers in Probabilistic Inference” @ NeurIPS’25, along with https://t.co/vPc0AZpgRo and https://t.co/9LNwevDOWN. Pleasure to work with the brilliant @ktamogashev on all of them!
1/ Can we efficiently learn the destruction process of diffusion samplers? Can we learn not just the drift, but also the variance for all transition kernels? – We answer YES in our recent paper “Adaptive Destruction Processes for Diffusion Samplers” (Oral at the NeurIPS 2025 FPI workshop).
(1/n) The usual assumption in GFlowNet environments is acyclicity. Have you ever wondered if it can be relaxed? Does the existing GFlowNet theory translate to the non-acyclic case? Is efficient training possible? We shed new light on these questions in our latest work! @icmlconf
1/ 💻 Queer in AI is hosting a social at #ICML2025 in Vancouver on 📅 July 16, and you’re invited! Let’s network, enjoy food and drinks, and celebrate our community. Details below…
An oasis of inclusive science and solidarity amid the monotonically increasing NeurIPS madness, one that I'm proud to be supporting in a small role this year.
🏳️🌈 Queer in AI is thrilled to announce another season of our affinity workshop at #NeurIPS2025! We announce a Call for Contributions to the workshop, with visa-friendly submissions due by 📅 July 31, 2025, all other submissions due by 📅 August 14, 2025. #QueerInAI #CallForPapers
A great pleasure to crash two Bayesian statistics conferences with a dose of diffusion wisdom — last week in Singapore ( https://t.co/1i4ChFyQtb), now in Cambridge ( https://t.co/3ZC21zoNIR) — with the two authors of this very nice paper.
newton.ac.uk
This workshop focuses on leveraging modern machine learning to accelerate statistical inference, experimental design, and scientific discovery. It features...
🚨 New paper: “Towards Adaptive Self-Normalized IS” TL;DR: To estimate µ = E_p[f(θ)] when p(θ) has an intractable partition function, instead of doing MCMC on p(θ) or learning a parametric q(θ), we run MCMC directly on the variance-minimizing proposal, proportional to p(θ)|f(θ) − µ|. https://t.co/CuK1dSA98w
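For context, a minimal sketch of vanilla self-normalized IS with a hand-picked Gaussian proposal (all densities and names below are illustrative; the paper's adaptive scheme, which runs MCMC on the variance-minimizing proposal ∝ p(θ)|f(θ) − µ|, is not reproduced here):

```python
# Minimal self-normalized importance sampling (SNIS) sketch.
# Illustrative only: q is a hand-picked Gaussian proposal, not the
# adaptive variance-minimizing proposal from the paper.
import numpy as np

rng = np.random.default_rng(0)

def log_p_tilde(theta):
    # Unnormalized target log-density (partition function unknown): N(2, 1) up to a constant.
    return -0.5 * (theta - 2.0) ** 2

def f(theta):
    # Quantity of interest; E_p[f] = 5 for this toy target.
    return theta ** 2

# Draw from a broad Gaussian proposal q(theta) = N(0, 3^2).
theta = rng.normal(0.0, 3.0, size=10_000)
log_q = -0.5 * (theta / 3.0) ** 2 - np.log(3.0 * np.sqrt(2 * np.pi))

# Self-normalized weights: the unknown partition constant cancels in the ratio.
log_w = log_p_tilde(theta) - log_q
w = np.exp(log_w - log_w.max())
mu_hat = np.sum(w * f(theta)) / np.sum(w)
print(mu_hat)  # SNIS estimate of E_p[f(theta)]
```

The appeal of the optimal proposal is that the weights then concentrate where |f(θ) − µ| is large, which is exactly what minimizes the variance of the self-normalized estimate.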
Great paper by @siddarthv66, @mh_steps, et al. on amortised inference in latent spaces of generative models, generalising our past work ( https://t.co/QtWRZxUDBy). Useful for alignment, planning in latent space, inference in probabilistic programs?
arxiv.org
Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior...
Is there a universal strategy to turn any generative model—GANs, VAEs, diffusion models, or flows—into a conditional sampler, or to fine-tune it to optimize a reward function? Yes! Outsourced Diffusion Sampling (ODS), accepted to @icmlconf, does exactly that!
Ecstatic to show off some work my brilliant colleagues and I did at @iclr_conf this year! 🚀 We address the credit assignment challenge posed by long trajectories in RL and GFlowNets by constructing higher-order actions, or “chunks”, effectively compressing trajectory lengths!
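As a toy illustration of the chunking idea only (not the paper's algorithm), one can merge the most frequent pair of adjacent primitive actions into a macro-action, BPE-style, so trajectories shrink and credit has fewer steps to travel:

```python
# Toy "chunking" of action sequences, byte-pair-encoding style.
# Illustrative sketch only; all names are made up and this is not the paper's method.
from collections import Counter

def chunk_once(trajectories):
    """Merge the most frequent adjacent action pair into one macro-action."""
    pairs = Counter(
        (a, b) for traj in trajectories for a, b in zip(traj, traj[1:])
    )
    if not pairs:
        return trajectories, None
    (a, b), _ = pairs.most_common(1)[0]
    macro = f"({a}+{b})"
    merged = []
    for traj in trajectories:
        out, i = [], 0
        while i < len(traj):
            if i + 1 < len(traj) and (traj[i], traj[i + 1]) == (a, b):
                out.append(macro)  # replace the pair by the macro-action
                i += 2
            else:
                out.append(traj[i])
                i += 1
        merged.append(out)
    return merged, macro

trajs = [["up", "up", "right", "up", "up", "right"],
         ["up", "up", "right", "down"]]
trajs, macro = chunk_once(trajs)
print(macro, trajs)  # "(up+up)" and visibly shorter trajectories
```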
🚀 New Preprint! 🚀 In-Context Parametric Inference: Point or Distribution Estimators? Thrilled to share our work on inferring probabilistic model parameters explicitly conditioned on data, in collab with @Yoshua_Bengio, @FelineAutomaton & @g_lajoie_! 🔗 https://t.co/nF5spoihXN
arxiv.org
Bayesian and frequentist inference are two fundamental paradigms in statistical estimation. Bayesian methods treat hypotheses as random variables, incorporating priors and updating beliefs via...
My PhD thesis entitled "Generative Flow Networks: Theory and Applications to Structure Learning" is now available on arXiv 🎓 📖 https://t.co/9pAAfp8GEF 🔖 Want to learn what GFlowNets are? Check out Chapters 2, 3 & 4!
arxiv.org
Without any assumptions about data generation, multiple causal models may explain our observations equally well. To avoid selecting a single arbitrary model that could result in unsafe decisions...
This week I successfully defended my PhD! 🎓🎊 Many thanks to my committee @dhanya_sridhar @SimonLacosteJ @sirbayes, and a particularly huge thanks to my advisor @Yoshua_Bengio for his incredible support throughout my PhD.
Happy to share one of my latest works! If you are interested in diffusion samplers, please take a look🙃! Many thanks to all my colleagues for their intensive work and fruitful collaboration, especially to @FelineAutomaton for leading this project! Stay tuned for the future ones!
Happy to share our latest work on #diffusion models without data: building theoretical bridges between existing methods, analysing their continuous-time asymptotics, and showing some cool practical implications. https://t.co/uZtV9Hjbx8
#MachineLearning 1/9
This delightful collaboration built upon my past work with @MarcinSendera @jarridrb ( https://t.co/YfuPdi3bMK, https://t.co/QtWRZxVbr6) and that of the brilliant @julberner and @lorenz_richter ( https://t.co/Yhfm1mMep6, https://t.co/V5eC3hNGq7). Thanks to all! 9/9
Have a look at our code here. Multiple objectives, exploration strategies, and time discretisation strategies are implemented in a common framework. https://t.co/cKgFBz0zFA 8/9
We are eager to see extensions of this work to non-Markovian sequential generation, discrete state spaces, and posterior sampling under diffusion priors, as well as discretisation error and generalisation bounds. Numerical analysis and stochastic calculus are key tools here! 7/9
The fact that these objectives are well-behaved asymptotically justifies the use of coarser (perhaps non-uniform) time discretisations during training than during sampling. This leads to greatly improved sample efficiency and even allows the use of time-local objectives. 6/9
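A small sketch of the train/sample asymmetry described above; the grid shapes and sizes here are hypothetical, not the paper's settings:

```python
# Illustrative time grids only; the actual discretisations in the paper may differ.
import numpy as np

rng = np.random.default_rng(0)
T = 1.0

# Coarse, non-uniform training grid (quadratic spacing, denser near t = 0).
train_grid = T * np.linspace(0.0, 1.0, 17) ** 2   # 16 training steps

# Fine, uniform sampling grid.
sample_grid = np.linspace(0.0, T, 257)            # 256 sampling steps

# A "time-local" objective only needs one transition (t_k, t_{k+1}) at a time,
# so training can subsample single coarse steps instead of whole trajectories.
k = rng.integers(len(train_grid) - 1)
t_k, t_next = train_grid[k], train_grid[k + 1]
print(f"train on transition [{t_k:.3f}, {t_next:.3f}] out of {len(train_grid) - 1} steps; "
      f"sample with {len(sample_grid) - 1} uniform steps")
```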
But all of these objectives are defined on a time discretisation. What happens when we take the step size to zero? For each objective, we prove that the limit is a well-understood continuous-time object. For example, the detailed balance condition approaches the Fokker-Planck PDE... 5/9
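For reference, the continuous-time object invoked above: for an SDE dX_t = u(X_t, t) dt + σ dW_t, the marginals p_t satisfy the Fokker-Planck equation (written here with a constant diffusion coefficient for simplicity):

```latex
% Fokker--Planck equation for dX_t = u(X_t, t)\,dt + \sigma\,dW_t
\frac{\partial p_t(x)}{\partial t}
  = -\nabla \cdot \bigl(u(x, t)\, p_t(x)\bigr)
    + \frac{\sigma^2}{2}\, \Delta p_t(x)
```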
Such time-reversal can be enforced using a zoo of objectives computed through differentiable simulation (connected to stochastic control) or off-policy divergences (connected to entropy-regularised RL). We show connections among these objectives. 4/9
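The time-reversal in question is the classical reverse-time SDE result (Anderson, 1982); stated with a constant diffusion coefficient for simplicity, the reverse-time process shares the forward marginals p_t once its drift is corrected by the score:

```latex
% Forward SDE and its time reversal (constant \sigma for simplicity)
dX_t = u(X_t, t)\,dt + \sigma\,dW_t,
\qquad
d\bar{X}_t = \bigl[u(\bar{X}_t, t) - \sigma^2 \nabla_x \log p_t(\bar{X}_t)\bigr]\,dt + \sigma\,d\bar{W}_t
```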