Statistics Papers
@StatsPapers
Followers: 7K
Following: 0
Media: 0
Statuses: 57K
New Statistics submissions to https://t.co/HHqPequzVU (not affiliated with https://t.co/HHqPequzVU)
Worldwide
Joined April 2010
Towards a pretrained deep learning estimator of the Linfoot informational correlation.
arxiv.org
We develop a supervised deep-learning approach to estimate mutual information between two continuous random variables. As labels, we use the Linfoot informational correlation, a transformation of...
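For readers unfamiliar with the label used here, the following is a minimal sketch of the standard textbook definition of the Linfoot informational correlation, r = sqrt(1 - exp(-2 I)), where I is the mutual information in nats. The function names are hypothetical and nothing below is taken from the paper itself, whose abstract is truncated above. The bivariate Gaussian case is included only as a sanity check, since there r reduces to the absolute Pearson correlation.

```python
import math


def linfoot_correlation(mi_nats: float) -> float:
    """Map mutual information (in nats) to the Linfoot informational correlation.

    Standard definition: r = sqrt(1 - exp(-2 * I)), so r lies in [0, 1).
    """
    if mi_nats < 0:
        raise ValueError("mutual information must be non-negative")
    return math.sqrt(1.0 - math.exp(-2.0 * mi_nats))


def gaussian_mutual_information(rho: float) -> float:
    """Mutual information (nats) of a bivariate Gaussian with correlation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return -0.5 * math.log(1.0 - rho * rho)


# Sanity check: for a bivariate Gaussian the Linfoot correlation equals |rho|.
rho = 0.6
recovered = linfoot_correlation(gaussian_mutual_information(rho))
```

Because the transformation is bounded, it gives a target on a fixed scale, whereas mutual information itself is unbounded.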
Hellinger loss function for Generative Adversarial Networks.
arxiv.org
We propose Hellinger-type loss functions for training Generative Adversarial Networks (GANs), motivated by the boundedness, symmetry, and robustness properties of the Hellinger distance. We define...
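The boundedness and symmetry mentioned above can be illustrated with a short sketch of the Hellinger distance between two discrete probability vectors. This is only the classical distance, under the assumption of finite distributions given as lists; it is not the GAN loss proposed in the paper, and the function name is hypothetical.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions of equal length.

    H(p, q) = sqrt( 0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2 ), which lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    if any(x < 0 for x in p) or any(x < 0 for x in q):
        raise ValueError("probabilities must be non-negative")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(0.5 * squared)


p = [0.5, 0.3, 0.2]
q = [0.1, 0.1, 0.8]

d_pq = hellinger_distance(p, q)   # some value strictly between 0 and 1
d_qp = hellinger_distance(q, p)   # same value: the distance is symmetric
d_max = hellinger_distance([1.0, 0.0], [0.0, 1.0])  # disjoint supports give the maximum, 1
```

Unlike the Kullback-Leibler divergence, the distance stays finite when the supports do not overlap.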
Interval Fisher's Discriminant Analysis and Visualisation.
arxiv.org
In Data Science, entities are typically represented by single valued measurements. Symbolic Data Analysis extends this framework to more complex structures, such as intervals and histograms, that...
Safely Learning Controlled Stochastic Dynamics.
arxiv.org
We address the problem of safely learning controlled stochastic dynamics from discrete-time trajectory observations, ensuring system trajectories remain within predefined safe regions during both...
Softmax as Linear Attention in the Large-Prompt Regime: a Measure-based Perspective.
arxiv.org
Softmax attention is a central component of transformer architectures, yet its nonlinear structure poses significant challenges for theoretical analysis. We develop a unified, measure-based...
Autotune: fast, accurate, and automatic tuning parameter selection for Lasso.
arxiv.org
Least absolute shrinkage and selection operator (Lasso), a popular method for high-dimensional regression, is now used widely for estimating high-dimensional time series models such as the vector...
Contrastive Time Series Forecasting with Anomalies.
arxiv.org
Time series forecasting predicts future values from past data. In real-world settings, some anomalous events have lasting effects and influence the forecast, while others are short-lived and...
Hyperbolic Gaussian Blurring Mean Shift: A Statistical Mode-Seeking Framework for Clustering in Curved Spaces.
arxiv.org
Clustering is a fundamental unsupervised learning task for uncovering patterns in data. While Gaussian Blurring Mean Shift (GBMS) has proven effective for identifying arbitrarily shaped clusters...
Causal Judge Evaluation: Calibrated Surrogate Metrics for LLM Systems.
arxiv.org
LLM-as-judge evaluation has become the de facto standard for scaling model assessment, but the practice is statistically unsound: uncalibrated scores can invert preferences, naive confidence...
Conditional Coverage Diagnostics for Conformal Prediction.
arxiv.org
Evaluating conditional coverage remains one of the most persistent challenges in assessing the reliability of predictive systems. Although conformal methods can give guarantees on marginal...
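As background for the distinction drawn here, the sketch below shows textbook split conformal prediction on a toy regression problem and checks its empirical coverage averaged over test points. Everything in it is an assumption made for illustration (the synthetic data, the identity predictor, the function names); it does not implement the diagnostics proposed in the paper.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score.

    If that rank exceeds n, the interval must be infinite to keep the guarantee.
    """
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf
    return sorted(scores)[rank - 1]


def sample(n, rng):
    """Toy data: y = x + Gaussian noise, with x uniform on [0, 1]."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 1.0)
        data.append((x, x + rng.gauss(0.0, 0.1)))
    return data


rng = random.Random(0)
calibration = sample(500, rng)
test_points = sample(2000, rng)

# Hypothetical fixed predictor f(x) = x; scores are absolute residuals.
scores = [abs(y - x) for x, y in calibration]
q_hat = conformal_quantile(scores, alpha=0.1)

# Fraction of test points whose interval [x - q_hat, x + q_hat] contains y.
coverage = sum(abs(y - x) <= q_hat for x, y in test_points) / len(test_points)
```

The check above averages over all test points, so it says nothing about whether coverage holds within particular regions of the input space.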
Data-Driven Model Reduction using WeldNet: Windowed Encoders for Learning Dynamics.
arxiv.org
Many problems in science and engineering involve time-dependent, high dimensional datasets arising from complex physical processes, which are costly to simulate. In this work, we propose WeldNet:...
TPV: Parameter Perturbations Through the Lens of Test Prediction Variance.
arxiv.org
We identify test prediction variance (TPV) -- the first-order sensitivity of model outputs to parameter perturbations around a trained solution -- as a unifying quantity that links several...
Provable Recovery of Locally Important Signed Features and Interactions from Random Forest.
arxiv.org
Feature and Interaction Importance (FII) methods are essential in supervised learning for assessing the relevance of input variables and their interactions in complex prediction models. In many...
An Efficient Variant of One-Class SVM with Lifelong Online Learning Guarantees.
arxiv.org
We study outlier (a.k.a., anomaly) detection for single-pass non-stationary streaming data. In the well-studied offline or batch outlier detection problem, traditional methods such as kernel...
STARK denoises spatial transcriptomics images via adaptive regularization.
arxiv.org
We present an approach to denoising spatial transcriptomics images that is particularly effective for uncovering cell identities in the regime of ultra-low sequencing depths, and also allows for...
Sublinear Variational Optimization of Gaussian Mixture Models with Millions to Billions of Parameters.
arxiv.org
Gaussian Mixture Models (GMMs) range among the most frequently used models in machine learning. However, training large, general GMMs becomes computationally prohibitive for datasets that have...
Beyond Log-Concavity and Score Regularity: Improved Convergence Bounds for Score-Based Generative Models in W2-distance.
arxiv.org
Score-based Generative Models (SGMs) aim to sample from a target distribution by learning score functions using samples perturbed by Gaussian noise. Existing convergence bounds for SGMs in the...