David Selby @davidselby.bsky.social @TeaStats X Profile

David Selby @davidselby.bsky.social

@TeaStats

Followers

405

Following

363

Media

140

Statuses

463

Enthusiastic about tea, statistics and t-statistics. Researcher in Data Science & its Applications @DFKI, honorary @CfE_UoM @PARADISE_AI. #Rstats evangelist

Kaiserslautern 🇩🇪

Joined June 2014


David Selby @davidselby.bsky.social

@TeaStats

8 months

I am now on Bluesky

0

David Selby @davidselby.bsky.social

@TeaStats

12 days

🧬BioDisco, an open-source biomedical hypothesis generator, uses agentic LLMs, knowledge graphs and literature search, with an iterative self-evaluation loop, significantly outperforming other architectures. Preprint:

arxiv.org

Identifying novel hypotheses is essential to scientific research, yet this process risks being overwhelmed by the sheer volume and complexity of available information. Existing automated methods...

0

1

David Selby @davidselby.bsky.social

@TeaStats

19 days

New: unofficial @quarto_pub template for the upcoming @RealAAAI 2026 conference. Write your submission in Markdown with embedded computations!

0

David Selby @davidselby.bsky.social

@TeaStats

25 days

RT @FrontComputSci: New Research: Visible neural networks for multi-omics integration: a critical review #Frontiers…

frontiersin.org

Background: Biomarker discovery and drug response prediction are central to personalized medicine, driving demand for predictive models that also offer biologi...

0

1

0

David Selby @davidselby.bsky.social

@TeaStats

27 days

XAI, AutoML and perverse publishing incentives could create a perfect storm for "X-hacking": In this paper presented at @icmlconf, we describe a new threat to reproducible research and trustworthy AI:

dfki.de

At ICML 2025, DFKI researchers show how AutoML can generate misleading AI explanations - and propose new standards for trustworthy AI.

0

David Selby @davidselby.bsky.social

@TeaStats

27 days

❓What is a "Visible Neural Network"? A new deep learning model for omics, where prior knowledge and interpretability are baked right into the architecture. 🎯 We review dozens of models, datasets & applications, and call for better tools/benchmarks:

frontiersin.org

Background: Biomarker discovery and drug response prediction are central to personalized medicine, driving demand for predictive models that also offer biologi...

0

2

1

David Selby @davidselby.bsky.social

@TeaStats

5 months

RT @strnr: Beyond the black box with biologically informed neural networks (read free: 🧬🖥️…

0

26

0

David Selby @davidselby.bsky.social

@TeaStats

5 months

RT @cwcyau: Health Research From Home Hackathon 2025 | This hackathon is being held by Health Research From Home Partnership led by the @Of…

health-research-from-home.github.io

7-9 May 2025

0

2

0

David Selby @davidselby.bsky.social

@TeaStats

5 months

Lay abstract for our latest article on retrieving quantitative expert knowledge from LLMs

statisticsviews.com

The lay abstract featured today (for Had Enough of Experts? Quantitative Knowledge Retrieval From Large Language Models by David Selby, Yuichiro Iwashita, Kai Spriestersbach, Mohammad Saad, Dennis...

0

David Selby @davidselby.bsky.social

@TeaStats

5 months

Just published! Can LLMs, having read so much scientific literature, play the role of a human expert and help us fill in missing values and fit statistical models to small data sets? We investigate:

onlinelibrary.wiley.com

Large language models (LLMs) have been extensively studied for their ability to generate convincing natural language sequences; however, their utility for quantitative information retrieval is less...

0

1

2

David Selby @davidselby.bsky.social

@TeaStats

5 months

New blog post: on learning new English words and meanings in Germany

selbydavid.com

At the railway station, a lost-looking US soldier asked me if I spoke English. Do I? At times it feels like it, but the Germans keep me guessing. Since moving to Germany, I have been continually...

0

David Selby @davidselby.bsky.social

@TeaStats

5 months

New blog post: Alternatives to @overleaf for collaborative and reproducible writing by combining @code with @quarto_pub or #Rstats markdown.

selbydavid.com

Overleaf, formerly known as Share$\LaTeX$, is the go-to collaborative document editor for many researchers, who have taken advantage of its free tier. It’s a web-based editor that compiles $\LaTeX$...

0

David Selby @davidselby.bsky.social

@TeaStats

6 months

Thrilled to share our latest publication in @NatureRevGenet! We explore how deep learning models infused with prior pathway knowledge — aka 'visible neural networks' — promise better predictive accuracy & interpretability in multi-omics data analysis.

nature.com

Nature Reviews Genetics - Biologically informed neural networks promise to lead to more explainable, data-driven discoveries in genomics, drug development and precision medicine. Selby et al....

0

18

36

David Selby @davidselby.bsky.social

@TeaStats

7 months

Friends don't let friends make dynamite plunger plots.

Samuel Müller

@SamuelMullr

7 months

This might be the first time after 10 years that boosted trees are not the best default choice when working with data in tables. Instead a pre-trained neural network is, the new TabPFN, as we just published in Nature 🎉

1

0

1

David Selby @davidselby.bsky.social

@TeaStats

8 months

Pleased to present our poster at #NeurIPS2024 workshop on Bayesian Decision-making and Uncertainty! 🎉 Our work explores using large language models for eliciting expert-informed Bayesian priors. Elicited lots of discussion with the community too! Check it out:

0

1

5

David Selby @davidselby.bsky.social

@TeaStats

8 months

New preprint on Visible Neural Networks, a way of integrating prior biological knowledge into machine learning models of omics data to improve interpretability. We review >80 papers to explain what VNNs are, how you build them and how they are evaluated.