Po-Yu Lin @PoYu_Lin_NCKUH X Profile

Po-Yu Lin

@PoYu_Lin_NCKUH

Followers

7

Following

1

Media

10

Statuses

19

Joined September 2025


Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

Huge thanks to @BrandesNadav for his guidance!

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

We expect P-KNN will help improve diagnostic yield in rare disease genetics and make variant interpretation more robust, flexible, and up to date.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

P-KNN is available here: 💻 Command line tool: https://t.co/xx1lkMWGTR 📊 Precomputed scores for all missense variants in the human genome: https://t.co/aa6MLJQyUY Preprint:

biorxiv.org

Clinical guidelines for Mendelian disease diagnosis require that outputs from variant pathogenicity prediction tools be converted into well-calibrated probabilities. However, the existing calibration...

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

In summary, P-KNN 🔹 Flexibly uses any combination of tools - no pre-commitment required 🔹 Delivers stronger, better-calibrated evidence than a calibrated meta-predictor 🔹 Fully compatible with the ACMG/AMP Bayesian framework

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

Computational tools and deep mutational scans both provide pathogenicity scores. ⚠️ Guidelines treat them as separate evidence and add their log likelihood ratios, but they’re not independent, risking miscalibration. ✅ P‑KNN integrates them into one well‑calibrated probability.

1

0
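A small worked example of why adding log likelihood ratios from non-independent sources can miscalibrate. This is only an illustrative sketch: the prior of 0.1 and the likelihood ratio of 10 are assumed numbers, and `posterior_from_llrs` is a hypothetical helper, not part of P-KNN. It takes the extreme case where the second evidence source is an exact copy of the first, so it carries no new information.

```python
import math

def posterior_from_llrs(prior, llrs):
    """Bayes update on the log-odds scale: prior log-odds plus the summed
    log likelihood ratios (natural log). Summing is only valid when the
    evidence sources are conditionally independent."""
    log_odds = math.log(prior / (1.0 - prior)) + sum(llrs)
    return 1.0 / (1.0 + math.exp(-log_odds))

prior = 0.1            # assumed prior probability of pathogenicity
llr = math.log(10.0)   # assumed evidence from one tool: likelihood ratio of 10

# Correct update using the one real piece of evidence.
single = posterior_from_llrs(prior, [llr])

# Naive update that treats a duplicate of the same evidence as independent.
double_counted = posterior_from_llrs(prior, [llr, llr])

print(f"one source:             {single:.3f}")
print(f"duplicate counted twice: {double_counted:.3f}")
```

With these assumed numbers the posterior moves from 10/19 (about 0.53) to 100/109 (about 0.92) even though no new information was added. Real tools are only partially correlated, so the inflation would be smaller, but it points in the same direction.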

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

It’s striking how such a simple idea can be so effective - P-KNN consistently turns diverse tool outputs into stronger, well-calibrated evidence.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

P-KNN is very flexible: it can work with any set of tools, and it keeps improving as new tools become available. Our latest combination even outperforms AlphaMissense in evidence strength.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

Meta‑predictors (e.g., REVEL, BayesDel) also integrate tools, but still need separate calibration. P‑KNN combines integration and joint calibration in one step, generating stronger evidence and better calibration.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

This simple method works surprisingly well, delivering: 1. Stronger evidence, and 2. More accurate, reliable calibration than single-tool approaches.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

To overcome these limits, we developed Pathogenicity K-Nearest Neighbors (P-KNN). P-KNN jointly calibrates any set of tools into a single pathogenicity probability. It maps scores into a multi-dimensional space and asks: among nearby variants, what fraction are pathogenic?

1

0
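The nearest-neighbor idea described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the released P-KNN tool: the function name `knn_pathogenic_fraction`, the two-tool scores, the labels, and `k=3` are all made up for the example, and the sketch ignores anything the real method may do beyond the raw neighbor fraction (for example score normalization or handling of the prior).

```python
import math

def knn_pathogenic_fraction(query, calib_points, calib_labels, k=3):
    """Fraction of labeled-pathogenic variants among the k calibration
    variants closest (Euclidean distance) to the query in score space."""
    if k > len(calib_points):
        raise ValueError("k cannot exceed the number of calibration variants")
    order = sorted(range(len(calib_points)),
                   key=lambda i: math.dist(query, calib_points[i]))
    nearest = order[:k]
    return sum(calib_labels[i] for i in nearest) / k

# Hypothetical calibration set: each point is (tool A score, tool B score),
# each label is 1 for pathogenic and 0 for benign.
calib_points = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.30),
                (0.80, 0.90), (0.90, 0.80), (0.85, 0.70), (0.50, 0.50)]
calib_labels = [0, 0, 0, 1, 1, 1, 1]

p_low = knn_pathogenic_fraction((0.12, 0.18), calib_points, calib_labels)
p_high = knn_pathogenic_fraction((0.88, 0.82), calib_points, calib_labels)
p_mid = knn_pathogenic_fraction((0.40, 0.40), calib_points, calib_labels)
print(p_low, p_high, p_mid)
```

Because the neighbors are found in the joint score space, adding or swapping a tool only changes the dimensionality of the points, which is one way to read the claim that any set of tools can be used.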

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

To keep probabilities calibrated, clinicians must pre-commit to a single predictor and cannot switch afterward. This creates two issues: 1. Tool choice is unclear: each tool excels in different cases, and there is no clear guidance on which to pick. 2. Pre-commitment blocks future insights: switching tools during re-analysis is not allowed.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

The existing framework allows calibration of only one tool at a time. Using a calibration dataset with labeled pathogenic and benign variants, each prediction score is calibrated into the proportion of pathogenic variants among the variants with similar scores.

1

0
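The single-tool calibration described above can be illustrated with a minimal sketch. Everything here is assumed for illustration: the helper `local_pathogenic_fraction`, the window width of 0.1, and the toy scores and labels. It returns the raw fraction in the calibration set only. A real calibration procedure would also have to account for the prior probability of pathogenicity and for sampling uncertainty.

```python
def local_pathogenic_fraction(query_score, calib_scores, calib_labels, window=0.1):
    """Fraction of labeled-pathogenic variants among calibration variants
    whose score lies within `window` of the query score.
    Returns None when no calibration variant is close enough."""
    nearby = [label for score, label in zip(calib_scores, calib_labels)
              if abs(score - query_score) <= window]
    if not nearby:
        return None
    return sum(nearby) / len(nearby)

# Hypothetical calibration data for one tool: 1 = pathogenic, 0 = benign.
calib_scores = [0.05, 0.10, 0.20, 0.45, 0.50, 0.55, 0.85, 0.90, 0.95]
calib_labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]

low = local_pathogenic_fraction(0.12, calib_scores, calib_labels)
mid = local_pathogenic_fraction(0.50, calib_scores, calib_labels)
high = local_pathogenic_fraction(0.90, calib_scores, calib_labels)
print(low, mid, high)
```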

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

Clinical guidelines require variant pathogenicity predictions to be calibrated into probabilities.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

2 months

Machine-learning predictions are widely used to classify genetic variants as pathogenic. A key obstacle is the clinical guideline requiring pre-commitment to one tool. Our approach lifts this restriction, enabling use of multiple tools with complementary strengths.

1

0

5

Po-Yu Lin

@PoYu_Lin_NCKUH

3 months

So when the two groups are merged, almost all pathogenic variants are splice and almost all benign variants are 5’UTR. You end up with almost perfect separation, just because the model knows to assign more damaging predictions to splice variants.

0
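The effect described in this thread can be reproduced with synthetic numbers. The data below is invented purely for illustration and is not taken from the benchmark or from Evo2. Within each variant type the scores carry no information about pathogenicity, but splice variants get higher scores and are mostly pathogenic, while 5' UTR variants get lower scores and are mostly benign. `auroc` is a plain pairwise implementation written for this sketch.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random pathogenic variant scores
    higher than a random benign one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic splice variants: high scores, 8 pathogenic and 2 benign.
splice_scores = [0.8] * 4 + [0.9] * 4 + [0.8, 0.9]
splice_labels = [1] * 8 + [0, 0]

# Synthetic 5' UTR variants: low scores, 8 benign and 2 pathogenic.
utr_scores = [0.1] * 4 + [0.2] * 4 + [0.1, 0.2]
utr_labels = [0] * 8 + [1, 1]

auc_splice = auroc(splice_scores, splice_labels)
auc_utr = auroc(utr_scores, utr_labels)
auc_merged = auroc(splice_scores + utr_scores, splice_labels + utr_labels)

print(f"splice: {auc_splice:.2f}  5' UTR: {auc_utr:.2f}  merged: {auc_merged:.2f}")
```

In this toy setup each group scores 0.50 on its own and the merged set scores 0.80, so the pooled number reflects the difference between variant types rather than any ability to separate pathogenic from benign variants within a type.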

Po-Yu Lin

@PoYu_Lin_NCKUH

3 months

It’s basically Simpson's paradox. To illustrate what’s happening, let’s look at Evo2 for splice & 5’UTR variants. Neither group shows good separation between pathogenic & benign variants, but splice variants get more damaging predictions & are much more likely to be pathogenic.

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

3 months

Measuring model performance by variant type reveals a big anomaly. Evo2, for example, scores AUROC=0.975 on noncoding variants, but much lower on all specific types (e.g. 0.697 for splice, 0.903 for intron, 0.767 for 5′ UTR). Other models show a similar pattern. What's going on?

1

0

Po-Yu Lin

@PoYu_Lin_NCKUH

3 months

We created a benchmark of ~250,000 pathogenic & benign variants. Unlike previous benchmarks, we evaluated performance by variant type. We broke down broad categories like ‘noncoding variants’ into specific annotations like intron, 3′ UTR and RNA gene.

1

0