Po-Yu Lin Profile
Po-Yu Lin

@PoYu_Lin_NCKUH

Followers: 7
Following: 1
Media: 10
Statuses: 19

Joined September 2025
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
Huge thanks to @BrandesNadav for his guidance!
0
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
We expect P-KNN will help improve diagnostic yield in rare disease genetics and make variant interpretation more robust, flexible, and up to date.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
In summary, P-KNN: 🔹 Flexibly uses any combination of tools - no pre-commitment required 🔹 Delivers stronger, better-calibrated evidence than a calibrated meta-predictor 🔹 Fully compatible with the ACMG/AMP Bayesian framework
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
Computational tools and deep mutational scans both provide pathogenicity scores. ⚠️ Guidelines treat them as separate evidence and add their log likelihood ratios, but they’re not independent, risking miscalibration. ✅ P-KNN integrates them into one well-calibrated probability.
1
0
0
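A minimal sketch of why adding log likelihood ratios from dependent sources can overstate evidence. The prior of 0.1, the likelihood ratio of 4, and the helper name `posterior_from_llrs` are assumptions chosen for illustration; they are not taken from the thread or from P-KNN.

```python
import math


def posterior_from_llrs(prior, log_likelihood_ratios):
    """Posterior probability after adding log likelihood ratios to the prior log odds.

    Adding the ratios is only valid if the evidence sources are
    conditionally independent given the true class.
    """
    prior_log_odds = math.log(prior / (1.0 - prior))
    posterior_log_odds = prior_log_odds + sum(log_likelihood_ratios)
    return 1.0 / (1.0 + math.exp(-posterior_log_odds))


# Hypothetical numbers: a prior of 0.1 and a likelihood ratio of 4 per source.
prior = 0.1
llr = math.log(4.0)

# One source on its own.
one_source = posterior_from_llrs(prior, [llr])

# Two sources treated as independent: their log likelihood ratios are added.
two_sources_added = posterior_from_llrs(prior, [llr, llr])

print(round(one_source, 4), round(two_sources_added, 4))
```

If the second source carried exactly the same information as the first, the justified posterior would stay at the one-source value (4/13, about 0.31), while the additive rule reports 0.64. Real tools fall somewhere between fully redundant and fully independent, which is the miscalibration risk the post refers to.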
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
It’s striking how such a simple idea can be so effective - P-KNN consistently turns diverse tool outputs into stronger, well-calibrated evidence.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
P-KNN is very flexible: it can work with any set of tools, and it keeps improving as new tools become available. Our latest combination even outperforms AlphaMissense in evidence strength.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
Meta-predictors (e.g., REVEL, BayesDel) also integrate tools, but still need separate calibration. P-KNN combines integration and joint calibration in one step, generating stronger evidence and better calibration.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
This simple method works surprisingly well, delivering: 1. Stronger evidence, and 2. More accurate, reliable calibration than single-tool approaches.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
To overcome these limits, we developed Pathogenicity K-Nearest Neighbors (P-KNN). P-KNN jointly calibrates any set of tools into a single pathogenicity probability. It maps scores into a multi-dimensional space and asks: among nearby variants, what fraction are pathogenic?
1
0
0
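A minimal sketch of the core idea in the post above: place each variant at the point given by its tool scores and report the fraction of pathogenic variants among its nearest labeled neighbors. This is not the authors' implementation; the function name `knn_pathogenic_fraction`, the Euclidean distance, the choice of `k`, and the toy scores are all assumptions, and the published method may scale scores, adjust for the prior, or estimate uncertainty differently.

```python
import math


def knn_pathogenic_fraction(query, calibration_scores, calibration_labels, k=3):
    """Fraction of pathogenic variants among the k nearest calibration variants.

    query: tuple of tool scores for the variant being assessed.
    calibration_scores: list of score tuples for labeled variants.
    calibration_labels: 1 for pathogenic, 0 for benign, in the same order.
    """
    distances = [
        (math.dist(query, scores), label)
        for scores, label in zip(calibration_scores, calibration_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest = distances[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Toy calibration set with scores from two hypothetical tools.
calibration_scores = [
    (0.90, 0.80), (0.85, 0.90), (0.80, 0.70),
    (0.20, 0.10), (0.10, 0.30), (0.30, 0.20),
]
calibration_labels = [1, 1, 0, 0, 0, 1]

high_query = knn_pathogenic_fraction((0.88, 0.85), calibration_scores, calibration_labels)
low_query = knn_pathogenic_fraction((0.20, 0.20), calibration_scores, calibration_labels)
print(high_query, low_query)
```

Because the neighborhood is defined over all tool scores at once, adding a tool only adds a dimension to the space; nothing in the procedure requires committing to a single predictor in advance.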
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
To keep probabilities calibrated, clinicians must pre-commit to one predictor and cannot switch. This creates two issues: 1. Tool choice is unclear: each tool excels in different cases, and there is no clear guidance. 2. Pre-commitment blocks future insights: switching is not allowed during re-analysis.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
The existing framework allows calibration of only one tool at a time. Using a calibration dataset with labeled pathogenic and benign variants, each prediction score is calibrated into the proportion of pathogenic variants among the variants with similar scores.
1
0
0
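The single-tool calibration described above can be sketched as a local proportion. The fixed score window, the helper name `local_pathogenic_fraction`, and the toy calibration set are assumptions for illustration only; the existing framework may define "similar scores" differently.

```python
def local_pathogenic_fraction(score, calibration, window=0.1):
    """Proportion of pathogenic variants among calibration variants with similar scores.

    calibration: list of (score, label) pairs, label 1 = pathogenic, 0 = benign.
    window: half-width of the score interval treated as "similar" (an assumption).
    Returns None when no calibration variant falls inside the window.
    """
    nearby = [label for s, label in calibration if abs(s - score) <= window]
    if not nearby:
        return None
    return sum(nearby) / len(nearby)


# Toy calibration set for one hypothetical tool.
calibration = [
    (0.05, 0), (0.10, 0), (0.15, 0),
    (0.45, 0), (0.50, 1), (0.55, 0),
    (0.85, 1), (0.90, 1), (0.95, 0),
]

low = local_pathogenic_fraction(0.10, calibration)
mid = local_pathogenic_fraction(0.50, calibration)
high = local_pathogenic_fraction(0.90, calibration)
print(low, mid, high)
```

Here the raw score only serves to pick the neighborhood; the calibrated output is the observed pathogenic fraction inside it, which is why the calibration applies to one tool at a time.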
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
Clinical guidelines require variant pathogenicity predictions to be calibrated into probabilities.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
2 months
Machine-learning predictions are widely used to classify genetic variants as pathogenic. A key obstacle is the clinical guideline requiring pre-commitment to one tool. Our approach lifts this restriction, enabling use of multiple tools with complementary strengths.
1
0
5
@PoYu_Lin_NCKUH
Po-Yu Lin
3 months
So when the two groups are merged, almost all pathogenic variants are splice and almost all benign variants are 5′ UTR. You end up with almost perfect separation, just because the model knows to assign more damaging predictions to splice variants.
0
0
0
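The effect described in this thread can be reproduced with a handful of made-up numbers. In the sketch below, every score is invented for illustration (none are Evo2 outputs or benchmark data), and `auroc` is a hypothetical helper that computes AUROC as the fraction of pathogenic/benign pairs ranked correctly.

```python
def auroc(pathogenic_scores, benign_scores):
    """AUROC as the fraction of (pathogenic, benign) pairs in which the
    pathogenic variant gets the higher score; ties count as half."""
    wins = 0.0
    for p in pathogenic_scores:
        for b in benign_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(pathogenic_scores) * len(benign_scores))


# Invented scores, higher = more damaging.
# Splice variants: high scores overall, mostly pathogenic.
splice_pathogenic = [0.70, 0.80, 0.90, 0.75]
splice_benign = [0.78]

# 5' UTR variants: low scores overall, mostly benign.
utr_pathogenic = [0.22]
utr_benign = [0.10, 0.20, 0.30, 0.25]

splice_auc = auroc(splice_pathogenic, splice_benign)
utr_auc = auroc(utr_pathogenic, utr_benign)
pooled_auc = auroc(splice_pathogenic + utr_pathogenic, splice_benign + utr_benign)

print(splice_auc, utr_auc, pooled_auc)
```

Within each type the toy model is no better than chance (0.5), yet the pooled AUROC is 0.8, because most pooled pairs compare a high-scoring splice pathogenic variant with a low-scoring 5′ UTR benign variant.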
@PoYu_Lin_NCKUH
Po-Yu Lin
3 months
It’s basically Simpson's paradox. To illustrate what’s happening, let’s look at Evo2 for splice & 5′ UTR variants. Neither group shows good separation between pathogenic & benign variants, but splice variants get more damaging predictions & are much more likely to be pathogenic.
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
3 months
Measuring model performance by variant type reveals a big anomaly. Evo2, for example, scores AUROC=0.975 on noncoding variants, but much lower on all specific types (e.g. 0.697 for splice, 0.903 for intron, 0.767 for 5′ UTR). Other models show a similar pattern. What's going on?
1
0
0
@PoYu_Lin_NCKUH
Po-Yu Lin
3 months
We created a benchmark of ~250,000 pathogenic & benign variants. Unlike previous benchmarks, we evaluated performance by variant type. We broke down broad categories like ‘noncoding variants’ into specific annotations like intron, 3′ UTR and RNA gene.
1
0
0