Nadav Brandes
@BrandesNadav
896 Followers · 396 Following · 36 Media · 154 Statuses
#ComputationalBiology and #AI
New York, USA
Joined June 2016
We're a small and supportive center. Our director, Aravinda Chakravarti, is not only a renowned geneticist but also an incredible colleague who cares deeply about the success of each of the labs.
0 replies · 0 reposts · 0 likes
We’re hiring tenure-track faculty in the Center for Human Genetics and Genomics at NYU! If you have ambitious ideas for transforming human genetics, this is the place for you. https://t.co/8KjyUxmk5m
1 reply · 0 reposts · 0 likes
I finished reading “If Anyone Builds It, Everyone Dies: Why Superhuman AI Would Kill Us All”. I liked this book. It gives an intelligent explanation for why some people are extremely concerned about the rapid progress in AI. It makes the case that the ongoing efforts to build…
0 replies · 0 reposts · 6 likes
I'm really excited about this work from a great student in my lab. It addresses a big gap in the clinical implementation of variant effect predictions, providing well-calibrated pathogenicity probabilities from multiple predictions (without data-expensive meta-predictors).
Machine-learning predictions are widely used to classify genetic variants as pathogenic. A key obstacle is the clinical guideline requiring pre-commitment to one tool. Our approach lifts this restriction, enabling use of multiple tools with complementary strengths.
0 replies · 2 reposts · 16 likes
2 replies · 0 reposts · 21 likes
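(The tweets don't spell out the method, so purely for intuition, here is a minimal sketch of calibrating a single tool's raw scores into pathogenicity probabilities using isotonic regression from scikit-learn. The data are invented, and how the paper combines multiple tools is not shown here.)

```python
# Minimal sketch of calibrating one variant effect predictor's raw scores
# into pathogenicity probabilities. This is NOT the paper's method, just a
# generic illustration using isotonic regression on labeled variants.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Hypothetical data: raw scores from one tool and binary labels
# (1 = pathogenic, 0 = benign) for variants with known classifications.
scores = rng.normal(loc=0.0, scale=1.0, size=1000)
labels = (rng.random(1000) < 1 / (1 + np.exp(-scores))).astype(int)

# Fit a monotone map from raw score -> P(pathogenic).
calibrator = IsotonicRegression(out_of_bounds="clip")
calibrator.fit(scores, labels)

# Calibrated probabilities for new variants scored by the same tool.
new_scores = np.array([-2.0, 0.0, 2.0])
print(calibrator.predict(new_scores))
```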
To ensure that future progress is meaningful, we provide the full benchmark & code. Check out our preprint:
biorxiv.org: “Recent studies have reported unprecedented accuracy predicting pathogenic variants across the genome, including in noncoding regions, using large AI models trained on vast genomic data. We present a…”
3 replies · 8 reposts · 78 likes
Lesson: near-perfect prediction of pathogenic variants across ‘all variants’ is an illusion. Variant-type-specific evals are needed to know when models are actually good and useful.
1 reply · 3 reposts · 31 likes
Model comparison:
- GPN-MSA = most robust DNA model
- AlphaMissense = most robust protein model
- AlphaGenome & Evo2 = strong in some variant types, very unstable in others
No single model is best across the board.
1 reply · 3 reposts · 20 likes
Once you control for variant type, a clearer picture emerges. Reliable performance (AUROC>0.9) is achieved for missense, synonymous, non-splice intron, 3′ UTR & RNA gene variants. By contrast, stop-gain, start-loss, stop-loss, splice & 5′ UTR variants remain difficult.
2 replies · 3 reposts · 20 likes
To show how serious this is, we included a simple rule-based baseline that only uses variant type information (no sequences, no AI). It achieves AUROC=0.944 across noncoding variants. The reported numbers suddenly look much less impressive.
2 replies · 6 reposts · 63 likes
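(The flavor of such a baseline is easy to sketch. This is a generic version, not necessarily the paper's exact rule, and the file and column names are hypothetical: score every variant by the pathogenic fraction of its variant type in held-out training data.)

```python
# Sketch of a variant-type-only baseline: score each variant by the fraction
# of pathogenic variants among training variants of the same type.
# File and column names ("variant_type", "label") are hypothetical.
import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("benchmark_variants.csv")  # hypothetical benchmark table
train, test = train_test_split(
    df, test_size=0.5, stratify=df["label"], random_state=0
)

# Per-type pathogenic fraction, learned on the training split only.
type_prior = train.groupby("variant_type")["label"].mean()

# Each test variant's score is its type's prior; unseen types fall back
# to the global pathogenic rate.
scores = test["variant_type"].map(type_prior).fillna(train["label"].mean())
print("AUROC:", roc_auc_score(test["label"], scores))
```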
So when the two groups are merged, almost all pathogenic variants are splice and almost all benign variants are 5’UTR. You end up with almost perfect separation, just because the model knows to assign more damaging predictions to splice variants.
1 reply · 0 reposts · 20 likes
It’s basically Simpson's paradox. To illustrate what’s happening, let’s look at Evo2 for splice & 5’UTR variants. Neither group shows good separation between pathogenic & benign variants, but splice variants get more damaging predictions & are much more likely to be pathogenic.
1 reply · 1 repost · 16 likes
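(To make the effect concrete, here is a toy simulation with invented numbers, not the paper's data: each group alone shows weak separation, yet the pooled AUROC comes out very high.)

```python
# Toy illustration of the Simpson's-paradox effect in pooled AUROC.
# All numbers are made up for illustration; they are not the paper's data.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def make_group(n, frac_pathogenic, score_mean):
    labels = (rng.random(n) < frac_pathogenic).astype(int)
    # Weak within-group signal: pathogenic variants score only slightly higher.
    scores = rng.normal(score_mean + 0.3 * labels, 1.0)
    return labels, scores

# Splice-like group: mostly pathogenic, high scores overall.
y_splice, s_splice = make_group(1000, frac_pathogenic=0.95, score_mean=5.0)
# 5'UTR-like group: mostly benign, low scores overall.
y_utr, s_utr = make_group(1000, frac_pathogenic=0.05, score_mean=0.0)

print("splice-only AUROC:", roc_auc_score(y_splice, s_splice))  # weak (~0.58)
print("5'UTR-only AUROC:", roc_auc_score(y_utr, s_utr))         # weak (~0.58)
print("pooled AUROC:", roc_auc_score(
    np.concatenate([y_splice, y_utr]),
    np.concatenate([s_splice, s_utr]),
))  # very high, driven almost entirely by the between-type offset
```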
Measuring model performance by variant type reveals a big anomaly. Evo2, for example, scores AUROC=0.975 on noncoding variants, but much lower on all specific types (e.g. 0.697 for splice, 0.903 for intron, 0.767 for 5′ UTR). Other models show a similar pattern. What's going on?
1 reply · 1 repost · 14 likes
We created a benchmark of ~250,000 pathogenic & benign variants. Unlike previous benchmarks, we evaluated performance by variant type. We broke down broad categories like ‘noncoding variants’ into specific annotations like intron, 3′ UTR and RNA gene.
1 reply · 3 reposts · 20 likes
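(In code, the per-type evaluation is just a grouped AUROC instead of one pooled number. A minimal sketch, with hypothetical file and column names:)

```python
# Sketch of evaluating a model per variant type instead of pooled.
# File and column names ("variant_type", "label", "model_score") are
# hypothetical placeholders.
import pandas as pd
from sklearn.metrics import roc_auc_score

df = pd.read_csv("benchmark_with_predictions.csv")  # hypothetical file

print("pooled AUROC:", roc_auc_score(df["label"], df["model_score"]))

# AUROC within each annotation (intron, 3'UTR, RNA gene, splice, ...),
# skipping types where only one class is present.
for vtype, group in df.groupby("variant_type"):
    if group["label"].nunique() == 2:
        print(vtype, roc_auc_score(group["label"], group["model_score"]))
```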
Latest genomic AI models report near-perfect prediction of pathogenic variants (e.g. AUROC>0.97 for Evo2). We ran extensive independent evals and found these figures are true, but very misleading. A breakdown of our new preprint: 🧵
9 replies · 118 reposts · 484 likes
I sometimes wonder if ChatGPT has a neural circuitry equivalent to eye rolling, activated whenever it decides to play along despite thinking I'm just being an idiot.
0 replies · 0 reposts · 4 likes
That said, I’m super excited about the possibilities this work opens up. It’s a long paper, and there are parts I haven’t read yet that look really cool. Thanks for releasing this!
0 replies · 0 reposts · 1 like
I still think masked LMs (with bidirectional attention) make more sense for genomics than autoregressive models, especially for variant effect prediction. Also still not convinced that Hyena is better than regular transformers. I guess time will tell.
1 reply · 0 reposts · 4 likes
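(For context, the standard masked-LM scoring recipe masks the variant position and compares the model's log-probabilities for the alternate vs. reference residue. Here is a minimal ESM1b-style sketch using HuggingFace transformers; details such as windowing long proteins are omitted, and each paper's exact procedure may differ.)

```python
# Sketch of masked-LM variant effect scoring: log-likelihood ratio of the
# alternate vs. reference amino acid at a masked position. ESM1b-style
# recipe in spirit; exact preprocessing in any given paper may differ.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "facebook/esm1b_t33_650M_UR50S"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

def llr(sequence: str, pos: int, ref: str, alt: str) -> float:
    """log P(alt at pos) - log P(ref at pos), with pos masked (0-based)."""
    assert sequence[pos] == ref
    tokens = tokenizer(sequence, return_tensors="pt")
    # +1 offset for the BOS/CLS token prepended by the tokenizer.
    tokens["input_ids"][0, pos + 1] = tokenizer.mask_token_id
    with torch.no_grad():
        logits = model(**tokens).logits
    log_probs = torch.log_softmax(logits[0, pos + 1], dim=-1)
    ref_id = tokenizer.convert_tokens_to_ids(ref)
    alt_id = tokenizer.convert_tokens_to_ids(alt)
    return (log_probs[alt_id] - log_probs[ref_id]).item()

# More negative LLR = alt less likely than ref = predicted more damaging.
print(llr("MKTAYIAKQR", pos=3, ref="A", alt="P"))
```

The bidirectional attention is what makes this natural: the masked position is predicted from context on both sides, which is harder to reproduce with a left-to-right autoregressive model.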
I also wish the paper provided more clarity on how exactly they extracted variant effect predictions from Evo2 and other LMs. One oddity: ESM1b performs much worse on non-SNV variants in the Evo2 preprint (ROC-AUC=0.8) than in our 2023 paper (ROC-AUC=0.87) with @vntranos and…
2 replies · 0 reposts · 0 likes
Evo2 classifies noncoding variants almost perfectly, which makes me think these are relatively easy cases (e.g. canonical splice sites). The large context size (~8,000 nt) also suggests it's mostly capturing local effects.
1 reply · 0 reposts · 3 likes