Dmitry Penzar @dmitrypenzar X Profile

Dmitry Penzar

@dmitrypenzar

Followers

515

Following

5K

Media

53

Statuses

534

PhD in bioinformatics, ML researcher, teacher

Joined May 2016


Dmitry Penzar

@dmitrypenzar

2 years

(1/8) The LegNet paper is finally published: https://t.co/ROOParhEwq. Congrats to @halfacrocodile @WWenya @DariaNogina @ZinkevichA

academic.oup.com

Abstract. Motivation. The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of comple

4

16

60

Dawn Chen

@dawnchenx

5 days

Announcing our new preprint! We built SPICE, a framework that combines large-scale experiments and generative AI to design RNA sequences that control cell type-specific gene expression using alternative splicing - a powerful new modality! (1/10) Preprint:

biorxiv.org

Programmable control of gene expression in specific cell types is essential for both basic discovery and therapeutic intervention, yet current strategies lack scalability across diverse cellular...

2

18

90

Dmitry Penzar

@dmitrypenzar

9 days

Now we just need to find those scientists. I’d start with Bluesky — maybe find one specialist to make their newsfeed a bit less of a mess

Matthew B. Jané

@MatthewBJane

11 days

Not everything needs to be a peer-reviewed academic paper.

0

2

Saint Louis Chess Club

@STLChessClub

22 days

We are deeply saddened by the unexpected passing of Grandmaster Daniel Naroditsky. Daniel was not only a friend of the Saint Louis Chess Club, but a gifted player, educator, and beloved pillar of the chess community. His passion for the game and commitment to teaching inspired

44

247

3K

Krish Rastogi

@krishras23

22 days

We lost an educator, a player, a coach, and one of the best speed chess players of this generation. Rest in peace Naroditsky, we all miss you. I'm also going to leave this clip here, it speaks for itself.

Vladimir Kramnik

@VBkramnik

22 days

Seemingly, conflicts with @chesscom, @freestylechess1, both kicking him out from commentator role, had a big impact lately on @GmNaroditsky. Got the stream episodes. Not a doctor but looks like something "very else" than sleeping pills. Hope, if any, real friends of him will care

47

403

7K

Steven Yu

@stevenyuyy

25 days

Glad you like it! It's the nano-protein-viewer that I built over the summer. Really need to work on marketing lol

Diego del Alamo

@DdelAlamo

25 days

@stevenyuyy Whoa what plugin is this? So much prettier than protein viewer

5

18

164

Dmitry Penzar

@dmitrypenzar

29 days

Colleagues have also pointed out this paper

academic.oup.com

Abstract. Identifying protein–protein interactions (PPIs) is crucial for deciphering biological pathways. Numerous prediction methods have been developed a

0

3

Dmitry Penzar

@dmitrypenzar

29 days

Great story

Biology+AI Daily

@BiologyAIDaily

1 month

Protein Language Models are Accidental Taxonomists 1. A new study revealing a significant issue in protein-protein interaction (PPI) prediction models. These models, which use protein language models (pLMs), have been found to exploit phylogenetic distances rather than genuine

0

1

Das Lab

@RDasLab

2 months

The results are in: top codes in Stanford #RNA 3D Folding @kaggle are competitive with CASP16-leading humans Vfold, beat AlphaFold 3. Top team’s trick was template-based modeling, not #DeepLearning. Congrats: john, odat, Eigen, + all 1706 participants! https://t.co/EgzN3DTKNe

1

14

76

Dmitry Penzar

@dmitrypenzar

2 months

While I'm still not convinced that Shorkie really needs pretraining, or that the same result can't be achieved by careful selection of hyperparameters, I'm clearly amazed by the clarity of the work and the honest comparison (not always in favor of Shorkie) with MPRA-trained models

Anshul Kundaje (anshulkundaje@bluesky)

@anshulkundaje

2 months

2 cool papers on sequence-to-gene expression models in yeast https://t.co/6ozI9gSb6x (pretrains a fungal DNALM -> fine tunes on yeast expression & ChIP-exo profiles) https://t.co/yLErrIjIS1 (directly trains on expression profiles) Both use modified Borzoi architectures 1/

0

Anshul Kundaje (anshulkundaje@bluesky)

@anshulkundaje

2 months

Another thing that is maybe less emphasized in this paper is that CLINVAR is a great database of curated pathogenic/benign variants but it is extremely biased (in all sorts of ways) & should never be used as a representative benchmark dataset for most types of variants. 1/

Nadav Brandes

@BrandesNadav

2 months

Latest genomic AI models report near-perfect prediction of pathogenic variants (e.g. AUROC>0.97 for Evo2). We ran extensive independent evals and found these figures are true, but very misleading. A breakdown of our new preprint: 🧵

1

21

166

Peter Koo

@pkoo562

2 months

2025 Machine Learning in Computational Biology (#MLCB) meeting starts TODAY (9/10) at 9:30 (EST)! We have a great lineup of keynotes, contributed talks, and posters today and tomorrow! Schedule: https://t.co/wN8z3SeD8Y Join for free via livestream:

mlcb.org

The in-person component will be held at the New York Genome Center, 101 6th Ave, New York, NY 10013. All times below are Eastern Time.

1

11

61

Nadav Brandes

@BrandesNadav

2 months

It’s basically Simpson's paradox. To illustrate what’s happening, let’s look at Evo2 for splice & 5’UTR variants. Neither group shows good separation between pathogenic & benign variants, but splice variants get more damaging predictions & are much more likely to be pathogenic.

1

16
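The Simpson's-paradox effect described in the post above can be sketched with synthetic data. This is a minimal illustration, not the preprint's analysis: the group names, score distributions, and pathogenic fractions below are all made-up assumptions, and no Evo2 output is involved. Within each group the score is independent of the label, so the per-group AUROC sits near 0.5, yet pooling the groups gives a much higher AUROC because one group has both higher scores and a higher pathogenic rate.

```python
import random


def auroc(scores, labels):
    """Rank-based AUROC (Mann-Whitney U). Assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sum of 1-based ranks held by the positive (pathogenic) examples.
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate_group(rng, n, mean_score, frac_pathogenic):
    """Synthetic group: scores are drawn independently of the labels."""
    scores = [rng.gauss(mean_score, 1.0) for _ in range(n)]
    n_path = round(n * frac_pathogenic)
    labels = [1] * n_path + [0] * (n - n_path)
    return scores, labels


rng = random.Random(0)
# Hypothetical "splice" group: higher scores, mostly pathogenic.
splice_scores, splice_labels = simulate_group(rng, 2000, 2.0, 0.8)
# Hypothetical "5'UTR" group: lower scores, mostly benign.
utr_scores, utr_labels = simulate_group(rng, 2000, 0.0, 0.1)

auc_splice = auroc(splice_scores, splice_labels)
auc_utr = auroc(utr_scores, utr_labels)
auc_pooled = auroc(splice_scores + utr_scores, splice_labels + utr_labels)

print(f"splice only: {auc_splice:.2f}")
print(f"5'UTR only:  {auc_utr:.2f}")
print(f"pooled:      {auc_pooled:.2f}")
```

The pooled figure here reflects only which group a variant belongs to, which is why a per-category breakdown is more informative than a single headline AUROC.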

Anshul Kundaje (anshulkundaje@bluesky)

@anshulkundaje

2 months

The benchmark task is "batch correction" while preserving biological variation. This task was supposedly benchmarked in all these foundation model papers. But they r apparently very poor even for batch correction? What is going on?!?

6

1

28

Lior Pachter

@lpachter

2 months

In a new work with @Josephmrich and Conrad Oakes we tackle the problem of how to best organize alluvial plots. We formalize two optimization problems and develop a solution for them based on the neighbornet algorithm, implemented in the program wompwomp: https://t.co/njQRkjYHNh

2

19

64

Dmitry Penzar

@dmitrypenzar

2 months

https://t.co/8Tv9xTGpji

9

21

267

Timothy Fuqua

@timothy_fuqua

3 months

Excited to release our study on the emergence of new promoters in random vs genomic DNA. Posting the thread on the other place :) https://t.co/QpzSAmrMky

biorxiv.org

Promoters are DNA sequences that help to initiate transcription. Point mutations can create de-novo promoters, which can consequently transcribe inactive genes or create novel transcripts. We know...

3

18

103

Jacob Schreiber

@jmschreiber91

3 months

In the genomics community, we have focused pretty heavily on achieving state-of-the-art predictive performance. While undoubtedly important, how we *use* these models after training is potentially even more important. tangermeme v1.0.0 is out now. Hope you find it useful!

3

23

97

Jacob Schreiber

@jmschreiber91

3 months

An excellent post about the receptive range of convolution models. "You might reasonably ask: 'If I have 100 layers with W=1000, that's a theoretical receptive field of 100,000 tokens. Doesn't that matter?' The answer is no, and here's why:" https://t.co/X1xDNudVZh

guangxuanx.com

Modern LLMs use sliding window attention for efficiency, but why can't stacking sliding windows see as far as theory suggests? A mathematical exploration of information dilution and the exponential...

1

2

18
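One simplified way to picture the gap between theoretical and effective receptive field in the post above is sketched below. It is not the linked article's derivation. It assumes each layer averages uniformly over a causal window of W positions with no residual connections, and it uses small hypothetical values (16 layers, W=32) so that it runs quickly in plain Python. Under those assumptions the influence of past positions is the repeated convolution of a uniform kernel, and almost none of it comes from the far end of the theoretical range.

```python
def influence_profile(layers, window):
    """Weight that each past offset contributes to the final position
    after stacking `layers` uniform causal windows of size `window`."""
    kernel = [1.0 / window] * window
    profile = [1.0]  # before any layer, a position only sees itself
    for _ in range(layers):
        nxt = [0.0] * (len(profile) + window - 1)
        for i, p in enumerate(profile):
            for j, k in enumerate(kernel):
                nxt[i + j] += p * k
        profile = nxt
    return profile


layers, window = 16, 32  # hypothetical sizes, chosen only to keep this fast
profile = influence_profile(layers, window)

# Farthest offset reachable in theory: (window - 1) steps back per layer.
max_reach = layers * (window - 1)

# Share of total influence coming from the farthest quarter of that range.
far_cutoff = int(0.75 * max_reach)
far_mass = sum(profile[far_cutoff:])

print(f"theoretical reach: {max_reach} positions")
print(f"influence from the farthest 25%: {far_mass:.6f}")
```

In this toy setting the profile concentrates around the middle of the range, with a spread that grows roughly with the square root of the number of layers, so depth extends the nominal reach much faster than the usable one.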

Sarah Gurev

@sarahgurev

3 months

🚨New paper 🚨 Can protein language models help us fight viral outbreaks? Not yet. Here’s why 🧵👇 1/12

1

37

155

Dmitry Penzar

@dmitrypenzar

3 months

One can just check the Phenformer ROC-AUCs for many diseases in the supplements (https://t.co/jM8CBKRrID), keeping in mind that the model is evaluated in the most friendly setting (a split that does not account for population structure)

arxiv.org

Understanding how molecular changes caused by genetic variation drive disease risk is crucial for deciphering disease mechanisms. However, interpreting genome sequences is challenging because of...

Anshul Kundaje (anshulkundaje@bluesky)

@anshulkundaje

3 months

Utterly uninformed take. We don't know how to pinpoint causal variants accurately for polygenic phenotypes (AD, Diabetes etc.) & good luck editing 100s/1000s of variants in embryos without understanding their pleiotropic effects, oh & ignore those off-target effects.

0

1