Alex J Li (@alex_j_li)

Followers: 128 · Following: 6 · Media: 7 · Statuses: 14

MIT '22 | UCB-UCSF BioE PhD student @kortemmelab & Pinney Lab | Interested in geometric/graph ML and program synthesis in chemistry and biomolecular design

Joined March 2022
@alex_j_li
Alex J Li
3 years
First Twitter thread 🧵 and also my first bioRxiv preprint! I’m excited to finally release my undergrad work into the world: combining GNNs, Potts models, and Tertiary Motifs (TERMs) for protein design! See the preprint here: https://t.co/aiVoRA8g6S 1/
2
14
68
@alex_j_li
Alex J Li
9 months
At NeurIPS this weekend presenting an MLSB poster on my current progress on ProteinZen, an all-atom protein structure generation method: find me there or DM me to chat about all things protein! Paper: https://t.co/Dgw5fkRDmD
[Image]
2
12
77
@alex_j_li
Alex J Li
3 years
I'd like to thank my advisors Amy and Gevorg for all their support and guidance, my co-authors Mindren, Israel, and @vikramsundar for being fun collaborators with great ideas, and @SassSeabass for teaching me all things dTERMen. Without them, none of this would have been possible! 11/
1
0
3
@alex_j_li
Alex J Li
3 years
Lastly, we find a disconnect between NSR and energy-based metrics: Potts parameter regularization improves NSR but not energetics predictions, suggesting future directions for energy-based objectives when training and evaluating new protein design models. 10/
[Image]
1
0
3
@alex_j_li
Alex J Li
3 years
Our models can also be improved in a generalizable fashion via finetuning on experimental data. Finetuning on the Bcl-2 affinity dataset increases performance on Rocklin stability predictions, despite the Rocklin dataset being composed of de novo folds. 9/
[Image]
1
0
3
@alex_j_li
Alex J Li
3 years
Additionally, despite not being explicitly trained to predict energies, our models perform well on energy-based tasks, including a Bcl-2 binding affinity dataset (Frappier et al., Structure 2019) and a de novo protein stability dataset (Rocklin et al., Science 2017). 8/
[Image]
1
0
3
@alex_j_li
Alex J Li
3 years
To assess fold specificity, we show that designed sequences (B) tend to fold to their target structure as predicted by AlphaFold. The same trend is not observed when using a randomized-sequence baseline (D). 7/
[Image]
1
0
1
@alex_j_li
Alex J Li
3 years
Via model ablations, we show that TERMs and the Potts model output both contribute to increasing NSR. In designed sequences, we also see that physicochemically realistic substitutions occur when a non-native residue label is chosen. 6/
[Image]
1
0
2
@alex_j_li
Alex J Li
3 years
We present two models: TERMinator, which uses both TERM data and backbone coordinates as input, and COORDinator, which uses only coordinates. Both output protein-specific Potts models over sequence labels, which can be optimized for sequence design or used to predict the energies of mutations. 5/
1
0
2
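
Neither the thread nor the preprint link above includes code on this page, but as a rough illustration of what a "protein-specific Potts model over sequence labels" is, and how such a model can be optimized for sequence design or used to score mutations, here is a minimal NumPy sketch. The array shapes, the Metropolis sampler, and the names potts_energy, mutation_delta, and design_sequence are illustrative assumptions on my part, not the TERMinator implementation.

```python
# Minimal sketch (not the released TERMinator code): a Potts model assigns
# one-body energies h_i(a) to each position and pairwise couplings J_ij(a, b)
# to each pair of positions; the energy of an integer-encoded sequence s is
# E(s) = sum_i h_i(s_i) + sum_{i<j} J_ij(s_i, s_j).
import numpy as np

N_POS, N_AA = 60, 20                                  # illustrative sizes
rng = np.random.default_rng(0)
h = rng.normal(size=(N_POS, N_AA))                    # one-body terms
J = rng.normal(size=(N_POS, N_POS, N_AA, N_AA))       # pairwise couplings
J = (J + J.transpose(1, 0, 3, 2)) / 2                 # symmetry: J_ij(a,b) = J_ji(b,a)

def potts_energy(seq):
    """Total Potts energy of an integer-encoded sequence."""
    e = h[np.arange(N_POS), seq].sum()
    iu, ju = np.triu_indices(N_POS, k=1)
    return float(e + J[iu, ju, seq[iu], seq[ju]].sum())

def mutation_delta(seq, i, a):
    """Energy change from mutating position i to residue a (O(N) per proposal)."""
    js = np.arange(N_POS) != i
    return float(h[i, a] - h[i, seq[i]]
                 + J[i, js, a, seq[js]].sum() - J[i, js, seq[i], seq[js]].sum())

def design_sequence(n_steps=5000, temp=0.5):
    """Toy Metropolis sampling over sequence space to minimize the Potts energy."""
    seq = rng.integers(N_AA, size=N_POS)
    for _ in range(n_steps):
        i, a = int(rng.integers(N_POS)), int(rng.integers(N_AA))
        d = mutation_delta(seq, i, a)
        if d < 0 or rng.random() < np.exp(-d / temp):
            seq[i] = a
    return seq

designed = design_sequence()
print("E(designed) =", potts_energy(designed))
# The same Potts model also scores point mutations directly:
print("delta E for a point mutation at position 5:",
      mutation_delta(designed, 5, (designed[5] + 1) % N_AA))
```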
@alex_j_li
Alex J Li
3 years
In this work, we tackle these concerns by 1) using TERMs, small recurring local-in-space structural motifs, to implicitly model protein flexibility, and 2) predicting an energy landscape (a Potts model) over sequence space as output, rather than outputting a sequence directly. 4/
[Image]
1
0
3
@alex_j_li
Alex J Li
3 years
However, current models assume static backbone structures, allowing no structural flexibility, and design sequences directly on structure, which can be difficult to adapt to energy-based questions or to sequence optimization under discrete constraints. 3/
1
1
4
@alex_j_li
Alex J Li
3 years
Neural nets have been taking protein design by storm, with one of the most promising results being strong performance on native sequence recovery (NSR) tasks, in which the goal is to predict the native sequence of a protein given only its backbone structure. 2/
1
1
4
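
For context on the NSR metric mentioned above: native sequence recovery is simply the fraction of designed positions that reproduce the residue found in the native protein whose backbone was given as input. A minimal sketch follows; the helper name and the example sequences are hypothetical, not taken from the preprint.

```python
# Hypothetical helper (not from the preprint): NSR as the fraction of positions
# at which the designed sequence matches the native sequence for the same backbone.
def native_sequence_recovery(designed: str, native: str) -> float:
    if len(designed) != len(native):
        raise ValueError("sequences must be aligned to the same backbone length")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)

# Example: 6 of 8 positions recovered -> NSR = 0.75
print(native_sequence_recovery("MKVLAGTE", "MKVIAGTD"))
```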