Aiden Kolodziej Profile
Aiden Kolodziej

@aidenosinetrip1

Followers
128
Following
352
Media
14
Statuses
70

MIT Biology https://t.co/WSoeyApucV

Joined April 2020
@ProteinBoston
Boston Protein Design and Modeling Club
5 days
Join us Wednesday December 10th for an amazing seminar by @ChoYehlin to cap off 2025. See you at 7pm EST in Room 6055, Longwood Center @DanaFarber "How AF3-Style Structure Prediction Models Can Be Used for Protein Design: BoltzDesign and Protein Hunter" https://t.co/E8bDyGzirb
bpdmc.org
introduction and membership Boston Protein Design and Modeling Club (BPDMC) is a community of computational protein engineers and modelers from both academia and industry. While we are based in...
0
8
18
@aidenosinetrip1
Aiden Kolodziej
6 days
Another example showing how a common SH3 fold may only match at the CATH "class" level. If you're splitting your train/test by topology...you've got data leakage 🚰
0
0
4
@aidenosinetrip1
Aiden Kolodziej
6 days
One example from CIRPIN (https://t.co/SMO3AOX9xe) where two very similar structures have entirely different CATH classifications
@sokrypton
Sergey Ovchinnikov
6 days
🚨 For those training DL models on proteins, it's possible your "structural" train/test split might have leakage cus tools like foldseek/TMalign (and CATH/SCOP databases) do not always account for structural relationship of circularly permuted proteins:
1
2
29
@aidenosinetrip1
Aiden Kolodziej
7 days
@niopeklab We've also developed a Colab Notebook where you can try out CIRPIN for yourself! If you're looking for remote homologs that may be related by CP, insertions, extensions, rewirings, or other rearrangements, try it out and let us know how it worked! https://t.co/hiONSCJWE9
github.com
Source code for CIRPIN: Learning Circular Permutation-Invariant Representations to Uncover Putative Protein Homologs - aidenkoloj/CIRPIN
0
1
5
@ZhidianZ
Zhidian Zhang
9 days
Excited to share this work with @yoakiyama @ChoYehlin @jajoosam @sokrypton We find that protein language models trained solely on individual protein sequences, implicitly learn the interface contacts of homo-oligomeric assemblies! As the model scales up, more interface signals
4
32
139
@aipulserx
DailyHealthcareAI
8 days
🚀Sergey Ovchinnikov's paper is here!! Can a deep learning model be trained to recognize proteins with identical 3D structures but different sequence connectivity, revealing thousands of hidden evolutionary relationships? "CIRPIN: Learning Circular Permutation-Invariant
0
26
123
@aidenosinetrip1
Aiden Kolodziej
7 days
@niopeklab In addition to CPs, CIRPIN allowed us to uncover more complex topological rearrangements. We identified cases of rewiring, where the connectivity of secondary structures differs, as well as pairs of similar proteins obscured by insertions/extensions:
1
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
@niopeklab Investigating the node-level embeddings of CIRPIN/Progres revealed that CIRPIN captures tertiary motifs. CIRPIN identifies regions of similarity within proteins that Progres missed. 3cbn is an interesting example since the two halves of the protein are nearly identical:
1
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
@niopeklab Shifting to the model interp side of things: PCA of CIRPIN/Progres embeddings shows how CIRPIN groups CPs of the same structure together. This can be thought of as a "folding" of the embedding space:
1
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
@niopeklab There are a lot of interesting evolutionary q's to dive into here, saved for another thread. But it's worth highlighting that there have been decades of very detailed work on PDZ domains. How might knowledge of these four CPs change what we know about PDZ form and function?
1
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
@niopeklab We then used CIRPIN/Progres to investigate the PDZ topology in the AFDB cluster representatives and found that PDZs exist in 4 circularly permuted forms:
1
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
Among the pairs we discovered in SCOP were PDZ domains, recently reported by the @niopeklab to be the most frequently inserted domains in the AFDB
1
0
1
@aidenosinetrip1
Aiden Kolodziej
7 days
We could then use a contrastive approach to identify novel CPs by searching for pairs with high CIRPIN scores and low Progres scores
1
0
0
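The contrastive search described above can be sketched roughly as a filter over pairwise scores. This is only an illustration, not the CIRPIN code: the function name `candidate_cp_pairs`, the dictionary layout, and both thresholds are hypothetical, and it assumes both models return similarity scores on a comparable scale.

```python
def candidate_cp_pairs(cirpin_scores, progres_scores,
                       min_cirpin=0.8, max_progres=0.5):
    """Return pairs scored high by CIRPIN and low by Progres.

    Both inputs map a (protein_a, protein_b) tuple to a similarity score.
    The thresholds are illustrative placeholders, not published values.
    """
    hits = []
    for pair, cirpin in cirpin_scores.items():
        progres = progres_scores.get(pair)
        if progres is None:
            # Skip pairs that were not scored by both models.
            continue
        if cirpin >= min_cirpin and progres <= max_progres:
            hits.append((pair, cirpin, progres))
    # Largest score gap first: strongest candidates for a hidden permutation.
    hits.sort(key=lambda hit: hit[1] - hit[2], reverse=True)
    return hits


cirpin = {("a", "b"): 0.9, ("a", "c"): 0.95, ("b", "c"): 0.4}
progres = {("a", "b"): 0.3, ("a", "c"): 0.9, ("b", "c"): 0.2}
candidates = candidate_cp_pairs(cirpin, progres)
```

Here only the pair `("a", "b")` survives: `("a", "c")` is similar under both models, so it is an ordinary match, and `("b", "c")` is dissimilar under both.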
@aidenosinetrip1
Aiden Kolodziej
7 days
Training with synCPs allows our model, CIRPIN, to find similarity between known cases of circular permutants which Progres previously failed to identify:
1
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
Think of how comma placement in English can drastically change the meaning of a sentence. A panda can be a cute docile animal or a serial killer, depending on your inclusion of a comma.
1
0
3
@aidenosinetrip1
Aiden Kolodziej
7 days
Under the hood, synCP generation is simply shifting the positional information of each structure. But the consequences are significant.
1
0
0
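A minimal sketch of what "shifting the positional information" could look like, assuming a structure is represented as an ordered list of per-residue coordinates. The function names and the choice of a uniformly random cut point are assumptions made for illustration and are not taken from the CIRPIN source.

```python
import random


def synthetic_circular_permutation(coords, cut):
    """Reorder residues so that the residue at index `cut` becomes the new start.

    The 3D coordinates are untouched; only the sequence order changes.
    """
    n = len(coords)
    if n == 0:
        raise ValueError("structure has no residues")
    cut %= n
    return coords[cut:] + coords[:cut]


def shifted_positions(n, cut):
    """New positional index for each original residue after cutting at `cut`."""
    return [(i - cut) % n for i in range(n)]


def random_syncp(coords, rng=None):
    """Pick a random cut point (hypothetical augmentation step)."""
    if len(coords) < 2:
        return list(coords), 0
    rng = rng or random.Random()
    cut = rng.randrange(1, len(coords))
    return synthetic_circular_permutation(coords, cut), cut


coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
permuted = synthetic_circular_permutation(coords, 1)
positions = shifted_positions(len(coords), 1)
```

With a cut at index 1, the first residue moves to the end of the chain, so its positional index changes from 0 to 3 while its coordinates stay the same.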
@aidenosinetrip1
Aiden Kolodziej
7 days
What happens during training is that random synCPs of input structures are generated, forcing the model to learn that structures related by CP are similar:
2
0
0
@aidenosinetrip1
Aiden Kolodziej
7 days
Building off the Progres model by @jgreener64 @jamaliki1998, we introduced a novel data augmentation strategy using synthetic circular permutations (synCPs)
1
0
1
@aidenosinetrip1
Aiden Kolodziej
7 days
We wondered if there was a way to combine the speed of deep-learning-based search tools with the sensitivity of traditional structural alignment.
1
0
0