Pascal Notin
@NotinPascal
Followers: 1K · Following: 2K · Media: 43 · Statuses: 443
Research in AI for Protein Design @Harvard | Prev. CS PhD @UniofOxford, Maths & Physics @Polytechnique
Boston
Joined September 2020
TIRED: Scaling protein models to billions of parameters, hoping they'll memorize all of evolution and generalize beyond.
WIRED: Smart retrieval-augmented models that dynamically access what they need from sequence databases.
ICML Paper Alert: What if finding the right protein homologs wasn't a slow search, but a learned part of the model itself? We introduce Protriever, an end-to-end framework that learns to retrieve the most useful homologs for self-supervised reconstruction! (1/12)
Reminder: PhD applications for OATML are now open. The first funding deadline is December 2. Candidates interested in developing Bayesian deep learning methodology, applications of ML, AI security, and understanding ML methodology are encouraged to apply. More info:
oatml.cs.ox.ac.uk
The Oxford Applied and Theoretical Machine Learning Group (OATML) is a research group within the Department of Computer Science of the University of Oxford led by Prof Yarin Gal. We come from...
Thrilled to announce I'm starting as a Principal Investigator at #Aithyra in Vienna! We'll be developing generative models to understand cell biology and design proteins. I'm hiring PhDs, Postdocs, & Visiting Researchers! PhD applications by Sept 10:
Our paper on generalizable antibody-antigen binding affinity prediction has been featured on the cover of the August issue of @NatComputSci!
Our August issue is now live and includes research on antibody-antigen binding, molecular screening for zeolite synthesis, psychological experiments with LLMs, and much more! https://t.co/Hi4HjAXndD
I've thoroughly enjoyed reading two (VERY!) recent papers that model protein sequences by retrieving evolutionary information (dynamically) at inference time, and there's a lot to unpack! [1] https://t.co/NWzDzvYALu [2] https://t.co/H4tWxZwScl (1/n)
A generative artificial-intelligence tool has designed a synthetic CRISPR system that successfully edits human DNA https://t.co/D0ozo6rPaY
1/5 Biological data is noisy, redundant, and ever-growing. In our new paper (first paper of my postdoc!!), we track model performance across 14 years of UniRef100 snapshots to ask: how does pLM performance scale with training data?
AI expands the repertoire of CRISPR-associated proteins for genome editing @NatureNV preview by @NotinPascal
https://t.co/FgBiemIy8n
@thisismadani
https://t.co/79ihnoXQfU
@jeffruffolo @AadyotB et al https://t.co/Bq2nmEP35V
My @Nature News & Views on this breakthrough: https://t.co/QghloVQCyK Special thanks to @AvivSpinner for their valuable feedback, and to @nature and @NatureNV for the support in writing this piece!
nature.com
Nature - A generative artificial-intelligence tool has designed a synthetic CRISPR system that successfully edits human DNA and sharply reduces off-target effects.
Congratulations to the entire @ProfluentAI team on this incredible milestone! OpenCRISPR-1 represents a paradigm shift: the first AI-designed CRISPR protein to successfully edit human DNA with fewer off-target effects. We're moving from discovery-based to engineered biology. 🧬
Excited to have our AI research published in @Nature today. Proud of the @ProfluentBio team and the extensive final version available under open-access. OpenCRISPR is a milestone. It's the first successful demonstration of editing the human genome with a molecule fully designed
1/4 Announcing the 2025 Protein Engineering Tournament. This year's challenge: design PETase enzymes, which degrade the type of plastic in bottles. Can AI-guided protein design help solve the climate crisis? Let's find out! #AIforBiology #ClimateTech #ProteinEngineering
Save the date! Machine Learning for Drug Discovery (MLDD) is happening soon on Monday 30 June, 2025. MLDD aims to bring together ML for drug discovery experts, innovators, and enthusiasts from the machine learning, biotechnology and drug discovery domains in London, UK to
Congratulations to the entire RNAGym team! @rohitarorayyc @mvrfalo @christian_choe_ @c_sheare @aaronkollasch @fiona__qu @ruben_weitzman @artemg2718 @sarahgurev Erik Xie @deboramarks 8/9
The moderate performance across all tasks reveals exciting opportunities! Key directions: RNA-specific training data, integrating structure-function relationships, and improving non-canonical base pair prediction. RNAGym provides a standardized foundation for progress. 7/9
Tertiary structure: 215 diverse 3D structures from recent PDB entries. NuFold leads monomers (0.393 TM-score), AlphaFold3 dominates complexes (0.381 TM-score). Non-Watson-Crick interactions remain a major challenge for all methods. 6/9
Secondary structure: 901k chemical mapping profiles using DMS & 2A3 reactivity. EternaFold achieves top performance (0.656 F1-score), closely followed by CONTRAfold & Vienna. Traditional thermodynamic methods are still competitive with newer deep learning approaches. 5/9
Fitness prediction: 70 assays across tRNA, ribozymes, aptamers & mRNAs (1M+ mutations). Evo 2 performs best overall, but performance varies dramatically by RNA type: RNA-FM excels at tRNA/aptamers while Evo 2 leads mRNA tasks. Lots of room for improvement across the board! 4/9
RNAGym tackles three essential RNA prediction tasks:
- Fitness prediction: How mutations affect RNA function
- Secondary structure: Base-pairing patterns
- Tertiary structure: 3D molecular architecture
All evaluated zero-shot to test true generalization! 3/9