
Soojung Yang
@SoojungYang2
Followers: 940 · Following: 1K · Media: 3 · Statuses: 189
Molecular simulations and AI for proteins! PhD student @ MIT @RGBLabMIT / website: https://t.co/xHAeCEclfm
Joined September 2020
Thank you UW–Madison for this great honor! In 2016 I changed my group to AI4Science and went all-in on deep learning solutions for the molecular sciences. Best decision of my career in hindsight. Amazing what's become of that field - now AI for Science is a @MSFTResearch lab.
Congratulations to Frank Noé, Partner Research Manager at Microsoft Research AI for Science, who received the 2025–2026 Joseph O. Hirschfelder Prize in Theoretical Chemistry for pioneering contributions at the interface of AI and chemistry. https://t.co/FMCPmbfAW1
9
5
64
Want to join our efforts @MSFTResearch AI for Science to push the frontier of AI for materials? We are the team behind MatterGen & MatterSim and we have 2 job openings! Each can be in Amsterdam, NL, Berlin, DE, or Cambridge, UK. It is a rare opportunity to join a highly talented,
0
24
76
Our development of machine-learned transferable coarse-grained models is now on Nat Chem! https://t.co/HGngd8Vpop I am so proud of my group for this work! Particularly first authors Nick Charron, Klara Bonneau, @sayeg84, Andrea Guljas.
nature.com
Nature Chemistry - The development of a universal protein coarse-grained model has been a long-standing challenge. A coarse-grained model with chemical transferability has now been developed by...
3
33
107
@genbio_workshop @LucasPinede @junonam_ @RGBLabMIT @sihyun_yu Unfortunately both @LucasPinede and I won't be at ICML in person this year. But our poster will be at the workshop 😂 Feel free to DM us or email our lead author (lucasp at https://t.co/Zjbm3tdReJ) about the paper! Preprint coming very soon, please tune in! (4/n, N=4)
0
0
3
@genbio_workshop @LucasPinede @junonam_ @RGBLabMIT At low noise, emulator scores ≈ physical forces. Even w/o alignment loss, emulator embeddings drift toward MLIP's. Inspired by @sihyun_yu’s REPA, we add a loss to encourage this — and training becomes faster & better. (3/n)
1
0
3
@genbio_workshop @LucasPinede @junonam_ @RGBLabMIT Boltzmann emulators can sample conformations cheaper than MD but are expensive to train. They need equilibrium data, which is very hard to get. But… MLIPs like MACE are trained on tons of noneq. data. What if we transfer knowledge from force models to generative models? (2/n)
1
0
1
🚀 Come check our poster at ICML @genbio_workshop! We show that pretrained MLIPs can accelerate training of Boltzmann emulators — by aligning their internal representations. Coauthors @LucasPinede, @junonam_, @RGBLabMIT (1/n)
2
18
150
Understanding protein motion is essential to understanding biology and advancing drug discovery. Today we’re introducing BioEmu, an AI system that emulates the structural ensembles proteins adopt, delivering insights in hours that would otherwise require years of simulation.
Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://t.co/WwKjj5B0eb
166
424
3K
Happy to have contributed to this awesome project!
Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://t.co/WwKjj5B0eb
1
7
86
Huge shoutout to the outstanding team: Sarah Lewis, Tim @tmhmpl, José @josejimlun, Michael Gastegger, Yu @YuuuXie, Andrew @AndrewFoongYK, Victor @vgsatorras, Osama @OsamaAbdin_, Bas @BasVeeling, Iryna Zaporozhets, Yaoyi @hello_yaoyi, Soojung @SoojungYang2 ...
1
2
8
BioEmu now published in @ScienceMagazine !! What is BioEmu? Check out this video: https://t.co/PAj96iKvR7
Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://t.co/WwKjj5B0eb
15
108
402
Our review paper is now live! 🚀 We dive into how in-situ cryoET allows us to not only view proteins inside their native cellular habitat but also reconstruct continuous protein dynamics at near atomic resolution! arXiv Preprint: https://t.co/Yiqr3OkKDI (feedback is welcome!)
arxiv.org
Cryo-electron tomography (cryo-ET) has emerged as a powerful tool for studying the structural heterogeneity of proteins and their complexes, offering insights into macromolecular dynamics directly...
0
1
1
🚀 Excited to release BoltzDesign1! ✨ Now with LogMD-based trajectory visualization. 🔗 Demo: https://t.co/tjBXSibZhT Feedback & collabs welcome! 🙌 🔗: GitHub: https://t.co/HsQBrB8amQ 🔗: Colab: https://t.co/TzX4q5m2qp
@sokrypton @MartinPacesa
7
71
365
I'll be attending @iclr_conf and the @gembioworkshop! 🧠✨ If you're around and want to meet up or chat, feel free to DM me! 😊
0
1
29
New paper (and #ICLR2025 Oral :)): ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids https://t.co/V13RjK6l56 Condition on your 3D layout (of ellipsoids) to generate proteins like this or to get better designability/diversity/novelty tradeoffs. 1/6
6
52
273
Spending the Easter days prepping for #ICLR2025 in #singapore 😍🤩 If you want to meet/catch up on @cusp_ai #AI4Science #MachineLearning and/or ML for #materials / #simulation, ping me! DMs are open :) My lovely colleagues @sinjax @sindy_loewe and @gverkes are around too!
1
2
26
Excited for #ICLR2025! I'll be there 4/25~ to organize the GEM Bio Workshop @gembioworkshop to bridge experiments and ML in bio 🥰 If you want to meet up on ML for proteins / dynamics / simulation, DM me! + I'm curious to learn about academic/industry opportunities out there!
0
2
58
1/ Machine learning force fields are hot right now 🔥: models are getting bigger + being trained on more data. But how do we balance size, speed, and specificity? We introduce a method for doing model distillation on large-scale MLFFs into fast, specialized MLFFs!
2
21
101
Meet BioEmu-1 from Microsoft Research. This deep learning model can generate thousands of protein structures per hour, unlocking new possibilities for protein scientists and drug discovery and research. https://t.co/uyuvINvLGS
55
395
2K
The BioEmu-1 model and inference code are now public under MIT license!!! Please go ahead, play with it and let us know if there are issues. https://t.co/K7wwHmCt2o
github.com
Inference code for scalable emulation of protein equilibrium ensembles with generative deep learning - microsoft/bioemu
Super excited to preprint our work on developing a Biomolecular Emulator (BioEmu): Scalable emulation of protein equilibrium ensembles with generative deep learning from @MSFTResearch AI for Science. #ML #AI #NeuralNetworks #Biology #AI4Science
https://t.co/yzOy6tAoPv
5
96
356