Neil Tenenholtz @ntenenz X Profile

Neil Tenenholtz

@ntenenz

Followers

960

Following

614

Media

145

Statuses

1K

Multimodal model training for biology / healthcare at MSR

Boston, MA

Joined February 2016

Lester Mackey

@LesterMackey

30 days

If you're a PhD student interested in interning with me or one of my amazing colleagues at Microsoft Research New England (@MSRNE, @MSFTResearch) this summer, please apply here https://t.co/DIkXUuK4zc (If you'd like to work with me, please include my name in your cover letter!)

9

72

425

Neil Tenenholtz

@ntenenz

1 month

Don't like the status quo? Change it. You can just do things!

rohan anil

@_arohan_

1 month

Since folks are discussing Infra, it is not about models per se, it's about agency: Two incidents that I fondly remember: covid happened and meets was slow, a senior engineer and a friend decided to take it into their own hands, profiling things and making it better. They did a

0

2

Dylan Foster 🐢

@canondetortugas

2 months

MSR NYC is hiring spring and summer interns in AI/ML/RL!

9

27

413

Ava Amini

@avapamini

2 months

Applications for @MSFTResearch undergrad research internships for rising juniors and seniors are due Monday Oct 6! Apply to work with us in BioML 👉 https://t.co/QYjsdyEpRQ w/ @KevinKaichuang, @alexijielu, @lorin_crawford, Kristen Severson, @ntenenz, @SarahAlamdari, and more!

3

22

125

Lester Mackey

@LesterMackey

2 months

I’m pretty sure this is what they designed Sora 2 for (sound on) @raazdwivedi @AShettyV

0

1

8

Dylan Foster 🐢

@canondetortugas

2 months

Microsoft Research New York City is seeking applicants for multiple Postdoctoral Researcher positions in ML/AI! These are positions for up to 2 years, starting in July 2026. Application deadline: October 22, 2025

5

46

254

Dinghuai Zhang 张鼎怀

@zdhnarsil

3 months

Blog updated! Notably, more ablation analysis compared with other importance sampling variants.

Feng Yao

@fengyao1909

3 months

We are glad that TIS and FlashRL have received broad attention from the open-source community and have been verified and supported (OpenRLHF @hijkzzz, SkyRL @NovaSkyAI, REINFORCE++ @hijkzzz, OAT @zzlccc)! A few updates on our blog and FlashRL package: (1) more in-depth

0

1

8

Neil Tenenholtz

@ntenenz

3 months

Just waiting for the rust-impl, free-threaded, easier-to-debug @astral_sh python interpreter.

samsja

@samsja19

3 months

impressed by the execution of the @astral_sh team, taking over the whole python ecosystem in a couple of months and already pushing great product for enterprise. it's all just about execution

0

1

Neil Tenenholtz

@ntenenz

4 months

For more info and links to all the resources, check out the blog post: https://t.co/L0RSqhPSbD And of course, a huge shoutout to the entire team for making this happen: @KevinKaichuang @SarahAlamdari Alex J Lee Kaeli Kaymak-Loveless @samir_char @garykbrixi @cdomingoenrich

microsoft.com

A collection of both protein sequence data and generative models, designed to serve as a modern resource for protein biology in the age of AI.

0

1

Neil Tenenholtz

@ntenenz

4 months

What did we uncover? 🎉 Model scale, data scale, and data diversity all positively impact E. coli expression. 🎉 ☹️ Unfortunately, common computational metrics are poor predictors of expressibility. ☹️ To all those interested in better PLM evals... let the chase begin!

1

0

3

Neil Tenenholtz

@ntenenz

4 months

Modern LM training is a game of 🐱 and 🐭. You improve training signal (e.g., data) to overcome gaps in eval performance, only to strengthen the evals and thus discover new gaps -- starting the cycle anew. With the Dayhoff Atlas, we aim to jumpstart the same race for PLMs. We

1

7

25

Vega Shah

@dr_alphalyrae

4 months

The Dayhoff Atlas: scaling sequence diversity for improved protein generation | bioRxiv https://t.co/TaHb3mFEvY

2

9

35

Alexander Amini

@xanamini

4 months

🧬 The largest open dataset of natural proteins in the world — 3.3 billion seqs 🧠 A 3 billion param hybrid ssm+transformer model 🤗 Fully open-source data + model https://t.co/8X0VfYIQwq Congrats to @avapamini + entire team, including @LiquidAI_'s own Kaeli Kaymak-Loveless

biorxiv.org

Modern biology is powered by the organization of biological information, a framework pioneered in 1965 by Margaret Dayhoff’s Atlas of Protein Sequence and Structure. Databases descended from this...

Ava Amini

@avapamini

4 months

thrilled to share The Dayhoff Atlas of protein language data and models 🚀 protein biology in the age of AI! https://t.co/4wP9kNRUoM we built + open source the largest natural protein dataset, w/ 3.3 billion seqs & a first-in-class dataset of structure-based synthetic proteins

2

13

87

Kevin K. Yang 楊凱筌

@KevinKaichuang

4 months

In 1965, Margaret Dayhoff published the Atlas of Protein Sequence and Structure, which collated the 65 proteins whose amino acid sequences were then known. Inspired by that Atlas, today we are releasing the Dayhoff Atlas of protein sequence data and protein language models.

6

91

297

Neil Tenenholtz

@ntenenz

4 months

The Dayhoff Atlas! Open code. Open weights. Open datasets. Thanks @huggingface for helping to facilitate open science. https://t.co/h0WRd1Wu3I @ClementDelangue @julien_c

huggingface.co

Kevin K. Yang 楊凱筌

@KevinKaichuang

4 months

Our models, code, and data are openly available on Github, Zenodo, and Huggingface. https://t.co/6l9iStfqDE https://t.co/T9mGmf2Pgd https://t.co/6X3VPQmJ7I

1

7

23

Neil Tenenholtz

@ntenenz

4 months

To the GPU-poor grad students out there, finding a better predictor of expression is one of the highest leverage contributions you could make to PLM research. Scale isn't always all you need.

Kyle Tretina, Ph.D.

@AllThingsApx

4 months

I was surprised to see that BackboneRef boosts Dayhoff‑170 m pLM generations expressed in E. coli 27.6% → 51.7%, 1.9× with zero filtering ...while common metrics (pLDDT, perplexity) failed to predict wet‑lab outcomes (AUROC ≤ 0.57) This quietly re‑prioritizes how we

0

4

19

Peter Lee

@peteratmsr

4 months

Today in @ScienceMagazine we present BioEmu1.1 @MSFTResearch. It rapidly and accurately emulates equilibrium distributions of protein dynamics at millisecond timescales. Code and datasets available on @Azure Foundry.

Microsoft Research

@MSFTResearch

4 months

Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://t.co/WwKjj5B0eb

5

28

168

Frank Noe

@FrankNoeBerlin

4 months

BioEmu now published in @ScienceMagazine !! What is BioEmu? Check out this video: https://t.co/PAj96iKvR7

Microsoft Research

@MSFTResearch

4 months

Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://t.co/WwKjj5B0eb

15

109

409

Neil Tenenholtz

@ntenenz

4 months

@ClementDelangue @julien_c 🤗 Collections are great, but limiting Papers to arxiv-only leaves out much of the sciences.

0

2

Neil Tenenholtz

@ntenenz

4 months

Raising the bat signal... @ClementDelangue @julien_c

Kevin K. Yang 楊凱筌

@KevinKaichuang

4 months

Getting ready for a release, and I'm kinda sad that @huggingface papers doesn't integrate with @biorxivpreprint

3

0

4