
Yusuf Roohani
@yusufroohani
Followers
2K
Following
2K
Media
60
Statuses
356
Machine Learning & Systems Biology. ML Group Leader @arcinstitute. PhD @StanfordAILab
Palo Alto, CA
Joined May 2009
Cells are dynamic, messy and context dependent. Scaling models across diverse states needs flexibility to capture heterogeneity. Introducing State, a transformer that predicts perturbation effects by training over sets of cells. Team effort led by the unstoppable @abhinadduri
11
60
330
RT @zavaindar: the most impactful things you can do with your life right now. 1 accelerate ai. 2 accelerate bio. 3 end genocide.
0
2
0
If you're interested in putting the pieces together and building this out, then please reach out! Consider applying as a research scientist: I also always have open positions for students on my team, just email me!
job-boards.greenhouse.io
Palo Alto, CA
0
2
10
As I discussed with @ElliotHershberg earlier, we have a simple goal: make the existing models good enough that experimentalists adopt and use them. Like the “GPT Moment”, this may not require any semblance of perfection.
What Are Virtual Cells? Two months ago, I started working on an essay to answer this question. The goal was to cover a few of the recent research results. Simple as that! Instead, I went down a rabbit hole exploring ideas around cellular simulation
1
0
5
4. Finally, we launched the Virtual Cell Challenge, a first step toward formalizing progress on this vision and inviting the community to help chart the path forward. How do we measure progress? What are the key tasks, metrics, datasets, and modalities?
How close are we to simulating living cells? Today @arcinstitute is launching the Virtual Cell Challenge (VCC), an annual competition for evaluating progress towards a virtual cell. Read our @CellPressNews commentary
1
0
1
3. To enable scalable learning of cell behavior from these datasets, we developed State: a transformer that predicts perturbation effects while accounting for cellular heterogeneity within and across experiments.
Cells are dynamic, messy and context dependent. Scaling models across diverse states needs flexibility to capture heterogeneity. Introducing State, a transformer that predicts perturbation effects by training over sets of cells. Team effort led by the unstoppable @abhinadduri
1
0
1
2. We built an AI agent to guide generation of new data thru designing perturbation experiments. BioDiscoveryAgent outperforms baselines on detecting novel hits including TF combo screens. We further optimized the use case when helping build @ProjectBiomni.
Our work applying AI agents to the design of genetic perturbation experiments will be presented at ICLR 2025! 🇸🇬 We explored the use of LLMs for autonomous design of biological experiments, achieving improved accuracy, interpretability and robustness over existing approaches. 🧵
1
0
1
1. Using AI agents, we curated scBaseCount, the largest resource of single-cell RNAseq data. Uniform processing = reduced artifacts, more info (intronic reads, noncoding genes). Daily updates = we’ve grown by a LOT since our last release, stay tuned!
At Arc we are building AI models of cell state from the ground up, rethinking every step, from data generation to biologically relevant evaluation. Today we launch scBaseCamp, the largest public repository of single cell RNAseq data, uniformly processed from raw sequencing reads.
1
0
1
The virtual cell is a longstanding vision, a tool to guide experiment design, understand function. My work uses AI to build a platform for engineering cell state, a capability needed to realize this vision. Come learn about our model development + AI-guided data generation 1/8 🧵
Following the amazing talk today by Emma Dann we are excited to host @yusufroohani of the @arcinstitute for a special seminar this Thursday @ 10AM, presenting the efforts towards a virtual cell platform🔮
2
19
142
RT @fleetwood___: Dropped the Virtual Cell Challenge Primer on HF. We are shipping transformers support for STATE (the SOTA model for pre…
0
8
0
RT @HannesStaerk: In 1h we discuss "Predicting cellular responses to perturbation across diverse contexts with State" from @arcinstitute in…
0
8
0
RT @KexinHuang5: 🧬 Excited to open-source Biomni! With just a few lines of code, you can now automate biomedical research with AI agent! W…
0
101
0
RT @pdhsu: Our Virtual Cell Challenge commentary is one of the most read @CellCellPress articles for the last month, alongside Weinberg's c…
0
7
0
RT @abhinadduri: We updated the State Embedding 600M checkpoint on the @ArcInstitute Hugging Face. This model was trained with 4x FLOPs com…
0
8
0
RT @fleetwood___: Last week @arcinstitute released the Virtual Cell Challenge 🧬. The goal is to train a model capable of simulating a cell. …
0
37
0
“[We have] a much simpler goal: make the existing models good enough that experimentalists adopt and use them. Like the “GPT Moment,” this may not require any semblance of perfection.” Was a pleasure speaking with Elliott! Great take on the field and how State is changing things.
What Are Virtual Cells? Two months ago, I started working on an essay to answer this question. The goal was to cover a few of the recent research results. Simple as that! Instead, I went down a rabbit hole exploring ideas around cellular simulation
3
2
25
RT @KexinHuang5: One-month update of Biomni ⬇️ Excited to see how Biomni has automated 15K+ research tasks for biologists!
0
8
0
Check out the implementation of these metrics and more in cell-eval.
github.com
Comprehensive suite for evaluating perturbation prediction models - ArcInstitute/cell-eval
0
1
7
Pert score measures how well a model can distinguish b/w perturbation effects. Using rank-based similarity, it measures how similar the predicted effect for a specific pert. is to the true effect. We adapted the original metric introduced by @yauwning et al. using an L1 distance.
1
0
2
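The rank-based idea behind the pert score can be illustrated with a rough sketch. This is not the cell-eval implementation: the function name `pert_discrimination_scores`, the input layout (one mean effect vector per perturbation) and the normalization by the number of perturbations are all assumptions made for illustration. For each perturbation, it ranks the matching true effect among all true effects by L1 distance to the prediction, so 0 means the prediction is closest to its own perturbation.

```python
import numpy as np


def pert_discrimination_scores(pred, true):
    """Hypothetical sketch of a rank-based perturbation score (not cell-eval's API).

    pred, true: arrays of shape (n_perts, n_genes); row i holds the predicted
    and the observed effect of perturbation i (an assumed layout).
    Returns one value per perturbation in [0, 1); lower is better.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)

    # L1 distance from every predicted effect (rows) to every true effect (columns).
    dists = np.abs(pred[:, None, :] - true[None, :, :]).sum(axis=2)

    # Distance from each prediction to the true effect of the same perturbation.
    matched = np.diag(dists)

    # Rank = number of true effects strictly closer to the prediction
    # than the matching one.
    ranks = (dists < matched[:, None]).sum(axis=1)

    # Assumed normalization: divide by the number of perturbations.
    return ranks / pred.shape[0]


if __name__ == "__main__":
    true_effects = np.eye(3)               # three toy perturbations, three genes
    predictions = true_effects[[1, 0, 2]]  # first two predictions are swapped
    print(pert_discrimination_scores(predictions, true_effects))
```

Averaging these per-perturbation values gives a single summary number. Whether to report the raw normalized rank or one minus it is a convention choice that this sketch leaves open.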