Simona Cristea

@simocristea

9K Followers · 6K Following · 584 Media · 5K Statuses

cancer genomics AI scientist; head of Data Science & AI and group leader @DanaFarber_Hale; research scientist @Harvard; PhD @eth. 🇷🇴🇸🇪🇨🇭🇺🇸

Boston 🇺🇸 & Zurich🇨🇭
Joined January 2016
@simocristea
Simona Cristea
2 months
scRNAseq cell type annotation is notoriously messy. Despite so many algorithms, most researchers still rely on manual annotations using marker genes. In a new preprint accepted at ICML GenAI Bio Workshop, we ask if reasoning LLMs (DeepSeek-R1) can help with cell type annotation🧵
Tweet media one
6
43
201
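To make the idea in the tweet above concrete, here is a minimal sketch of annotating one cluster from its top marker genes with a reasoning LLM. It is not the preprint's actual pipeline; it assumes DeepSeek's OpenAI-compatible API, and the base URL, model name, and marker genes are illustrative placeholders to verify against current documentation.

# Minimal sketch: ask a reasoning LLM for a cell type given a cluster's markers.
# Assumptions: DeepSeek's OpenAI-compatible endpoint and the "deepseek-reasoner"
# model name; both should be checked before use.
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.deepseek.com")

marker_genes = ["CD3D", "CD3E", "IL7R", "CCR7", "LEF1"]  # illustrative cluster markers

prompt = (
    "You are annotating single-cell RNA-seq clusters from human blood.\n"
    f"Top marker genes for this cluster: {', '.join(marker_genes)}.\n"
    "Return the most likely cell type and a one-sentence justification."
)

response = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)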
@simocristea
Simona Cristea
25 days
just realized that most deep domain experts are very similar to reasoning LLMs: very knowledgeable, most of the time right in their judgement, sometimes wrong, but unaware/unwilling to admit they are wrong 🤯.
1
2
13
@simocristea
Simona Cristea
1 month
and they don’t like each other.
@francoisfleuret
François Fleuret
1 month
The AI field is now split into (A) a "traditional" ml/dl domain, and (B) a "psycho-AI" domain where innovation requires an understanding of / intuition about the cognitive capabilities of pretrained models and how to prompt / fine-tune them. These two fields are IMO separated.
0
0
3
@simocristea
Simona Cristea
1 month
first mover advantage 101
Tweet media one
0
0
7
@simocristea
Simona Cristea
1 month
what's the best AI agent for science right now?
9
1
14
@simocristea
Simona Cristea
1 month
RL scaling is bound to hit a wall in real-world tasks because such tasks are designed to be done by many people together & include lots of redundancies & inefficiencies. anybody who has worked among many people knows that it’s seldom the “best” (by a clear metric) solution that wins.
0
0
4
@simocristea
Simona Cristea
2 months
extremely excited to share my lab's research on AI for cancer genomics at the NeurIPS 2025 workshop on multi-modal foundation models in San Diego.
@cmuptx
Pengtao Xie
2 months
We are excited to organize NeurIPS 2025 2nd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences. The workshop features a stellar lineup of invited speakers, including Ziv Bar-Joseph, Charlotte Bunne @_bunnech , Simona Cristea @simocristea ,
Tweet media one
Tweet media two
1
1
28
@simocristea
Simona Cristea
2 months
This R package can transform any plot using color palettes from paintings at the Museum of Modern Art in New York.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
43
181
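The package in the tweet above is R-based; as a language-agnostic illustration of the same idea, the sketch below applies a painting-inspired discrete palette to a plot with Python/matplotlib. The hex values are invented stand-ins, not colors from the actual package.

import matplotlib.pyplot as plt

# Hypothetical painting-inspired palette (stand-in hex values, not the package's).
palette = ["#E54B4B", "#167288", "#F2A541", "#3B3561", "#8CB369"]

fig, ax = plt.subplots()
for i, color in enumerate(palette):
    ax.bar(i, i + 1, color=color)  # one bar per palette color
ax.set_title("Plot recolored with a painting-inspired palette")
plt.show()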
@simocristea
Simona Cristea
2 months
I disagree with a lot of what François says (or used to say), but I think he's super right here.
@fchollet
François Chollet
2 months
In order to supervise an automation tool (or another person!) effectively, you need to be able to do the same job yourself. Doesn't matter if you rarely ever do the job yourself (like a manager who no longer codes), you need to be *able* to do it.
0
0
2
@simocristea
Simona Cristea
2 months
computational biology/bio ML has changed more in the past 2 years than in the last 10. “i have experience with CNNs for images, can I get an AI research job” doesn’t work anymore. if you’re trying to solve a pure prediction problem, you need to constantly keep up with SOTA AI methodology.
0
2
24
@simocristea
Simona Cristea
2 months
I might get this wrong, but why on Earth do people use chatGPT to generate “in-depth” blog posts on a technical topic? Doesn’t this literally defeat the purpose? Like what is the point, what value does this bring to this world, what can one learn/gain from this? I don’t get this.
2
1
13
@simocristea
Simona Cristea
2 months
7. as more & more scRNAseq data is generated, we believe that general-purpose reasoning LLMs such as DeepSeek-R1 can be reliably employed for automating cell type annotation & also multiple other scRNAseq analysis tasks, streamlining these analyses across the scRNAseq community.
Tweet media one
2
0
4
@simocristea
Simona Cristea
2 months
5. reasoning LLMs are better annotators than non-reasoning LLMs (DeepSeek-V3, GPT-4o). 6. overall, the best-performing models in single-tissue real-world situations were the DeepSeek-R1 classifiers, especially when prompted with scGPT's large list of labels to choose from.
1
2
4
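The "classifier" setup mentioned in point 6 above, where the LLM is constrained to pick from a fixed label list such as scGPT's, can be sketched roughly as below. The prompt wording and label subset are assumptions for illustration, not taken from the preprint.

# Rough sketch of "classifier mode": the LLM must choose from a fixed candidate
# label list rather than answer free-form. Labels are an illustrative subset,
# not scGPT's full vocabulary.
candidate_labels = [
    "naive B cell", "memory B cell", "CD4-positive T cell",
    "natural killer cell", "classical monocyte",
]

def build_classifier_prompt(marker_genes, labels):
    return (
        "Annotate this single-cell RNA-seq cluster.\n"
        f"Top marker genes: {', '.join(marker_genes)}.\n"
        "Choose exactly one label from the list below and output only that label:\n"
        + "\n".join(f"- {label}" for label in labels)
    )

print(build_classifier_prompt(["MS4A1", "CD79A", "TCL1A"], candidate_labels))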
@simocristea
Simona Cristea
2 months
4. reasoning LLMs (DeepSeek-R1-0528) are competitive & interpretable cell type annotators. Their performance is sometimes higher & sometimes lower than that of alternative tools. But in most real-life scenarios, & especially on less standard cells, they are as good or even better.
1
0
2
@simocristea
Simona Cristea
2 months
Putting everything together, we found that:
1. benchmarking cell type annotation tools is inherently difficult.
2. cell typing should be both accurate and open to discovery.
3. regardless of the approach, some cells are really easy to annotate, while others are difficult.
1
1
3
@simocristea
Simona Cristea
2 months
We tested 5 more tissues:
- lung: highest accuracy for scTab & highest Macro-F1 for regular DeepSeek-R1
- cerebellum: highest accuracy & Macro-F1 for DeepSeek-R1
- trachea & kidney: highest accuracy & Macro-F1 for scTab
- peripheral retina: highest accuracy for R1 with scGPT classifier
Tweet media one
1
0
1
@simocristea
Simona Cristea
2 months
On breast cells, DeepSeek-R1 in scGPT classifier mode was most accurate, while scTab had the highest Macro-F1. Regular DeepSeek-R1 was particularly poor in accuracy (2.9%), partly due to granularity issues and to mistaking the basal & luminal lineages in cell typing.
Tweet media one
1
0
1
@simocristea
Simona Cristea
2 months
For pancreatic cells, DeepSeek-R1 classifier with access to scGPT labels had 80.9% accuracy (50.7% for regular DeepSeek-R1), while scGPT's accuracy alone was only 3.4%. This further suggests that it is hard for expert models to generalize, while LLMs are highly adaptable.
Tweet media one
1
1
3
@simocristea
Simona Cristea
2 months
In contrast, when explicitly instructed to choose from either scGPT’s or scTab’s labels, DeepSeek-R1 in classifier mode correctly labeled a much larger fraction of blood cells, with even greater granularity than the ground truth data, providing additional biological context.
1
1
2
@simocristea
Simona Cristea
2 months
In blood, all models except DeepSeek-R1 & its classifiers struggled to hit the granularity level of the ground-truth data (e.g. predicted “leukocyte” for the ground-truth “naïve B cell”), suggesting generalization challenges on novel datasets, despite lots of training data.
1
0
1
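The granularity mismatch described above (e.g. “leukocyte” predicted for a “naïve B cell”) is why exact-match metrics can understate a model. A toy illustration, with a hand-written ancestor map standing in for a real cell ontology:

# Toy illustration: an exact-match metric scores "leukocyte" vs "naive B cell"
# as wrong, even though the prediction is merely coarser than the ground truth.
# The ancestor map is a hand-written stand-in for a real ontology.
ANCESTORS = {
    "naive B cell": {"B cell", "lymphocyte", "leukocyte"},
    "memory B cell": {"B cell", "lymphocyte", "leukocyte"},
}

def match_type(prediction, truth):
    if prediction == truth:
        return "exact match"
    if prediction in ANCESTORS.get(truth, set()):
        return "coarser than ground truth"
    return "mismatch"

print(match_type("leukocyte", "naive B cell"))     # coarser than ground truth
print(match_type("naive B cell", "naive B cell"))  # exact match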
@simocristea
Simona Cristea
2 months
Diving into tissue-specific metrics: For blood cells, the two DeepSeek-R1 classifiers had the highest accuracy & Macro-F1, while scTab & scGPT performed poorly, despite almost complete cell type label overlap (99.92%) with scTab & blood being the most frequent tissue in scTab’s training data.
Tweet media one
1
0
1
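For reference on the two metrics reported across the thread: accuracy weights every cell equally, while Macro-F1 averages per-class F1 scores, so rare cell types count as much as abundant ones. A short sketch with made-up labels:

from sklearn.metrics import accuracy_score, f1_score

# Made-up predictions for six cells across three cell types.
y_true = ["naive B cell", "naive B cell", "NK cell", "monocyte", "monocyte", "monocyte"]
y_pred = ["NK cell",      "naive B cell", "NK cell", "monocyte", "monocyte", "monocyte"]

# Accuracy: fraction of cells labeled correctly (5/6 here).
print("accuracy:", accuracy_score(y_true, y_pred))
# Macro-F1: unweighted mean of per-class F1, so the two rare classes matter as
# much as the abundant monocytes.
print("macro-F1:", f1_score(y_true, y_pred, average="macro"))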