Andrew Carroll Profile
Andrew Carroll

@acarroll_ATG

Followers
4K
Following
3K
Media
108
Statuses
1K

Product lead, Genomics @GoogleHealth.

Mountain View, CA
Joined April 2015
@acarroll_ATG
Andrew Carroll
2 months
RT @HumanPangenome: 📢 HPRC Release 2 is here! Now with phased genomes from 200+ individuals, a 5x increase from Release 1. Explore sequ….
0
20
0
@acarroll_ATG
Andrew Carroll
2 months
Also thanks to 20% contributors: Ben Soudry, Mike Kruskal, Sowmiya Nagarajan, Suchismita Tripathy, Francisco Unda, Vasiliy Strelnikov. And contributions from Sam Yadav and Seraj Ahmad at Roche improving the code for custom model training. DV Release Page.
0
0
2
@acarroll_ATG
Andrew Carroll
2 months
Release led by @kishwarshafin, with contributions by @daniel_e_cook, @AlexeyKolesni18, and Lucas Brambrink; @pichuan is engineering manager. Thanks to student researchers @FaricaZhuang and @MobinAsri. DeepSomatic Release Page.
1
0
3
@acarroll_ATG
Andrew Carroll
2 months
Release of DeepVariant and DeepSomatic v1.9. DV: Added training on HG002 T2T-Q100. Error reduction of 12% for Illumina and 30% for PacBio on this truth set. 25% faster. DeepTrio is 5x faster (20h -> 4h). DS: New models FFPE_TUMOR_ONLY for {WGS, WES}. Much improved WGS models.
1
27
98
@acarroll_ATG
Andrew Carroll
4 months
Nice to see the impact of this important work. This benchmark set (v4.2.1) drove a huge amount of improvement in sequencing. All seq instruments use it to quantify performance. Best credit is to @GenomeInABottle, who drove the field forward both in this work and over a decade.
@CellPressNews
Cell Press
4 months
Benchmarking challenging small variants with linked and long reads by @acarroll_ATG et al. (62 citations). Top-cited genomics research published in @CellGenomics
0
5
20
@acarroll_ATG
Andrew Carroll
5 months
Please also see Twitter summaries by main authors including @vivnat and @alan_karthi. Again, a Trusted Tester access form is available. I hope this can become widely available and easy to use for the scientific community.
@alan_karthi
Alan Karthikesalingam
5 months
Delighted to share our "AI co-scientist" - a multi-agent system built with Gemini-2.0 designed to be a helpful collaborator for researchers and accelerate scientific breakthroughs (especially in biomedicine!) . New ideas that represent leaps forward in science combine human
0
0
4
@acarroll_ATG
Andrew Carroll
5 months
The paper then shows use of the system by collaborating external groups who did experimental validation for various applications. I encourage you to read the full paper for those descriptions, or the Google Research blog summary.
1
0
4
@acarroll_ATG
Andrew Carroll
5 months
To measure whether the Elo tournaments are truly selecting better ideas, a similar process can be applied to external benchmark questions. Elo score correlates with performance on GPQA, a set of graduate-level questions. The best answers improve the GPQA score above that of the base model.
1
0
1
@acarroll_ATG
Andrew Carroll
5 months
One core concept is idea tournaments: Gemini models compare two research ideas to each other, identifying strengths, weaknesses, and a preference between the two. Ideas are given an Elo rating and are iteratively refined. The best ideas can surpass thinking models.
1
0
2
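The pairwise idea tournament described above can be sketched with the standard Elo update. This is a minimal illustration only, not the co-scientist implementation: the function names, the starting rating of 1200, the K-factor of 32, and the `prefer` callback (a stand-in for a model judging two ideas) are all assumptions made for the example.

```python
import itertools

def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a: float, rating_b: float, a_wins: bool, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_wins else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

def run_tournament(ideas, prefer, rounds: int = 1, start: float = 1200.0, k: float = 32.0):
    """Compare every pair of ideas; `prefer(a, b)` is a hypothetical judge
    that returns True when idea `a` is preferred over idea `b`."""
    ratings = {idea: start for idea in ideas}
    for _ in range(rounds):
        for a, b in itertools.combinations(ideas, 2):
            ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], prefer(a, b), k)
    return ratings

if __name__ == "__main__":
    # Toy judge (assumption): prefer the longer idea text.
    toy = run_tournament(["a", "bbb", "cc"], prefer=lambda a, b: len(a) > len(b))
    for idea, rating in sorted(toy.items(), key=lambda kv: -kv[1]):
        print(idea, round(rating, 1))
```

In a real setting the judge would be a model comparison rather than a length check, and ideas would be refined between rounds; the rating arithmetic stays the same.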
@acarroll_ATG
Andrew Carroll
5 months
Co-scientist extends the concepts of agents and test-time compute. From a natural language query asking for a research goal, a set of agents perform background research, identify limits in current knowledge, propose research directions, evaluate those, and draft experiment plans.
1
1
3
@acarroll_ATG
Andrew Carroll
5 months
I've been impressed as a user of this system, and I am eager for other scientists to get their hands on it and to see feedback on the system and how to improve it. There's a form to apply for Trusted Tester access. A few notes from the paper follow.
@emollick
Ethan Mollick
5 months
We are starting to see what "AI will accelerate science" actually looks like. This Google paper describes novel discoveries being made by AI working with human co-scientists (something I think we have all been waiting to see), along with an early version of an AI scientist.
3
10
55
@acarroll_ATG
Andrew Carroll
5 months
RT @vivnat: Accelerating scientific discoveries and helping cure diseases might be the most profound purpose of AI. Thrilled to introduce….
0
35
0
@acarroll_ATG
Andrew Carroll
7 months
Release led by DeepVariant tech lead @kishwarshafin. Team Engineering manager @pichuan. Small model work led by Lucas Brambrink. Pangenome-aware led by Mobin Asri and Juan Carlos Mier. Fast pipeline by @AlexeyKolesni18. Mas-Seq model by @daniel_e_cook and Shiyi Yin from Verily.
0
0
4
@acarroll_ATG
Andrew Carroll
7 months
Added SPRQ to PacBio training, reducing Indel error on SPRQ by 26%. Added Platinum Pedigree training data for PacBio model, reducing errors by 34% on more extensive Platinum truth. New model and case study for Kinnex/Mas-Seq/Iso-Seq. Additional speed options for GPU pipelines 2/3
1
0
5
@acarroll_ATG
Andrew Carroll
7 months
Release of DeepVariant 1.8. Large speed improvement (~67% faster) via small model for easy sites. New Pangenome-aware option. Reduces error by ~30% for vg-mapped WGS, ~10% for BWA WGS, and ~5% for BWA exome. New config for custom model users, see release notes 1/3.
1
57
162
@acarroll_ATG
Andrew Carroll
9 months
Huge thanks to Yuchen Zhou and Atilla Kiraly, who drove these investigations and did all of the work; to @_beenkim, an explainability expert, and DeepVariant eng manager @pichuan, who directed the team; and to @marianattestad, who started the explainability work with @_beenkim.
0
1
4
@acarroll_ATG
Andrew Carroll
9 months
In training, DeepVariant never sees annotations about what a mosaic variant is or what a LINE is. That these are identifiable in the embedding space implies concepts relevant to them are learned. It might be possible to give information about embeddings to help interpret info.
1
3
9
@acarroll_ATG
Andrew Carroll
9 months
In another cluster, we realized that the windows seemed to be mostly in LINEs based on the genome tracks. We used the UCSC genome annotations to label all examples by whether they overlap LINEs. We observe two clusters that are ~80% and ~70% LINEs.
1
0
4
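The LINE labeling step described above amounts to an interval-overlap check followed by a per-cluster tally. A minimal sketch, assuming half-open (BED-style) coordinates and hypothetical tuple layouts for the examples and the annotation intervals; none of these names come from DeepVariant or the UCSC tools, and the data below is toy data.

```python
from collections import defaultdict

def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True when two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def line_fraction_by_cluster(examples, line_intervals):
    """examples: iterable of (cluster_id, chrom, start, end) windows.
    line_intervals: iterable of (chrom, start, end) LINE annotations.
    Returns {cluster_id: fraction of its windows overlapping any LINE}."""
    by_chrom = defaultdict(list)
    for chrom, start, end in line_intervals:
        by_chrom[chrom].append((start, end))

    totals = defaultdict(int)
    hits = defaultdict(int)
    for cluster, chrom, start, end in examples:
        totals[cluster] += 1
        # Linear scan per window; fine for a sketch, slow for whole genomes.
        if any(overlaps(start, end, s, e) for s, e in by_chrom[chrom]):
            hits[cluster] += 1
    return {c: hits[c] / totals[c] for c in totals}

if __name__ == "__main__":
    toy_lines = [("chr1", 100, 200), ("chr2", 50, 80)]
    toy_examples = [
        (0, "chr1", 150, 160),  # inside a LINE
        (0, "chr1", 190, 210),  # partially overlaps a LINE
        (0, "chr1", 300, 310),  # no overlap
        (1, "chr2", 80, 90),    # touches the end only: no overlap (half-open)
    ]
    print(line_fraction_by_cluster(toy_examples, toy_lines))
```

For genome-scale annotation sets, an interval tree or a sorted sweep would replace the linear scan.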
@acarroll_ATG
Andrew Carroll
9 months
We realized they could be mosaic variant positions. Fortunately, NIST @GenomeInABottle released a set of annotated mosaics in the cell lines we were looking at. Plotting those examples revealed they overlap the cluster we inspected.
1
2
2
@acarroll_ATG
Andrew Carroll
9 months
Looking at those clusters revealed interesting patterns. We could see lowMAPQ, copynumber=3 examples. We narrowed the investigation to a few. One consistently had examples that looked like the one below. This didn't seem too hard: it looks HET, but with unlucky sampling of the ALT allele.
Tweet media one
1
1
1