Andrew Carroll @acarroll_ATG X Profile

Andrew Carroll

@acarroll_ATG

Followers

4K

Following

3K

Media

108

Statuses

1K

Product lead, Genomics @GoogleHealth.

Mountain View, CA

Joined April 2015

Andrew Carroll

@acarroll_ATG

2 months

RT @HumanPangenome: 📢 HPRC Release 2 is here! . Now with phased genomes from 200+ individuals, a 5x increase from Release 1. Explore sequ….

0

20

0

Andrew Carroll

@acarroll_ATG

2 months

Also thanks to 20% contributors: Ben Soudry, Mike Kruskal, Sowmiya Nagarajan, Suchismita Tripathy, Francisco Unda, Vasiliy Strelnikov. And contributions from Sam Yadav and Seraj Ahmad at Roche improving the code for custom model training. DV Release Page.

0

2

Andrew Carroll

@acarroll_ATG

2 months

Release led by @kishwarshafin, with contributions by @daniel_e_cook, @AlexeyKolesni18, and Lucas Brambrink; @pichuan is engineering manager. Thanks to student researchers @FaricaZhuang and @MobinAsri. DeepSomatic Release Page.

1

0

3

Andrew Carroll

@acarroll_ATG

2 months

Release of DeepVariant and DeepSomatic v1.9. DV: Added training on HG002 T2T-Q100. Error reduction of 12% for Illumina and 30% for PacBio on this truth set. 25% faster. DeepTrio is 5x faster (20h -> 4h). DS: New models FFPE_TUMOR_ONLY for {WGS, WES}. Much improved WGS models.

1

27

98

Andrew Carroll

@acarroll_ATG

4 months

Nice to see the impact of this important work. This benchmark set (v4.2.1) drove a huge amount of improvement in sequencing. All sequencing instruments use it to quantify performance. Most of the credit goes to @GenomeInABottle, who drove the field forward both in this work and over a decade.

Cell Press

@CellPressNews

4 months

Benchmarking challenging small variants with linked and long reads by @acarroll_ATG et al. (62 citations). Top-cited genomics research published in @CellGenomics.

0

5

20

Andrew Carroll

@acarroll_ATG

5 months

Please also see the Twitter summaries by the main authors, including @vivnat and @alan_karthi. Again, see the Trusted Tester access form. I hope this can become widely available and easy to use for the scientific community.

Alan Karthikesalingam

@alan_karthi

5 months

Delighted to share our "AI co-scientist" - a multi-agent system built with Gemini-2.0 designed to be a helpful collaborator for researchers and accelerate scientific breakthroughs (especially in biomedicine!) . New ideas that represent leaps forward in science combine human

0

4

Andrew Carroll

@acarroll_ATG

5 months

The paper then shows use of the system by collaborating external groups who did experimental validation for various applications. I encourage you to read the full paper for those descriptions, or the Google Research blog summary.

1

0

4

Andrew Carroll

@acarroll_ATG

5 months

To measure whether the Elo tournaments are truly selecting better ideas, a similar process can be applied to external benchmark questions. Elo score correlates with performance on GPQA, a set of graduate-level questions. The best answers improve the GPQA score above the base model.

1

0

1

Andrew Carroll

@acarroll_ATG

5 months

One core concept is idea tournaments - Gemini models compare two research ideas to each other, identifying strengths, weaknesses, and a preference between the two. Ideas are given an Elo ranking and are iteratively refined. The best ideas can surpass thinking models.

1

0

2
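The idea tournament described in the tweet above can be illustrated with a minimal sketch. This is not the co-scientist implementation: the standard Elo update formula, the starting rating of 1200, the K-factor of 32, the round-robin pairing, and the `toy_prefer` judge (a stand-in for a model comparing two ideas) are all assumptions made for illustration.

```python
import itertools


def expected_score(rating_a, rating_b):
    """Standard Elo expected score of A against B (assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, a_wins, k=32.0):
    """Return updated (rating_a, rating_b) after one pairwise comparison."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_wins else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    # B's change mirrors A's, so the total rating is conserved.
    new_b = rating_b - k * (score_a - exp_a)
    return new_a, new_b


def run_tournament(ideas, prefer, initial=1200.0, k=32.0):
    """Compare every pair of ideas once and return a rating per idea.

    `prefer(a, b)` is a hypothetical judge returning True if idea `a`
    is preferred over idea `b`.
    """
    ratings = {idea: initial for idea in ideas}
    for a, b in itertools.combinations(ideas, 2):
        a_wins = prefer(a, b)
        ratings[a], ratings[b] = update_elo(ratings[a], ratings[b], a_wins, k)
    return ratings


def toy_prefer(a, b):
    # Hypothetical stand-in judge: prefers the longer description.
    return len(a) >= len(b)


ideas = ["short", "a longer idea", "the most detailed idea of all"]
ratings = run_tournament(ideas, toy_prefer)
ranked = sorted(ratings, key=ratings.get, reverse=True)
```

In a real system the judge would be a model call and the top-ranked ideas would be fed back for refinement; here `ranked` simply orders the toy ideas by final rating.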

Andrew Carroll

@acarroll_ATG

5 months

Co-scientist extends the concepts of agents and test-time compute. From a natural language query asking for a research goal, a set of agents perform background research, identify limits in current knowledge, propose research directions, evaluate those, and draft experiment plans.

1

3

Andrew Carroll

@acarroll_ATG

5 months

I've been impressed as a user of this system, and I am eager for other scientists to get their hands on it and to see feedback on the system and how to improve it. There's a form to apply for Trusted Tester access. A few notes from the paper:

Ethan Mollick

@emollick

5 months

We are starting to see what "AI will accelerate science" actually looks like. This Google paper describes novel discoveries being made by AI working with human co-scientists (something I think we have all been waiting to see), along with an early version of an AI scientist.

3

10

55

Andrew Carroll

@acarroll_ATG

5 months

RT @vivnat: Accelerating scientific discoveries and helping cure diseases might be the most profound purpose of AI. Thrilled to introduce….

0

35

0

Andrew Carroll

@acarroll_ATG

7 months

Release led by DeepVariant tech lead @kishwarshafin. Team engineering manager @pichuan. Small model work led by Lucas Brambrink. Pangenome-aware work led by Mobin Asri and Juan Carlos Mier. Fast pipeline by @AlexeyKolesni18. MAS-Seq model by @daniel_e_cook and Shiyi Yin from Verily.

0

4

Andrew Carroll

@acarroll_ATG

7 months

Added SPRQ to PacBio training, reducing indel error on SPRQ by 26%. Added Platinum Pedigree training data for PacBio model, reducing errors by 34% on more extensive Platinum truth. New model and case study for Kinnex/MAS-Seq/Iso-Seq. Additional speed options for GPU pipelines. 2/3

1

0

5

Andrew Carroll

@acarroll_ATG

7 months

Release of DeepVariant 1.8. Large speed improvement (~67% faster) via small model for easy sites. New Pangenome-aware option. Reduces error by ~30% for vg-mapped WGS, ~10% for BWA WGS, and ~5% for BWA exome. New config for custom model users, see release notes. 1/3

1

57

162

Andrew Carroll

@acarroll_ATG

9 months

Huge thanks to Yuchen Zhou and Atilla Kiraly, who drove these investigations and did all of the work; @_beenkim, an explainability expert, and DeepVariant eng manager @pichuan, who directed the team; and @marianattestad, who started the explainability work with @_beenkim.

0

1

4

Andrew Carroll

@acarroll_ATG

9 months

In training, DeepVariant never sees annotations about what a mosaic variant is or what a LINE is. That these are identifiable in the embedding space implies concepts relevant to them are learned. It might be possible to give information about embeddings to help interpret info.

1

3

9

Andrew Carroll

@acarroll_ATG

9 months

In another cluster, we realized that the windows seemed to be mostly in LINEs based on the genome tracks. We used the UCSC genome annotations to label all examples by whether they overlap LINEs. We observe two clusters that are ~80% and ~70% LINEs.

1

0

4
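The labeling step in the tweet above (marking each example by whether it overlaps a LINE, then checking what fraction of each cluster is LINE-overlapping) can be sketched as follows. This is a minimal illustration, not the team's code: the half-open `(chrom, start, end)` interval convention, the function names, and the toy coordinates are all hypothetical, and real LINE intervals would come from the UCSC annotations rather than a hand-written list.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def line_fraction_by_cluster(examples, line_intervals):
    """Fraction of examples in each cluster that overlap any LINE interval.

    `examples` is a list of (cluster_id, (chrom, start, end)) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for cluster_id, window in examples:
        totals[cluster_id] += 1
        if any(overlaps(window, line) for line in line_intervals):
            hits[cluster_id] += 1
    return {c: hits[c] / totals[c] for c in totals}


# Toy data, invented purely for illustration.
line_intervals = [("chr1", 100, 200), ("chr2", 500, 900)]
examples = [
    (0, ("chr1", 150, 350)),   # overlaps the chr1 LINE
    (0, ("chr1", 190, 390)),   # overlaps the chr1 LINE
    (0, ("chr1", 300, 500)),   # no overlap
    (1, ("chr2", 10, 210)),    # no overlap
    (1, ("chr2", 850, 1050)),  # overlaps the chr2 LINE
]
fractions = line_fraction_by_cluster(examples, line_intervals)
```

The linear scan over `line_intervals` is fine for a toy list; for genome-scale annotations an interval tree or sorted sweep would be the usual choice.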

Andrew Carroll

@acarroll_ATG

9 months

We realized they could be mosaic variant positions. Fortunately, NIST @GenomeInABottle released a set of annotated mosaics in the cell lines we were looking at. Plotting those examples revealed they overlap the cluster we inspected.

1

2

Andrew Carroll

@acarroll_ATG

9 months

Looking at those clusters revealed interesting patterns. We could see low-MAPQ and copy-number=3 examples. We narrowed the investigation to a few clusters. One consistently had examples which looked like below. This didn’t seem too hard - it looks HET, just unlucky in sampling the ALT allele.

1