Kishwar

@kishwarshafin

Followers
1K
Following
987
Media
94
Statuses
557

@kishwar.bsky.social Research Scientist @Google. Interested in ML in genomics. @ucsc alumnus. 🇧🇩 🇺🇸

Mountain View, CA
Joined May 2016
@kishwarshafin
Kishwar
2 years
It’s been an honor talking about our contributions to the pangenome project. The attached YouTube video has the full story. Also this was a full circle moment for me.
@GoogleAI
Google AI
2 years
Google Researchers worked closely with scientists in the Human Pangenome project to improve the pangenome using Google Research's DeepVariant and DeepConsensus methods, which use deep learning to improve the quality of genomics data. Learn more → https://t.co/qMi3xXiwnV
0
4
50
@sundarpichai
Sundar Pichai
1 month
5/ Finally - some exciting progress applying AI in cancer research. Our C2S-Scale 27B foundation model, built with @Yale and Gemma, generated a novel hypothesis about cancer cellular behavior that was validated in living cells and we’ve released the model on GitHub and
21
34
349
@NatureBiotech
Nature Biotechnology
1 month
Accurate somatic small variant discovery for multiple sequencing technologies with DeepSomatic https://t.co/i0nwOl1xBy
2
22
66
@GoogleResearch
Google Research
1 month
DeepSomatic, an AI model developed with @ucscgenomics, identifies cancer cell genetic variants. In research with Children’s Mercy, it found 10 variants in pediatric leukemia cells missed by other tools. DeepSomatic & the CASTLE dataset are now available: https://t.co/m0NVEdkam1
19
120
555
@GoogleResearch
Google Research
1 month
What if we could use AI to identify genetic variants in cancer cells? DeepSomatic uses ML to find key DNA variants with higher accuracy, aiming to create a better understanding of cancer’s underpinnings and support partners in designing more effective, personalized treatments.
12
46
279
@GoogleResearch
Google Research
1 month
For 10 years, Google has worked to accurately read the operating manual of all life on Earth — the genome. Our AI tools are now used by partners for real-world challenges from improving healthcare to biodiversity conservation. Check out the key milestones ↓
39
289
2K
@kishwarshafin
Kishwar
1 month
DeepSomatic is out today in @NatureBiotech, demonstrating outstanding accuracy in all major sequencing platforms for somatic variant detection. Incredible collaboration between @GoogleResearch, @ucscgenomics, @ChildrensMercy, @NCICancerCtrl etc. https://t.co/ab7N4uuhRI
@Google
Google
1 month
Today, @GoogleResearch announced DeepSomatic, a new machine learning model developed with our partners, including @ucscgenomics and @ChildrensMercy, that accurately identifies genetic variants in cancer cells — a critical step for delivering more precise treatments for patients.
0
4
18
@acarroll_ATG
Andrew Carroll
6 months
Release of DeepVariant and DeepSomatic v1.9 DV: Added training on HG002 T2T-Q100. Error reduction of 12% for Illumina and 30% for PacBio on this truth set. 25% faster. DeepTrio is 5x faster (20h -> 4h). DS: New models FFPE_TUMOR_ONLY for {WGS, WES}. Much improved WGS models.
1
26
93
@kishwarshafin
Kishwar
1 year
DeepVariant 1.8 is out. Pangenome is here and it just got ~50% faster.
@acarroll_ATG
Andrew Carroll
1 year
Release of DeepVariant 1.8. Large speed improvement (~67% faster) via small model for easy sites. New Pangenome-aware option. Reduces error by ~30% for vg-mapped WGS ~10% for BWA WGS ~5% BWA exome. New config for custom model users, see release notes 1/3 https://t.co/TQmQElqAOR
0
12
77
@kishwarshafin
Kishwar
1 year
Say hi if you see me.
0
0
10
@pichuan
Pi-Chuan Chang
1 year
Initiated in 2023 by @marianattestad and @_beenkim, this project has been a collaborative effort. For over a year, Atilla and Yuchen dedicated 20% of their time to working with @_beenkim , @acarroll_ATG and myself on this project. It has been a fun exploration!
@GoogleAI
Google AI
1 year
When we train deep learning models for genomics, what do they learn? To help answer this question, we examined the DeepVariant model to determine what insights it has developed, and we discovered some surprising concepts embedded within. Read more at https://t.co/SSUh4EdJVo
2
4
13
@kishwarshafin
Kishwar
1 year
Transformer-based polishing approach DeepPolisher's manuscript is now live. @miramastoras polished 180 assemblies using DeepPolisher for the next human pangenome release. Collaboration with @BenedictPaten @MobinAsri. PM @acarroll_ATG and eng mng @pichuan. https://t.co/lOmGI29ai1
biorxiv.org
Accurate genome assemblies are essential for biological research, but even the highest quality assemblies retain errors caused by the technologies used to construct them. Base-level errors are...
0
16
38
@acarroll_ATG
Andrew Carroll
1 year
Release of DeepSomatic v1.7 ( https://t.co/YpB46RlP89). Now supports tumor-only applications, now supports FFPE-prepared samples with specific models. New models to support exome and ONT. Improved accuracy and runtime improvements.
github.com
DeepSomatic is an analysis pipeline that uses a deep neural network to call somatic variants from tumor-normal and tumor-only sequencing data. - google/deepsomatic
1
28
96
@kishwarshafin
Kishwar
1 year
Finally, all of the models and improvements will be available in the next official DeepSomatic release with the documentation and a link to the manuscript. Currently, you can use the docker mentioned in the manuscript to test the models. 🧵7/7
0
0
0
@kishwarshafin
Kishwar
1 year
Finally, we also extended DeepSomatic's ability to call variants with tumor-only data from WGS, PacBio and ONT sequencing. We also extended to FFPE_WGS and FFPE_WES for tumor-normal variant calling. 🧵6/7
1
1
2
@kishwarshafin
Kishwar
1 year
We re-trained DeepSomatic with all cell lines and with data of varying tumor-normal purities. The new models show significant improvements against the orthogonal truth sets generated to verify the improvement. 🧵5/7
1
0
0
@kishwarshafin
Kishwar
1 year
However, the lack of training sets in the somatic space is a true bottleneck. @jiimiinpaark led the work to develop a training set with five tumor-normal cell lines using three sequencing technologies. Massive effort to have more data in this space. All data is public. 🧵4/7
1
0
0
@kishwarshafin
Kishwar
1 year
We trained our initial model on SEQC2 data ( https://t.co/DiIGaC4MaW) and showed DeepSomatic performs better than existing somatic variant callers. 🧵3/7
1
0
0
@kishwarshafin
Kishwar
1 year
We developed DeepSomatic by making significant changes to the DeepVariant framework. Instead of calling germline variants with genotypes, we trained the models to classify somatic, germline or reference by representing both tumor and normal reads in the example. 🧵2/7
1
0
0
@kishwarshafin
Kishwar
1 year
DeepSomatic ( https://t.co/BqZkDNMv1E) preprint is out showing improvements in somatic variant calling in various platforms. Work led by @jiimiinpaark and @daniel_e_cook. Tumor-only led by @pichuan. In collaboration with @acarroll_ATG, @MishaKolmogorov and @BenedictPaten. 🧵1/7
github.com
DeepSomatic is an analysis pipeline that uses a deep neural network to call somatic variants from tumor-normal and tumor-only sequencing data. - google/deepsomatic
@biorxiv_bioinfo
bioRxiv Bioinfo
1 year
DeepSomatic: Accurate somatic small variant discovery for multiple sequencing technologies https://t.co/l42tuwqoCh #biorxiv_bioinfo
1
18
58
@acarroll_ATG
Andrew Carroll
1 year
Our fast haplotagging paper is now published ( https://t.co/QcCE19MrQn) - see the earlier thread for a summary. Thanks to @kishwarshafin @AlexeyKolesni18 for leading the work and collaborators: @Johngorzynski, @gsneha261, @euanashley, @mitenjain, @khmiga, @BenedictPaten
nature.com
Nature Communications - DNA variant calling methods based on deep neural networks can use local haplotyping information with long-reads to improve genotyping accuracy, however this increases...
@acarroll_ATG
Andrew Carroll
2 years
Happy to share this paper which provides more detail about the process we use to assign haplotypes to long reads on-the-fly, which enabled us to speed up the DeepVariant release at v1.4. Implementation by Alexey Kolesnikov, Lead for collaboration, writing, figures @kishwarshafin
0
13
43