 
            
              Luke O'Connor
            
            @Luke0connor
              Followers: 2K · Following: 4K · Media: 21 · Statuses: 864
              Statistical genetics and applied mathematics - Genetic architecture and methods development - Assistant Professor @HarvardDBMI
              
              Cambridge, MA
            
            
              
              Joined April 2016
            
            
           We are excited to share GPN-Star, a cost-effective, biologically grounded genomic language modeling framework that achieves state-of-the-art performance across a wide range of variant effect prediction tasks relevant to human genetics.  https://t.co/FTm3byYp67  (1/n) 
          
                
              
             This project was great fun. Thanks to co-authors and especially to Heng, who co-supervised the project. 
          
                
              
             In the HPRC graph, we find millions of 'non-GRCh38' variants which are difficult to detect using existing approaches. Many localize to segdup regions that are often functional. 
          
                
              
             Remaining edges are 'variant edges'. Different choices of reference tree are possible - mathematically, this is a choice of basis 
          
                
              
             Our approach is to define a 'reference tree', which is a spanning tree of the pangenome graph; it includes all nodes (sequences) but only a subset of edges 
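
The reference tree and variant edges described in this thread can be illustrated on a toy graph. This is only a minimal sketch, not the preprint's method: it assumes a simple connected undirected graph (real pangenome graphs are bidirected sequence graphs), and the function name `reference_tree`, the node labels, and the use of breadth-first search to pick the tree are all hypothetical choices made for illustration.

```python
from collections import deque


def reference_tree(nodes, edges, root):
    """Split the edges of a connected undirected graph into a spanning
    tree (the 'reference tree') and the remaining 'variant edges'.

    Illustrative only: the tree is chosen by breadth-first search from
    `root`; other spanning trees are equally valid choices.
    """
    adjacency = {n: [] for n in nodes}
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    visited = {root}
    in_tree = set()
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in visited:
                visited.add(v)
                in_tree.add(frozenset((u, v)))
                queue.append(v)

    tree = [e for e in edges if frozenset(e) in in_tree]
    variants = [e for e in edges if frozenset(e) not in in_tree]
    return tree, variants


# Toy "bubble": two alternative segments (a, b) between flanks s1 and s2.
nodes = ["s1", "a", "b", "s2"]
edges = [("s1", "a"), ("s1", "b"), ("a", "s2"), ("b", "s2")]

tree, variants = reference_tree(nodes, edges, root="s1")
print(tree)      # spanning tree: touches every node
print(variants)  # leftover edges, one per independent cycle
```

For any connected graph the number of variant edges is `len(edges) - len(nodes) + 1` regardless of which spanning tree is picked; only the identity of the variant edges changes with the tree.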
          
                
              
             A pangenome could improve variant discovery by cataloguing differences between non-reference sequences as well, but it is not obvious how to formalize this intuition 
          
                
              
             With a single reference genome, it is clear what a 'variant' is - a difference vs. the reference 
          
                
              
             New preprint on a surprising question - with a pangenome reference, *what is a genetic variant?*  https://t.co/3CivztB04v  With Pouria Salehi Nowbandani, Shenghan Zhang, Haoyang Hu, and Heng Li @lh3lh3
          
          
            
            biorxiv.org
              Structural variation causes some human haplotypes to align poorly with the linear reference genome, leading to ‘reference bias’. A pangenome reference graph could ameliorate this bias by relating a...
            
                
              
             excited to share our new preprint looking at mosaic chromosomal alterations in blood whole genome sequencing data! i learned tons working on this project, and i hope our findings are of interest to those thinking about CH, somatic mosaicism, and genetics.  https://t.co/HUgGESfBH8 
          
          
            
            medrxiv.org
              Clonal expansions of hematopoietic cells carrying mosaic chromosomal alterations (mCAs) are commonly detectable in elderly individuals. Here, we studied 43,617 autosomal mCAs that we ascertained in...
            
                
              
             It's unusual to write a statgen paper whose main contribution is a definition, as opposed to a finding or a method - but I think we should pay more attention to definitions and their justification. 
          
                
              
             We estimate three of these measures across 36 traits using an existing method (FMR). Depending on which measure you choose, values range from 50 to 500 or from 5k to 100k. 
          
                
              
             We provide five mathematical properties which are definitional - any function satisfying these properties fits our definition 
          
                
              
             We propose a mathematical definition encompassing many specific measures, akin to the many measures of 'mean' (arithmetic, geometric, ...) 
          
                
              
             New preprint with Guy Sella on the question: what is polygenicity?  https://t.co/eYiVfJS3tU 
          
          
            
            biorxiv.org
              The ‘polygenicity’ of traits is often invoked and sometimes quantified in quantitative, statistical, and human genetics. What do we mean by the polygenicity of a trait? We propose a principled...
            
                
              
             Excited to share that Ajay's paper is now out @NatureGenet : Transcriptome-wide analysis of differential expression in perturbation atlases  https://t.co/1PaN98PD7M 
          
           How do genetic perturbations change cells? How are these effects shaped by cell type and dosage? How do we best extract insight from modern massive perturbation atlases? I'm pleased to share a new preprint where we develop a suite of statistical approaches to these Qs (link below) 
            
                
              
             Congrats @NadigAjay on TRADE, out now in @NatureGenet:  https://t.co/eHwDZEIqNd  These statistical metrics enable more meaningful comparisons in Perturb-seq atlases. Also, the HepG2 and Jurkat Perturb-seq datasets are now on GEO (GSE264667)! 
          
            
            nature.com
              Nature Genetics - Transcriptome-wide analysis of differential expression (TRADE) is a broadly applicable tool for characterizing patterns of differential expression across the genome.
            
                
              
             The university will not surrender its independence or relinquish its constitutional rights. Neither Harvard nor any other private university can allow itself to be taken over by the federal government. 
          
                
              
             Excited to share our preprint: Cohort-level analysis of human de novo mutations points to drivers of clonal expansion in spermatogonia! We developed methods to uncover drivers of clonal expansions in sperm (CES) using 55k disease trios & gnomAD SNV data.  https://t.co/M5Ojbdn4Ya 
          
          
            
            medrxiv.org
              In renewing tissues, mutations conferring selective advantage may result in clonal expansions [1–3]. In contrast to somatic tissues, mutations driving clonal expansions in spermatogonia (CES)...
            
                
              
             Super excited to release this new preprint: Jeff and Hakhamanesh drill into key questions about GWAS and rare variant studies: What SNPs and genes do these discover and why? We introduce a concept called SPECIFICITY, which we show is a fundamental determinant of GWAS/RV studies 
          
                
              
             Excited to share our recent work: expansions and contractions of DNA repeats have produced many genetic polymorphisms. We studied repeat instability among >700,000 biobank participants using new computational approaches to analyze DNA sequencing data. 
          
            
            biorxiv.org
              Expansions and contractions of tandem DNA repeats are a source of genetic variation in human populations and in human tissues: some expanded repeats cause inherited disorders, and some are also...
            
                
              
             
             
             
               
             
             
             
            