PM @pashadag@genomic.social @pashadag.bsky.social Profile

@pashadag

Followers
1K
Following
2K
Media
26
Statuses
2K

Algorithmic Bioinformatics Researcher and Teacher. Tweets about research results and educational/mentorship topics (for details, see https://t.co/FUl9nGstuH).

University Park, Pennsylvania
Joined April 2009
@RayanChikhi
Rayan Chikhi
5 days
🌎👩‍🔬 For 15+ years biology has accumulated petabytes (million gigabytes) of 🧬DNA sequencing data🧬 from the far reaches of our planet. 🦠🍄🌵 Logan now democratizes efficient access to the world’s most comprehensive genetics dataset. Free and open. https://t.co/dDBtAjfdYL
5
145
362
@OrensteinYaron
Yaron Orenstein
2 months
🚨 Attention US PhDs in Bioinformatics: Interested in a postdoc in my group at Bar-Ilan University 🇮🇱 through the Fulbright Israel fellowship (2026–2027)? 🧬 2 years of research 💰 $95,000 stipend + relocation + benefits 📅 Apply by 15/9 🔗 https://t.co/iXoYWEEaO3 ✉️ Email me!
0
2
3
@AlessioCampa_
Alessio Campanelli
2 months
With @giulio_pibiri and @nomad421, we are about to release Fulgor v4.0.0, which is much faster to build and query, without affecting its memory efficiency. You can find all the information in our #WABI2025 paper ( https://t.co/foLYdhsC2Z). 🧵 (1/7)
1
7
12
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
4/n The traditional "repeat-oblivious" estimator can *overestimate mutation rates by an order of magnitude* on repetitive data. In contrast, the new estimator remains accurate across a broad range of rates and repetitive sequences (e.g. RBMY gene, α-satellite centromeres).
1
0
2
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
3/n Capturing the full repeat structure in the estimator is pretty hard and possibly not even needed. Instead, we account for the most pertinent part of the repeat structure in the estimator and the rest of the structure is accounted for in the bias formula.
1
0
2
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
2/n Tools such as Mash estimate the mutation rate via k-mer Jaccard similarity, assuming *non-repetitive* sequences. But in highly repetitive regions (e.g., α-satellite DNA), these estimates break down. We derive a novel estimator by relaxing the non-repetitive assumption.
1
0
2
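For readers unfamiliar with the approach described in 2/n, here is a minimal sketch of a Mash-style estimate: compute the Jaccard similarity J of two k-mer sets, then convert it to a rate with D = -(1/k) ln(2J / (1 + J)). The function names are illustrative, not taken from Mash or from the preprint, and the sketch uses exact k-mer sets rather than the MinHash sketches Mash uses in practice. It is the repeat-oblivious baseline, not the new estimator.

```python
import math


def kmers(seq, k):
    """Return the set of distinct k-mers in seq (multiplicity is discarded)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_distance(j, k):
    """Mash-style rate estimate from Jaccard similarity j and k-mer size k.

    Assumes non-repetitive sequences, which is the assumption the thread
    says breaks down in repeats.
    """
    if j <= 0.0:
        return 1.0  # no shared k-mers: cap the estimate at 1
    return -math.log(2.0 * j / (1.0 + j)) / k


if __name__ == "__main__":
    # Hypothetical toy sequences that differ by one substitution.
    s, t = "ACGTTGCAAGGCTA", "ACGTTGCTAGGCTA"
    k = 4
    print(mash_distance(jaccard(kmers(s, k), kmers(t, k)), k))
```

Note that `kmers("ACGTACGT", 3)` has six k-mer positions but only four distinct k-mers. Working with sets discards that multiplicity, which is one way to see why repetitive sequences violate the assumption behind this estimate.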
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
🧵1/n Estimating mutation rates using k-mers is fast, but what happens when repeats dominate the genome? In a new preprint, @HaonanWu_1998, Antonio Blanca, and I propose a *repeat-aware* estimator that's accurate even in centromeres.
@biorxiv_bioinfo
bioRxiv Bioinfo
3 months
A k-mer-based estimator of the substitution rate between repetitive sequences https://t.co/0HZxXkVWRv #biorxiv_bioinfo
1
14
24
@mourisl
Li Song
3 months
Our lab is hiring! We are looking for a postdoc in the area of immunology/microbiology+ML, or in pure method/software development. More information about our lab at:
mourisl.github.io
Li Song Lab
1
19
36
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
5/4 This is a draft manuscript and we hope to receive feedback from the community. You can submit a GitHub issue using https://t.co/m8QFDSHhFi or email the authors privately.
0
1
2
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
4/4 We further clarify common misconceptions, e.g. the confusion between uniformity and regularity, and the discrepancy between the original SimHash for vectors and the folklore version commonly used for estimating similarities among sets.
1
1
2
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
3/4 We propose a categorization of hashing methods based on their properties, design goals, and application context.
1
1
1
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
2/4 We provide a comprehensive overview of hash functions used in genomics. Hashing is central to many genomic tasks, but we found no good treatment that describes the wide variety of hash functions employed in these applications.
1
1
1
@pashadag
PM @pashadag@genomic.social @pashadag.bsky.social
3 months
1/4 Hash functions in genomic sequence analysis (https://t.co/6sP8TF6LxQ): a new survey written together with @shaomingfu, @kanatos92, @xianglipsu, and Qian Shi. Before submitting it, we are posting it online to get feedback from the community.
dropbox.com
Shared with Dropbox
1
19
40
@CBCB_UMD
CBCB_UMD
3 months
The Wonderful Algorithms in Bioinformatics (WABI) conference 2025 will be held @UofMaryland, College Park in August. WABI brings together researchers working on algorithmic aspects of computational biology. Learn more: https://t.co/jtITU2HeLn Register: https://t.co/ZybzovEWXN
0
3
4
@RayanChikhi
Rayan Chikhi
3 months
Slides from my talk (with Kamil Jaron) on a history of k-mers in bioinformatics:
1
31
87
@StevenSalzberg1
Steven Salzberg 💙💛
6 months
See our new paper on discovering extensive conservation of human introns across hundreds of other mammals, with Ilia Minkin
academic.oup.com
Abstract. Despite many improvements over the years, the annotation of the human genome remains imperfect. The use of evolutionarily conserved sequences pro
1
26
91
@shenwei356
Wei Shen 沈 伟
6 months
Thrilled to finally debut my lab site 🎉: https://t.co/Z6kJpccFta, which stands for Microbial Bioinformatics, a domain I purchased 9 years ago that has been waiting for this moment ever since!
16
32
257
@chirgjain
Chirag Jain
6 months
The 3rd annual Symposium on Big Data Algorithms for Biology (BDBio) is happening this May! Join us for insightful talks and discussions on comp-bio topics. Register, submit an abstract, and see program details at https://t.co/j3Li1tca2i @iiscbangalore @cdsiisc @CBR_IISc
0
8
14