
PM @[email protected] @pashadag.bsky.social
@pashadag
Followers
1K
Following
2K
Media
26
Statuses
2K
Algorithmic Bioinformatics Researcher and Teacher. Tweets about research results and educational/mentorship topics (for details, see https://t.co/FUl9nGstuH).
University Park, Pennsylvania
Joined April 2009
🌎👩‍🔬 For 15+ years biology has accumulated petabytes (million gigabytes) of 🧬 DNA sequencing data 🧬 from the far reaches of our planet. 🦠🍄🌵 Logan now democratizes efficient access to the world’s most comprehensive genetics dataset. Free and open. https://t.co/dDBtAjfdYL
5
145
362
🚨 Attention US PhDs in Bioinformatics: Interested in a postdoc in my group at Bar-Ilan University 🇮🇱 through the Fulbright Israel fellowship (2026–2027)? 🧬 2 years of research 💰 $95,000 stipend + relocation + benefits 📅 Apply by 15/9 🔗 https://t.co/iXoYWEEaO3 ✉️ Email me!
0
2
3
With @giulio_pibiri and @nomad421, we are about to release Fulgor v4.0.0, which is much faster to build and query, without affecting its memory efficiency. You can find all the information in our #WABI2025 paper ( https://t.co/foLYdhsC2Z). 🧵 (1/7)
1
7
12
5/n The code and paper are available: 🔗 paper: https://t.co/79Ety1S3Uk 🔗 code:
github.com
This tool estimates substitution rate based on k-spectrum - medvedevgroup/Repeat-Aware_Substitution_Rate_Estimator
0
0
6
4/n The traditional "repeat-oblivious" estimator can *overestimate mutation rates by an order of magnitude* on repetitive data. In contrast, the new estimator remains accurate across a broad range of rates and repetitive sequences (e.g. RBMY gene, α-satellite centromeres).
1
0
2
3/n Capturing the full repeat structure in the estimator is pretty hard and possibly not even needed. Instead, we account for the most pertinent part of the repeat structure in the estimator and the rest of the structure is accounted for in the bias formula.
1
0
2
2/n Tools such as Mash estimate the mutation rate via k-mer Jaccard similarity, assuming *non-repetitive* sequences. But in highly repetitive regions (e.g., α-satellite DNA), these estimates break down. We derive a novel estimator by relaxing the non-repetitive assumption.
1
0
2
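For readers unfamiliar with the repeat-oblivious baseline mentioned in 2/n, here is a minimal sketch of the Mash-style estimate. It assumes exact k-mer sets rather than the MinHash sketches Mash actually uses, and the function names are hypothetical. It is not the repeat-aware estimator from the preprint.

```python
import math


def kmers(seq, k):
    """Return the set of distinct k-mers in seq (repeated k-mers collapse)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard similarity of two k-mer sets (Mash estimates this with MinHash)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_distance(j, k):
    """Mash distance D = -(1/k) * ln(2j / (1 + j)), used as a mutation-rate estimate.

    The formula assumes non-repetitive sequences, which is the assumption
    the thread says breaks down in regions such as alpha-satellite DNA.
    """
    if j <= 0.0:
        return 1.0  # no shared k-mers: Mash reports the maximum distance
    return max(0.0, -math.log(2.0 * j / (1.0 + j)) / k)


if __name__ == "__main__":
    # Toy sequences, made up for illustration only.
    a = "ACGTTGCAAGGCTTAACCGGTATCGA"
    b = "ACGTTGCAAGGCTAAACCGGTATCGA"  # one substitution relative to a
    k = 5
    j = jaccard(kmers(a, k), kmers(b, k))
    print(f"Jaccard = {j:.3f}, estimated rate = {mash_distance(j, k):.3f}")
```

Because `kmers` returns a set, a tandem repeat contributes each distinct k-mer only once, which is one way to see why a set-based similarity loses information on repetitive input.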
🧵1/n Estimating mutation rates using k-mers is fast, but what happens when repeats dominate the genome? In a new preprint, @HaonanWu_1998, Antonio Blanca, and I propose a *repeat-aware* estimator that's accurate even in centromeres.
A k-mer-based estimator of the substitution rate between repetitive sequences https://t.co/0HZxXkVWRv
#biorxiv_bioinfo
1
14
24
Our lab is hiring! We are looking for a postdoc in the area of immunology/microbiology+ML, or in pure method/software development. More information about our lab at:
mourisl.github.io
Li Song Lab
1
19
36
5/4 This is a draft manuscript and we hope to receive feedback from the community. You can submit a GitHub issue using https://t.co/m8QFDSHhFi or email the authors privately
0
1
2
4/4 We further clarify common misconceptions, e.g. the confusion between uniformity and regularity, and the discrepancy between the original SimHash for vectors and the folklore version commonly used for estimating similarities among sets.
1
1
2
3/4 We propose a categorization of hashing methods based on their properties, design goals, and application context.
1
1
1
2/4 We provide a comprehensive overview of hash functions used in genomics. Hashing is central to many genomic tasks, but we found no good treatment that describes the wide variety of hash functions employed in these applications.
1
1
1
1/4 Hash functions in genomic sequence analysis ( https://t.co/6sP8TF6LxQ) : a new survey written together with @shaomingfu, @kanatos92, @xianglipsu, and Qian Shi. Before submitting it, we are posting it online to get feedback from the community.
dropbox.com
Shared with Dropbox
1
19
40
The Wonderful Algorithms in Bioinformatics (WABI) conference 2025 will be held @UofMaryland, College Park in August. WABI brings together researchers working on algorithmic aspects of computational biology. Learn more: https://t.co/jtITU2HeLn Register: https://t.co/ZybzovEWXN
0
3
4
A monumental collaborative effort with many incredible people ☺️ Proud to be part of this! https://t.co/MvmdCtFbFs
arxiv.org
Given a set $S$ of $n$ keys, a perfect hash function for $S$ maps the keys in $S$ to the first $m \geq n$ integers without collisions. It may return an arbitrary result for any key not in $S$ and...
2
13
38
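The definition quoted in the link card above can be illustrated with a minimal sketch: try seeds for a salted hash until the keys land in distinct slots among the first m integers. This brute-force search is only an assumption-laden illustration of the definition, not any of the constructions discussed in the linked paper, and the function names are hypothetical.

```python
import hashlib


def slot(key, seed, m):
    """Map a string key to one of m slots using a seed-salted BLAKE2b hash."""
    digest = hashlib.blake2b(
        key.encode(), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % m


def build_perfect_hash(keys, m, max_seeds=100_000):
    """Return a seed for which slot() is collision-free on keys (requires m >= n)."""
    keys = list(keys)
    if len(set(keys)) != len(keys):
        raise ValueError("keys must be distinct")
    if m < len(keys):
        raise ValueError("need m >= n slots")
    for seed in range(max_seeds):
        if len({slot(k, seed, m) for k in keys}) == len(keys):
            return seed
    raise RuntimeError("no collision-free seed found; try a larger m")


if __name__ == "__main__":
    keys = ["ACGT", "CGTA", "GTAC", "TACG", "AAAA"]  # toy key set
    m = 8
    seed = build_perfect_hash(keys, m)
    for k in keys:
        print(k, "->", slot(k, seed, m))
    # A key outside the set may map to any slot, as the definition allows.
    print("TTTT ->", slot("TTTT", seed, m))
```

Trial-and-error over seeds scales poorly with the number of keys, so it only serves to make the definition concrete on a handful of keys.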
Slides from my talk (with Kamil Jaron) on a history of k-mers in bioinformatics:
1
31
87
See our new paper on discovering extensive conservation of human introns across hundreds of other mammals, with Ilia Minkin
academic.oup.com
Abstract. Despite many improvements over the years, the annotation of the human genome remains imperfect. The use of evolutionarily conserved sequences pro
1
26
91
Thrilled to finally debut my lab site 🎉: https://t.co/Z6kJpccFta. The name stands for Microbial Bioinformatics, a domain I purchased 9 years ago that has been waiting for this moment ever since!
16
32
257
The 3rd annual Symposium on Big Data Algorithms for Biology (BDBio) is happening this May! Join us for insightful talks and discussions on comp-bio topics. Register, submit an abstract, and see program details at https://t.co/j3Li1tca2i
@iiscbangalore @cdsiisc @CBR_IISc
0
8
14