PM @pashadag@genomic.social @pashadag.bsky.social @pashadag X Profile

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

Followers

1K

Following

2K

Media

26

Statuses

2K

Algorithmic Bioinformatics Researcher and Teacher. Tweets about research results and educational/mentorship topics (for details, see https://t.co/FUl9nGstuH).

https://t.co/SGB8OoIQYL

University Park, Pennsylvania

Joined April 2009

Rayan Chikhi

@RayanChikhi

5 days

🌎👩‍🔬 For 15+ years biology has accumulated petabytes (million gigabytes) of🧬DNA sequencing data🧬 from the far reaches of our planet.🦠🍄🌵 Logan now democratizes efficient access to the world’s most comprehensive genetics dataset. Free and open. https://t.co/dDBtAjfdYL

5

145

362

Yaron Orenstein

@OrensteinYaron

2 months

🚨 Attention US PhDs in Bioinformatics: Interested in a postdoc in my group at Bar-Ilan University 🇮🇱 through the Fulbright Israel fellowship (2026–2027)? 🧬 2 years of research 💰 $95,000 stipend + relocation + benefits 📅 Apply by 15/9 🔗 https://t.co/iXoYWEEaO3 ✉️ Email me!

0

2

3

Alessio Campanelli

@AlessioCampa_

2 months

With @giulio_pibiri and @nomad421, we are about to release Fulgor v4.0.0, which is much faster to build and query, without affecting its memory efficiency. You can find all the information in our #WABI2025 paper (https://t.co/foLYdhsC2Z). 🧵 (1/7)

1

7

12

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

5/n The code and paper are available: 🔗 paper: https://t.co/79Ety1S3Uk 🔗 code:

github.com

This tool estimates substitution rate based on k-spectrum - medvedevgroup/Repeat-Aware_Substitution_Rate_Estimator

0

6

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

4/n The traditional "repeat-oblivious" estimator can *overestimate mutation rates by an order of magnitude* on repetitive data. In contrast, the new estimator remains accurate across a broad range of rates and repetitive sequences (e.g. RBMY gene, α-satellite centromeres).

1

0

2

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

3/n Capturing the full repeat structure in the estimator is pretty hard and possibly not even needed. Instead, we account for the most pertinent part of the repeat structure in the estimator and the rest of the structure is accounted for in the bias formula.

1

0

2

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

2/n Tools such as Mash estimate the mutation rate via k-mer Jaccard similarity, assuming *non-repetitive* sequences. But in highly repetitive regions (e.g., α-satellite DNA), these estimates break down. We derive a novel estimator by relaxing the non-repetitive assumption.

1

0

2
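For readers unfamiliar with the repeat-oblivious baseline mentioned in 2/n, here is a minimal sketch. It assumes the distance formula published for Mash, D = -(1/k) ln(2J / (1 + J)), and computes the Jaccard similarity J on exact k-mer sets rather than on MinHash sketches. The function names and toy sequences are hypothetical and are not part of Mash or of the preprint's code.

```python
import math


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of distinct k-mers in seq (multiplicity is discarded)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; defined as 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_distance(j: float, k: int) -> float:
    """Mash-style distance D = -(1/k) * ln(2j / (1 + j)), clamped to [0, 1]."""
    if j <= 0.0:
        return 1.0  # no shared k-mers: report the maximum distance
    d = -math.log(2.0 * j / (1.0 + j)) / k
    return min(1.0, max(0.0, d))


# Hypothetical toy sequences that differ by a single substitution.
seq_a = "ACGTTGCAAGGCTA"
seq_b = "ACGTTGCATGGCTA"
k = 4
j = jaccard(kmers(seq_a, k), kmers(seq_b, k))
print(f"J = {j:.3f}, estimated distance = {mash_distance(j, k):.3f}")

# A tandem repeat collapses to very few distinct k-mers.
print(kmers("ACACACAC", 2))
```

The last line hints at why repeats are a problem for this kind of estimator: the set representation keeps only two distinct 2-mers from an 8-base repeat, so the non-repetitive assumption behind the formula no longer holds.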

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

🧵1/n Estimating mutation rates using k-mers is fast, but what happens when repeats dominate the genome? In a new preprint, @HaonanWu_1998, Antonio Blanca, and I propose a *repeat-aware* estimator that's accurate even in centromeres.

bioRxiv Bioinfo

@biorxiv_bioinfo

3 months

A k-mer-based estimator of the substitution rate between repetitive sequences https://t.co/0HZxXkVWRv #biorxiv_bioinfo

1

14

24

Li Song

@mourisl

3 months

Our lab is hiring! We are looking for a postdoc in the area of immunology/microbiology+ML, or in pure method/software development. More information about our lab at:

mourisl.github.io

Li Song Lab

1

19

36

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

5/4 This is a draft manuscript and we hope to receive feedback from the community. You can submit a GitHub issue using https://t.co/m8QFDSHhFi or email the authors privately

0

1

2

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

4/4 We further clarify common misconceptions, e.g. the confusion between uniformity and regularity, and the discrepancy between the original SimHash for vectors and the folklore version commonly used for estimating similarities among sets.

1

2

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

3/4 We propose a categorization of hashing methods based on their properties, design goals, and application context.

1

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

2/4 We provide a comprehensive overview of hash functions used in genomics. Hashing is central to many genomic tasks, but we found no good treatment that describes the wide variety of hash functions employed in these applications.

1

PM @pashadag@genomic.social @pashadag.bsky.social

@pashadag

3 months

1/4 Hash functions in genomic sequence analysis (https://t.co/6sP8TF6LxQ): a new survey written together with @shaomingfu, @kanatos92, @xianglipsu, and Qian Shi. Before submitting it, we are posting it online to get feedback from the community.

dropbox.com

Shared with Dropbox

1

19

40

CBCB_UMD

@CBCB_UMD

3 months

The Wonderful Algorithms in Bioinformatics (WABI) conference 2025 will be held @UofMaryland, College Park in August. WABI brings together researchers working on algorithmic aspects of computational biology. Learn more: https://t.co/jtITU2HeLn Register: https://t.co/ZybzovEWXN

0

3

4

Giulio Ermanno Pibiri

@giulio_pibiri

3 months

A monumental collaborative effort with many incredible people ☺️ Proud to be part of this! https://t.co/MvmdCtFbFs

arxiv.org

Given a set $S$ of $n$ keys, a perfect hash function for $S$ maps the keys in $S$ to the first $m \geq n$ integers without collisions. It may return an arbitrary result for any key not in $S$ and...

2

13

38
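To make the definition quoted in the preview above concrete, here is a minimal sketch of a perfect hash function found by brute-force seed search. This is only an illustration of the definition (n keys mapped to the first m ≥ n integers without collisions); it is not any construction from the linked paper, and the helper names, the choice of BLAKE2b, and the example keys are all assumptions made for the example.

```python
import hashlib


def seeded_hash(key: str, seed: int, m: int) -> int:
    """Hash key into the range [0, m) using a seed-prefixed BLAKE2b digest."""
    data = seed.to_bytes(8, "little") + key.encode("utf-8")
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "little") % m


def find_perfect_seed(keys, m: int, max_seeds: int = 100_000) -> int:
    """Try seeds 0, 1, 2, ... until the seeded hash is collision-free on keys."""
    keys = list(dict.fromkeys(keys))  # drop duplicate keys, keep order
    if m < len(keys):
        raise ValueError("m must be at least the number of distinct keys")
    for seed in range(max_seeds):
        if len({seeded_hash(key, seed, m) for key in keys}) == len(keys):
            return seed
    raise RuntimeError("no collision-free seed found within max_seeds")


# Hypothetical key set S with n = 5 keys, mapped into m = 8 slots.
keys = ["ACGT", "CGTA", "GTAC", "TACG", "TTTT"]
m = 8
seed = find_perfect_seed(keys, m)
table = {key: seeded_hash(key, seed, m) for key in keys}
print(seed, table)
```

As the quoted abstract notes, a key outside S may map to an arbitrary slot. The brute-force search is only practical for tiny key sets, since the chance that a random seed is collision-free shrinks quickly as n grows.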

Rayan Chikhi

@RayanChikhi

3 months

Slides from my talk (with Kamil Jaron) on a history of k-mers in bioinformatics:

1

31

87

Steven Salzberg 💙💛

@StevenSalzberg1

6 months

See our new paper on discovering extensive conservation of human introns across hundreds of other mammals, with Ilia Minkin

academic.oup.com

Abstract. Despite many improvements over the years, the annotation of the human genome remains imperfect. The use of evolutionarily conserved sequences pro

1

26

91

Wei Shen 沈伟

@shenwei356

6 months

Thrilled to finally debut my lab site 🎉: https://t.co/Z6kJpccFta, which stands for Microbial Bioinformatics, a domain I purchased 9 years ago that has been waiting for this moment ever since!

16

32

257

Chirag Jain

@chirgjain

6 months

The 3rd annual Symposium on Big Data Algorithms for Biology (BDBio) is happening this May! Join us for insightful talks and discussions on comp-bio topics. Register, submit an abstract, and see program details at https://t.co/j3Li1tca2i @iiscbangalore @cdsiisc @CBR_IISc

0

8

14