Jasmijn Baaijens
@jasmijnbaaijens
Assistant prof @tudelft. Microbial genome assembly & graph-based algorithms.
Joined December 2014
BioSB takes place June 18-19, 2024, in Egmond aan Zee, the Netherlands. The call for posters and demo presentations is open until May 17th. Submit your work and join BioSB this year!
aanmelder.nl
The program for BioSB2024 is online! A balanced mix of talks on recent advances in bioinformatics and systems biology, with exciting keynote lectures by Jan Hasenauer, @mariabrbic, @RenskeVroomans and Mohammed El-Kebir. Check it out: https://t.co/Q2lJfwYbAT
@BioSB_nl
aanmelder.nl
Nominations are open for the BioSB Young Investigator Award 2024! Do you know a young bioinformatics or systems biology researcher who you think deserves this award? Nominate them by April 12! Info: https://t.co/yfEavPM9Gx
#bioinformatics #systemsbiology #biosb2024 #biosb
aanmelder.nl
Happy to share the final schedule of DSB 2024, online at https://t.co/R8GHPXxPCc. Please check it out and feel free to distribute! @CNRS @CNRSInformatics @umontpellier @lirmm_
🔬🧬Join our dynamic research team @rki_de !🎓3 exciting opportunities for PhD and PostDoc candidates in #metagenomics, @nanopore, #DNAModifications, and #pangenomics, within the realm of #PublicHealth. 🌍 📢 Motivated and curious? Please PM and help make a difference. 🌟
For more experiments and all details see our preprint! While the motivation for this project was SARS-CoV-2, the approach is generic and works for other viruses as well, as long as a collection of known (representative) genomes is available. 15/
biorxiv.org
Metagenomic profiling algorithms commonly rely on genomic differences between lineages, strains or species to infer the relative abundances of sequences present in a sample. This observation plays an...
What’s more, we also evaluated whether our amplicons are future-proof. For amplicons based on sequences up to August 2022, we checked what fraction of the sequences from the following months (Sept 2022 - May 2023) our primers would still work on. Spoiler: they generally work! 14/
But what does this mean for the abundance estimation pipeline? We show on simulated sequencing data that prediction accuracy increases as we add more amplicons, and that it is similar to the prediction accuracy using (idealized) whole genome sequencing. 13/
We then computed the combination of the 10 most informative amplicons of a fixed length. Note that with only 3 amplicons of 400 bp we can already distinguish between 86% of pairs of input sequences. 12/
So that is exactly what AmpliDiff does: based on an MSA of a representative set of genomes, we build candidate amplicons and then iteratively select the most informative amplicon *if primers exist*. 11/
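To make the selection loop above concrete, here is a minimal Python sketch of a greedy pick that only considers candidates for which primers exist. It is not the AmpliDiff code: the function names, the toy alignment, and especially the primer check (both flanks must be identical across all sequences) are assumptions for illustration, and real primer design presumably involves far more than exact conservation.

```python
from itertools import combinations


def pairs_distinguished(msa, start, width):
    """Pairs of sequence indices whose windows [start, start + width) differ."""
    windows = [seq[start:start + width] for seq in msa]
    return {
        (i, j)
        for i, j in combinations(range(len(windows)), 2)
        if windows[i] != windows[j]
    }


def flanks_conserved(msa, start, width, primer_len):
    """Toy stand-in for 'primers exist': both flanks identical in every sequence."""
    left, right = start - primer_len, start + width
    if left < 0 or right + primer_len > len(msa[0]):
        return False
    left_flanks = {seq[left:start] for seq in msa}
    right_flanks = {seq[right:right + primer_len] for seq in msa}
    return len(left_flanks) == 1 and len(right_flanks) == 1


def greedy_select(msa, width, primer_len, max_amplicons):
    """Repeatedly pick the feasible window that separates the most new pairs."""
    candidates = [
        s for s in range(len(msa[0]) - width + 1)
        if flanks_conserved(msa, s, width, primer_len)
    ]
    chosen, covered = [], set()
    while len(chosen) < max_amplicons:
        best, best_gain = None, set()
        for s in candidates:
            if s in chosen:
                continue
            gain = pairs_distinguished(msa, s, width) - covered
            if len(gain) > len(best_gain):
                best, best_gain = s, gain
        if best is None:  # no remaining candidate separates a new pair
            break
        chosen.append(best)
        covered |= best_gain
    return chosen, covered


# Hypothetical toy alignment: variable columns at positions 3 and 8 only.
toy_msa = ["AACGTTTGCAGG", "AACGTTTGAAGG", "AACCTTTGCAGG"]
chosen, covered = greedy_select(toy_msa, width=3, primer_len=2, max_amplicons=3)
print(chosen, sorted(covered))
```

On this toy input the loop stops after two amplicons, because every pair of sequences is already separated and a third window would add nothing.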
For a second we thought that this plot answered our question, but then we realized it is not so straightforward: we also need primers that can bind to the input genomes. In other words, we are looking for peaks surrounded by conserved regions. 10/
The first step then was to look into pairwise differentiability for all lineages known at the time (August 2022). For a given fixed-length region of the SARS-CoV-2 genome, between how many lineages can we distinguish? We saw the expected peaks in spike and N: 9/
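As a rough illustration of the question above, the sketch below slides a fixed-length window along a multiple sequence alignment and counts how many pairs of sequences differ inside it. The function names and the toy alignment are made up for this example, it compares individual sequences rather than lineages, and it treats every character (including gaps or ambiguous bases) as an ordinary symbol, which a real analysis would likely handle more carefully.

```python
from itertools import combinations


def differentiable_pairs(msa, start, width):
    """Number of sequence pairs that differ somewhere in [start, start + width)."""
    windows = [seq[start:start + width] for seq in msa]
    return sum(1 for a, b in combinations(windows, 2) if a != b)


def differentiability_profile(msa, width):
    """Pairwise differentiability for every window start along the alignment."""
    length = len(msa[0])
    return [differentiable_pairs(msa, s, width) for s in range(length - width + 1)]


# Hypothetical toy alignment of three aligned sequences.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
profile = differentiability_profile(toy_alignment, width=4)
print(profile)
```

Plotting such a profile against the window start would give the kind of curve described in the tweet, with peaks where many pairs can be told apart.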
@JaspervB_TUD translated this question into a mathematical optimization problem and quickly realized that it is NP-hard; thus, we decided that a greedy algorithm would be our best bet. 8/
These observations led us to the main question behind the AmpliDiff project: if we want to amplify only a fraction of the genome while maximizing our ability to differentiate between lineages, which genomic regions should we amplify? 7/
We did some benchmarking with simulated data, comparing whole genome amplification to spike-only, and predictions on spike-only data were very accurate! In fact, better than whole genome (given a similar number of reads). 6/
At the same time, we were developing VLQ, a pipeline for estimating lineage abundances from wastewater sequencing data. Whether the reads come from whole genome or selective amplification, VLQ can work with either. 5/
genomebiology.biomedcentral.com
Effectively monitoring the spread of SARS-CoV-2 mutants is essential to efforts to counter the ongoing pandemic. Predicting lineage abundance from wastewater, however, is technically challenging. We...
And this short sequence was enough to distinguish between major variants, and on top of that, they were able to discover these new cryptic lineages! 4/
Contrary to most of the other studies doing SARS-CoV-2 sequencing from wastewater, they did not attempt to amplify the whole genome; instead, they selected a highly variable region of ~250 bp in the spike gene, generated primers for PCR and amplified only this short sequence. 3/
This project came to life while working on wastewater sequencing for monitoring the spread and evolution of SARS-CoV-2. @ProfSmyth was leading a project analyzing New York City wastewater, where they discovered novel cryptic SARS-CoV-2 lineages. 2/
nature.com
Nature Communications - To monitor the presence of novel SARS-CoV-2 variants in New York City, Smyth et al. perform deep-sequencing of the receptor binding domain of S protein in wastewater samples...
Very excited to share our latest work, AmpliDiff, an optimized amplicon sequencing approach to estimating lineage abundances in viral metagenomes. A project led by @JaspervB_TUD and in close collaboration with @ProfSmyth 1/