Jasmijn Baaijens @jasmijnbaaijens X Profile

Jasmijn Baaijens

@jasmijnbaaijens

Followers

222

Following

195

Media

8

Statuses

70

Assistant prof @tudelft. Microbial genome assembly & graph-based algorithms.

Joined December 2014


Jasmijn Baaijens

@jasmijnbaaijens

2 years

BioSB takes place June 18-19, 2024, in Egmond aan Zee, the Netherlands. The call for posters and demo presentations is open until May 17th. Submit your work and join BioSB this year!

aanmelder.nl

0

Jasmijn Baaijens

@jasmijnbaaijens

2 years

The program for BioSB2024 is online! A balanced mix of talks on recent advances in bioinformatics and systems biology, with exciting keynote lectures by Jan Hasenauer, @mariabrbic, @RenskeVroomans and Mohammed El-Kebir. Check it out: https://t.co/Q2lJfwYbAT @BioSB_nl

aanmelder.nl

1

0

BioSB

@BioSB_nl

2 years

Nominations are open for the BioSB Young Investigator Award 2024! Do you know a young bioinformatics or systems biology researcher who you think deserves this award? Nominate them by April 12! Info: https://t.co/yfEavPM9Gx #bioinformatics #systemsbiology #biosb2024 #biosb

aanmelder.nl

1

7

4

Eric Rivals

@EricRivals

2 years

Happy to share the final schedule of DSB 2024, now online at https://t.co/R8GHPXxPCc. Please check it out and feel free to distribute! @CNRS @CNRSInformatics @umontpellier @lirmm_

0

6

7

Martin Hölzer

@martinhoelzer

2 years

🔬🧬Join our dynamic research team @rki_de !🎓3 exciting opportunities for PhD and PostDoc candidates in #metagenomics, @nanopore, #DNAModifications, and #pangenomics, within the realm of #PublicHealth. 🌍 📢 Motivated and curious? Please PM and help make a difference. 🌟

2

10

22

Jasmijn Baaijens

@jasmijnbaaijens

2 years

For more experiments and all details see our preprint! While the motivation for this project was SARS-CoV-2, the approach is generic and works for other viruses as well, as long as a collection of known (representative) genomes is available. 15/

biorxiv.org

Metagenomic profiling algorithms commonly rely on genomic differences between lineages, strains or species to infer the relative abundances of sequences present in a sample. This observation plays an...

0

1

3

Jasmijn Baaijens

@jasmijnbaaijens

2 years

What’s more, we also evaluated whether our amplicons are future-proof. For amplicons designed from sequences up to August 2022, we computed the fraction of sequences from the following months (Sept 2022 - May 2023) for which our primers would still work. Spoiler: they generally work! 14/

1

0

1
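A rough sketch of how such a "do the primers still work?" check could be computed. Everything here is an assumption made for illustration: the function names, the toy sequences, and above all the binding criterion, which is simplified to an exact match of the forward primer followed downstream by the reverse complement of the reverse primer. The criteria used in the preprint may well differ (for example by tolerating mismatches).

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def primers_bind(genome: str, forward: str, reverse: str) -> bool:
    """Assumed criterion: exact forward match, then an exact match of the
    reverse primer's binding site somewhere downstream of it."""
    fwd_pos = genome.find(forward)
    if fwd_pos == -1:
        return False
    reverse_site = reverse_complement(reverse)
    return genome.find(reverse_site, fwd_pos + len(forward)) != -1


def fraction_amplifiable(genomes: list[str], forward: str, reverse: str) -> float:
    """Fraction of genomes in which the primer pair still binds."""
    if not genomes:
        return 0.0
    hits = sum(primers_bind(g, forward, reverse) for g in genomes)
    return hits / len(genomes)


# Hypothetical toy data standing in for sequences collected after primer design.
later_genomes = [
    "AAGCTTTTCCAA",   # both sites present
    "AAGCTTTTCGAA",   # reverse site mutated
    "TTAAGCGGCCAAT",  # both sites present
    "CCAATTAAGC",     # reverse site lies upstream of the forward primer
]
print(fraction_amplifiable(later_genomes, forward="AAGC", reverse="TTGG"))  # 0.5
```

Running the same function per month of collected sequences would give a time series of the kind described in the tweet.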

Jasmijn Baaijens

@jasmijnbaaijens

2 years

But what does this mean for the abundance estimation pipeline? We show on simulated sequencing data that prediction accuracy increases as we add more amplicons, and that it is similar to the prediction accuracy using (idealized) whole genome sequencing. 13/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

We then computed the combination of the 10 most informative amplicons of a fixed length. Note that with only 3 amplicons of 400 bp we can already distinguish between 86% of pairs of input sequences. 12/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

So that is exactly what AmpliDiff does: based on an MSA of a representative set of genomes, we build candidate amplicons and then iteratively select the most informative amplicon *if primers exist*. 11/

1

0

1
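The selection loop described in this tweet can be sketched as a greedy set-cover-style procedure. This is a toy illustration under stated assumptions, not the AmpliDiff implementation: the function names are hypothetical, sequences are compared by exact string equality, and "primers exist" is approximated by requiring the flanking columns on both sides of a window to be identical across all sequences.

```python
from itertools import combinations


def pairs_for_window(msa, start, width):
    """Pairs of sequence indices that differ inside the window."""
    windows = [seq[start:start + width] for seq in msa]
    return {(i, j) for i, j in combinations(range(len(msa)), 2)
            if windows[i] != windows[j]}


def flanks_conserved(msa, start, width, flank):
    """Assumed stand-in for primer feasibility: both flanks fully conserved."""
    left, right_end = start - flank, start + width + flank
    if left < 0 or right_end > len(msa[0]):
        return False
    left_ok = len({seq[left:start] for seq in msa}) == 1
    right_ok = len({seq[start + width:right_end] for seq in msa}) == 1
    return left_ok and right_ok


def greedy_select(msa, width, flank, max_amplicons):
    """Repeatedly pick the window that separates the most not-yet-separated
    pairs, skipping windows without feasible primer sites."""
    n_cols = len(msa[0])
    candidates = {s: pairs_for_window(msa, s, width)
                  for s in range(n_cols - width + 1)}
    covered, chosen = set(), []
    while len(chosen) < max_amplicons:
        ranked = sorted(candidates,
                        key=lambda s: (-len(candidates[s] - covered), s))
        pick = None
        for s in ranked:
            if not candidates[s] - covered:
                break  # no remaining window adds information
            if flanks_conserved(msa, s, width, flank):
                pick = s
                break
            del candidates[s]  # informative, but no primer sites
        if pick is None:
            break
        chosen.append(pick)
        covered |= candidates.pop(pick)
    return chosen, covered


# Hypothetical toy alignment: variable columns at 2-3 and 6-7, conserved elsewhere.
toy_msa = ["AAGTCCACGGTT", "AAGACCACGGTT", "AAGTCCTCGGTT", "AAGACCTCGGTT"]
chosen, covered = greedy_select(toy_msa, width=2, flank=2, max_amplicons=3)
print(chosen, len(covered))  # [2, 6] 6
```

In the toy run, the window starting at column 5 would add as much information as the one at column 6, but it is skipped because its left flank is not conserved. Two windows already separate all 6 pairs, so the loop stops before reaching the limit of 3.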

Jasmijn Baaijens

@jasmijnbaaijens

2 years

For a second we thought that this plot answered our question, but then we realized it is not so straightforward: we also need primers that can bind to the input genomes. In other words, we are looking for peaks surrounded by conserved regions. 10/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

The first step then was to look into pairwise differentiability for all lineages known at the time (August 2022). For a given fixed-length region of the SARS-CoV-2 genome, how many pairs of lineages can we distinguish? We saw the expected peaks in spike and N: 9/

1

2
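One simple way to picture the pairwise differentiability described above is a sliding window over a multiple sequence alignment. The sketch below is an illustration only: the function names and toy sequences are made up, and two sequences count as distinguishable whenever their window substrings differ at all, which ignores gaps and ambiguous bases that a real analysis would have to handle.

```python
from itertools import combinations


def distinguishable_pairs(msa, start, width):
    """Index pairs whose aligned sequences differ inside the window."""
    windows = [seq[start:start + width] for seq in msa]
    return {(i, j) for i, j in combinations(range(len(msa)), 2)
            if windows[i] != windows[j]}


def differentiability_profile(msa, width):
    """Number of distinguishable pairs for every window start position."""
    n_cols = len(msa[0])
    return [len(distinguishable_pairs(msa, start, width))
            for start in range(n_cols - width + 1)]


# Hypothetical toy alignment of three "lineages".
toy_msa = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
print(differentiability_profile(toy_msa, width=3))  # [2, 2, 2, 0, 0, 2]
```

Plotting such a profile along the genome would show peaks where many pairs can be told apart and zeros in fully conserved windows.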

Jasmijn Baaijens

@jasmijnbaaijens

2 years

@JaspervB_TUD translated this question into a mathematical optimization problem and quickly realized that it is NP-hard; thus, we decided that a greedy algorithm would be our best bet. 8/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

These observations led us to the main question behind the AmpliDiff project: if we want to amplify only a fraction of the genome while maximizing our ability to differentiate between lineages, which genomic regions should we amplify? 7/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

We did some benchmarking with simulated data, comparing whole genome amplification to spike-only, and predictions on spike-only data were very accurate! In fact, better than whole genome (given a similar number of reads). 6/

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

At the same time, we were developing VLQ, a pipeline for estimating lineage abundances from wastewater sequencing data. Whether the data come from whole genome or selective amplification, VLQ can work with either. 5/

genomebiology.biomedcentral.com

Effectively monitoring the spread of SARS-CoV-2 mutants is essential to efforts to counter the ongoing pandemic. Predicting lineage abundance from wastewater, however, is technically challenging. We...

2

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

And this short sequence was enough to distinguish between major variants, and on top of that, they were able to discover these new cryptic lineages! 4/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

Unlike most other studies doing SARS-CoV-2 sequencing from wastewater, they did not attempt to amplify the whole genome; instead, they selected a highly variable region of ~250 bp in the spike gene, designed PCR primers for it and amplified only this short sequence. 3/

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

This project came to life while working on wastewater sequencing for monitoring the spread and evolution of SARS-CoV-2. @ProfSmyth was leading a project analyzing New York City wastewater, where they discovered novel cryptic SARS-CoV-2 lineages. 2/

nature.com

Nature Communications - To monitor the presence of novel SARS-CoV-2 variants in New York City, Smyth et al. perform deep-sequencing of the receptor binding domain of S protein in wastewater samples...

1

0

1

Jasmijn Baaijens

@jasmijnbaaijens

2 years

Very excited to share our latest work, AmpliDiff, an optimized amplicon sequencing approach to estimating lineage abundances in viral metagenomes. A project led by @JaspervB_TUD and in close collaboration with @ProfSmyth 1/

biorxiv.org

Metagenomic profiling algorithms commonly rely on genomic differences between lineages, strains or species to infer the relative abundances of sequences present in a sample. This observation plays an...

1

6

13