Thomas Sounack Profile
Thomas Sounack

@tsounack

Followers
77
Following
38
Media
1
Statuses
26

AI/ML Engineer @ Dana-Farber Cancer Institute | Stanford alum

Joined May 2024
@tsounack
Thomas Sounack
26 days
Very excited to share the release of BioClinical ModernBERT! Highlights:
- biggest and most diverse biomedical and clinical dataset for an encoder
- 8192 context length
- fastest throughput across a variety of inputs
- SOTA results across several tasks
- base and large sizes
(1/8)
4
13
65
@tsounack
Thomas Sounack
12 days
Exciting work from @neumll!
@neumll
NeuML
12 days
🧬🔬⚕️ Building on the popularity of our PubMedBERT Embeddings model, we're excited to release a long-context medical embeddings model! It's built on the great work below from @tsounack.
Model:
Paper:
0
0
4
@tsounack
Thomas Sounack
21 days
Exciting to see BioClinical ModernBERT (base) ranked #2 among trending fill-mask models - right after BERT! The large version is currently at #4. Grateful for the interest, and can't wait to see what projects people apply it to!
0
7
12
@tsounack
Thomas Sounack
22 days
GitHub link:
0
1
6
@tsounack
Thomas Sounack
22 days
The BioClinical ModernBERT GitHub repo is online! It contains:
- our continued pretraining config files
- performance eval code
- inference speed eval code
A step-by-step guide on how to continue ModernBERT or BioClinical ModernBERT pretraining is coming in the next few days!
1
3
18
@tsounack
Thomas Sounack
22 days
RT @introsp3ctor: next demo visualizing BioClinical-ModernBERT-base embeddings on a sphere…
0
1
0
@tsounack
Thomas Sounack
23 days
RT @gm8xx8: BioClinical ModernBERT: A State-of-the-Art Long-Context Encoder for Biomedical and Clinical NLP → Built on ModernBERT with 8K…
0
4
0
@tsounack
Thomas Sounack
26 days
RT @josephpollack: we are so back. "Mitochondria is the powerhouse of the [MASK]."
0
2
0
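The masked-token demo in the retweet above can be reproduced with the Hugging Face `fill-mask` pipeline. A minimal sketch, with the caveat that the repo ID `thomas-sounack/BioClinical-ModernBERT-base` is an assumption here and should be checked against the Hub:

```python
# Fill-mask sketch for BioClinical ModernBERT.
# MODEL_ID is an assumption; verify the exact repo name on the Hugging Face Hub.
MODEL_ID = "thomas-sounack/BioClinical-ModernBERT-base"

def predict_mask(text: str, top_k: int = 5):
    """Return (token, score) pairs for the [MASK] position in `text`."""
    # Deferred import so the sketch can be read without transformers installed.
    from transformers import pipeline
    fill = pipeline("fill-mask", model=MODEL_ID)
    return [(p["token_str"], p["score"]) for p in fill(text, top_k=top_k)]

# Example usage (downloads model weights on first run):
#   predict_mask("Mitochondria is the powerhouse of the [MASK].")
```

The pipeline resolves the model's own mask token, so the same call works for any encoder published with a fill-mask head.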
@tsounack
Thomas Sounack
26 days
RT @joshp_davis: BioClinical ModernBERT is out! Built on the largest, most diverse biomedical/clinical dataset to date. ‼️ Delivers SOTA acr…
0
2
0
@tsounack
Thomas Sounack
26 days
RT @jeremyphoward: Your daily reminder that fine tuning is just continued pretraining. Super cool results from @antoine_chaffin who is put…
0
55
0
@tsounack
Thomas Sounack
26 days
RT @antoine_chaffin: You can just continue pre-train things ✨ Happy to announce the release of BioClinical ModernBERT, a ModernBERT model w…
0
33
0
@tsounack
Thomas Sounack
26 days
RT @bclavie: Clinical encoders are joining the ModernBERT family ☺️.
0
7
0
@tsounack
Thomas Sounack
26 days
RT @LightOnIO: 🚀 Announcing BioClinical ModernBERT, a SOTA encoder for healthcare AI, developed by Thomas Sounack @tsounack for Dana-Farber…
0
9
0
@tsounack
Thomas Sounack
26 days
Link to the models:
-
-
-
(8/8)
0
0
9
@tsounack
Thomas Sounack
26 days
During benchmarking, we also observed substantially faster fine-tuning and inference with BioClinical ModernBERT. Combined with its long context support, enabling full clinical note processing in a single pass, it offers strong scaling potential for clinical NLP. (7/8).
1
0
6
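The single-pass claim above is simple arithmetic: a long clinical note that a 512-token encoder must split into a dozen windows fits entirely inside an 8192-token context. A back-of-the-envelope sketch (the 6000-token note length is illustrative, not from the paper):

```python
import math

def passes_needed(n_tokens: int, context_window: int) -> int:
    """Forward passes needed to cover a document, ignoring window overlap."""
    return math.ceil(n_tokens / context_window)

note_tokens = 6000                       # illustrative long clinical note
print(passes_needed(note_tokens, 512))   # classic 512-token encoder -> 12
print(passes_needed(note_tokens, 8192))  # 8192-token context -> 1
```

With sliding-window overlap (the usual workaround for short-context encoders) the 512-token count grows further, so the real-world gap is larger than this lower bound.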
@tsounack
Thomas Sounack
26 days
Excited to see how it performs on your data! In our internal evaluations, BioClinical ModernBERT significantly outperformed existing encoders - thanks to its training on diverse clinical data spanning multiple institutions, specialties, and countries. (6/8).
1
0
6
@tsounack
Thomas Sounack
26 days
Most clinical encoders underperform on de-identification tasks due to PHI masking in MIMIC. BioClinical ModernBERT uses datasets with realistic PHI surrogates, enabling more natural representations and stronger DEID performance. (5/8).
1
0
4
@tsounack
Thomas Sounack
26 days
Leveraging the training schedule of ModernBERT, we designed a two-step training process for continued pre-training. We release the checkpoints with our models, and in the next few days a guide for continued pre-training on BioClinical ModernBERT will be available. (4/8).
1
0
5
@tsounack
Thomas Sounack
26 days
At the @lindvalllab (with @joshp_davis and @DurieuxBrigitte), we collaborated with @antoine_chaffin from the ModernBERT team, @tompollard and @alistairewj from the MIMIC team, @lehmer16, and both @MattBMcDermott and @TristanNaumann from the Clinical BERT team. (3/8).
2
0
7
@tsounack
Thomas Sounack
26 days
Paper:
Collection:
(2/8)
1
2
10