Thomas Dehaene
@TDehaene
Followers
581
Following
1K
Media
92
Statuses
544
NLP engineer and meme sommelier
Belgium
Joined September 2011
We remain committed to our partnership with OpenAI and have confidence in our product roadmap, our ability to continue to innovate with everything we announced at Microsoft Ignite, and in continuing to support our customers and partners. We look forward to getting to know Emmett
4K
14K
88K
I asked ChatGPT to rewrite Bohemian Rhapsody to be about the life of a postdoc, and the output was flawless:
135
2K
9K
Lama Cleaner: a free and open-source inpainting tool powered by SOTA AI models. Completely free and open-source • Fully self-hosted • Multiple SOTA AI models • Classical image inpainting algorithms • CPU & GPU • Various inpainting strategies • Runs as a desktop app
36
468
3K
It's that time of the year again! https://t.co/AJlWUecwNk Everything to know about the past year of happenings in #AI. My favorite graph: a showcase of how prevalent cross-modal #transformers have become.
0
0
6
2
3
30
Note: if the interactive version is down for some reason, you can check out the Medium version:
0
0
1
#Transformers 🤗 for #summarization are sometimes prone to hallucinations. Our #NLP team investigated some post-processing steps in an interactive blogpost on @huggingface: https://t.co/PSWr5WzGky
1
2
5
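The blogpost itself isn't quoted here, so as an illustration only: one common post-processing step for summarization hallucinations is checking whether entities in the summary are actually supported by the source text. A minimal sketch, using capitalized words as a deliberately naive stand-in for a real named-entity recognizer (the function name and heuristic are my assumptions, not the team's method):

```python
import re

def unsupported_entities(source: str, summary: str) -> set:
    """Return capitalized tokens (a crude proxy for named entities)
    that appear in the summary but not in the source text."""
    def entities(text: str) -> set:
        # Naive heuristic: any capitalized word counts as an entity.
        # A real pipeline would use an NER model instead.
        return set(re.findall(r"\b[A-Z][a-z]+\b", text))
    return entities(summary) - entities(source)

source = "The mayor of Ghent opened the new bridge on Monday."
summary = "The mayor of Antwerp opened the bridge."
print(unsupported_entities(source, summary))  # {'Antwerp'}
```

A summary that mentions entities absent from the source can then be flagged, filtered, or re-ranked against other beam candidates.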
Machine Learning Operations (MLOps): Overview, Definition, and Architecture https://t.co/2MHAoBjfRN
55
824
4K
For those playing along at home, here's an "AI is sentient!" argument bingo card.
78
506
2K
LayoutLMv3 by @Microsoft is now available on @huggingface! The model replaces the CNN backbone of its predecessor with (much simpler) patch embeddings à la ViT. SOTA performance on all document AI benchmarks, both image-only and text+image! (1/2)
8
98
541
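A ViT-style patch embedding, as referenced in the tweet, just splits the image into non-overlapping patches and linearly projects each flattened patch. A minimal NumPy sketch (shapes and the 16×16/768-dim choices mirror the standard ViT-Base configuration; this is an illustration, not LayoutLMv3's actual code):

```python
import numpy as np

def patch_embed(image, patch_size, proj):
    """Split an (H, W, C) image into non-overlapping patches and
    linearly project each flattened patch, as in ViT."""
    H, W, C = image.shape
    P = patch_size
    assert H % P == 0 and W % P == 0
    # (H//P, P, W//P, P, C) -> (num_patches, P*P*C)
    patches = (image.reshape(H // P, P, W // P, P, C)
                    .transpose(0, 2, 1, 3, 4)
                    .reshape(-1, P * P * C))
    return patches @ proj  # (num_patches, embed_dim)

rng = np.random.default_rng(0)
img = rng.normal(size=(224, 224, 3))
W_proj = rng.normal(size=(16 * 16 * 3, 768))
tokens = patch_embed(img, 16, W_proj)
print(tokens.shape)  # (196, 768): 14x14 patch tokens
```

The resulting patch tokens can be fed straight into a transformer encoder, which is why this replaces a full CNN backbone so cheaply.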
Adapting semantic search to a new domain with unlabeled data? Using #GPL from @UKPLab, we demonstrate how to do this on non-English datasets. Link:
blog.ml6.eu
Lexical-based information retrieval systems are great for quickly fetching relevant information in a large text corpus. However, these…
0
0
1
Introducing the biggest change to https://t.co/dKZjob5Hcx since its inception: the Community Tab! Read more and discuss: https://t.co/SI07ac47Z1
3
55
185
I wrote an #NLP blogpost detailing how we optimized our Dutch #GPT2 model to reduce our Google Cloud bill by a factor of 2.4! https://t.co/DCU4XJV7Bt
0
0
6
New model alert! We're releasing: ✅ translated Dutch summarization datasets (CNN-nl & XSUM-nl) ✅ a finetuned mBART model for Dutch summarization on @huggingface! https://t.co/WQjkNsFL2t
https://t.co/RUsmTF8ck7
0
1
15
When your funding round raises about as much dough as a BERT-Base model has parameters! Big congrats to @huggingface!
1
0
3
Our official @huggingface HuggingMugs (or MuggingFaces 🤗) have arrived! And already our ROUGE scores have improved, our perplexities are down, and my eyesight is better.
1
4
34
The @amazon MASSIVE dataset is... well... massive! One million realistic, parallel, labeled virtual-assistant utterances in 51 languages.
amazon.science
MASSIVE dataset and Massively Multilingual NLU (MMNLU-22) competition and workshop will help researchers scale natural-language-understanding technology to every language on Earth.
0
1
4
🤗 Trainer now sports --optim adamw_bnb_8bit, which activates the 8-bit Adam optimizer https://t.co/X1KdSIZZNC and uses 6 bytes less per param for training. ~1/3 of total memory savings! Huge thanks to @ManuelCiosici & @Tim_Dettmers for integrating it! Use transformers@main
github.com
Library for 8-bit optimizers and quantization routines. - facebookresearch/bitsandbytes
1
17
102
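The "6 bytes less per param" and "~1/3 of total memory" figures in the tweet follow from simple accounting: Adam keeps two moment tensors per parameter, at 4 bytes each in fp32 versus 1 byte each in the 8-bit variant. A back-of-the-envelope sketch (the per-param byte counts assume fp32 weights and gradients and ignore activations and the 8-bit optimizer's small quantization-statistics overhead):

```python
def adam_state_bytes(n_params: int, bits: int) -> int:
    """Adam stores two moment tensors (exp. avg and exp. avg of
    squares) per parameter; bytes per element = bits // 8."""
    return n_params * 2 * (bits // 8)

n = 1_000_000_000  # e.g. a 1B-parameter model
fp32_adam = adam_state_bytes(n, 32)  # 8 GB of optimizer state
int8_adam = adam_state_bytes(n, 8)   # 2 GB of optimizer state
saved_per_param = (fp32_adam - int8_adam) / n
print(saved_per_param)  # 6.0 bytes per param, as in the tweet

# Rough per-param total: fp32 weights (4) + fp32 grads (4) + Adam state
total_fp32 = 4 + 4 + 8
total_int8 = 4 + 4 + 2
print(1 - total_int8 / total_fp32)  # 0.375, i.e. roughly a third
```

In 🤗 Transformers this is enabled by passing `optim="adamw_bnb_8bit"` to `TrainingArguments` (the CLI flag form `--optim adamw_bnb_8bit` is what the tweet shows), with bitsandbytes installed.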