Jonas Færch Lotz Profile
Jonas Færch Lotz

@jonasflotz

Followers
107
Following
368
Media
2
Statuses
28

PhD student @coastalcph @DIKU_Institut and @RockwoolFonden

Joined February 2021
@NielsRogge
Niels Rogge
1 month
For people thinking that DeepSeek-OCR is the first model to render text as images, the University of Copenhagen already did this in 2023. The paper is called "Language Modelling with Pixels". They trained a Masked AutoEncoder (MAE) by rendering text as images and masking patches.
24
57
541
@jonasflotz
Jonas Færch Lotz
6 months
We propose to consider multiple aspects of tokenizer behavior, beyond text compression alone, and combine these to predict the best-performing tokenizer for machine translation. Thanks a ton to the co-authors António V. Lopes, @PeitzStephan, @hndrstwn, and @EmiliLeonardo
1
0
0
@jonasflotz
Jonas Færch Lotz
6 months
Happy to share that our work on evaluating tokenizers has been accepted to #ACL2025! This is one of two projects I worked on during my internship @Apple.
1
2
3
@ilker_kesen
İlker Kesen @ EurIPS
6 months
Announcing our recent work “Multilingual Pretraining for Pixel Language Models”! We introduce PIXEL-M4, a pixel language model pretrained on four visually & linguistically diverse scripts: English, Hindi, Ukrainian & Simplified Chinese. #NLProc
1
3
12
@HEI
Natural Language Processing Papers
8 months
Overcoming Vocabulary Constraints with Pixel-level Fallback.
0
2
2
@rust_phillip
Phillip Rust
2 years
Introducing “Towards Privacy-Aware Sign Language Translation at Scale”. We leverage self-supervised pretraining on anonymized videos, achieving SOTA ASL-to-English translation performance while mitigating risks arising from biometric data. 📄: https://t.co/hMY6eFo46D 🧵(1/9)
1
7
20
@YovaKem_v2
Yova Kementchedjhieva
2 years
✴ Hiring a Postdoctoral Researcher ✴ I am hiring a postdoc with a background in *vision and language processing*, on a 2/3 year contract. Application deadline: 15 Feb 2024. Start: ASAP. Apply here: https://t.co/j4mw2GZ3VJ and contact me here or via email. #NLProc #hiring
1
20
47
@delliott
Desmond Elliott
2 years
In Text Rendering Strategies for Pixel Language Models with @jonasflotz @rust_phillip and @esalesk, we design new text renderers for visual language processing to improve performance or to squeeze the model down to just 22M parameters. See it in Poster Session 2 on Friday 8th.
1
4
15
@arankomatsuzaki
Aran Komatsuzaki
2 years
Text Rendering Strategies for Pixel Language Models: simple character bigram rendering brings improved performance, which makes a model perform on par with a 3x larger model https://t.co/kzU77g6vLO
1
29
112
@_akhaliq
AK
2 years
Text Rendering Strategies for Pixel Language Models paper page: https://t.co/9F4Axat2fn Pixel-based language models process text rendered as images, which allows them to handle any script, making them a promising approach to open vocabulary language modelling. However, recent
0
10
55
@jhuclsp
JHU CLSP
2 years
“Text Rendering Strategies for Pixel Language Models” Draft link: coming soon! By @jonasflotz, @esalesk, Phillip Rust, Desmond Elliott. TLDR: Compressing the input space of pixel-based language models improves performance and is more parameter-efficient.
1
1
5
@delliott
Desmond Elliott
2 years
I am hiring up to two Ph.D students to work on tokenization-free language modelling at the University of Copenhagen. Find out more about our group: https://t.co/F4ya0a2UfY and apply by November 15th and list me as a potential supervisor.
@ELLISforEurope
ELLIS
2 years
The portal is open: Our #ELLISPhD Program is now accepting applications! Apply by November 15 to work with leading #AI labs across Europe and choose your advisors among 200 top #machinelearning researchers! #JoinELLISforEurope #PhD #PhDProgram #ML
1
26
74
@EnguehardJoseph
Joseph Enguehard
2 years
@AlexisLitvine @ENSdeLyon Fascinating presentation by @jonasflotz on date recognition in historical parish records - part of a whole session devoted to advances in data extraction from tabular documents with T. Paquet, C.M. Dahl, @AlexisLitvine, still #AdvancedMethods Workshop @ENSdeLyon
0
1
4
@EnguehardJoseph
Joseph Enguehard
2 years
Looking forward to the Advanced Data Methods Workshop @ENSdeLyon on September 14-15! An opportunity to keep abreast of the latest research on cutting-edge topics such as remote sensing, OCR/HTR, table and map recognition. Details and registration:
enguehard.tf
Forthcoming Workshop on Advanced Methods for Data Collection and Use at ENS de Lyon!
1
7
11
@delliott
Desmond Elliott
2 years
I'm hiring a postdoc for 18 months to work with us on tokenization-free NLP in Copenhagen. Applications are due by August 31: https://t.co/Bbefd9jxij. Don't hesitate to reach out with informal inquiries.
1
28
82
@delliott
Desmond Elliott
3 years
Looking forward to seeing everyone in Kigali for #ICLR2023! @rust_phillip will give his oral on the PIXEL LM on Wednesday in Oral 6 Track 5, 1500-1510. @esalesk will give a talk about Visual Text Representations at AfricaNLP on Friday at 1045-1125.
2
11
53
@NCRRCBP
Carsten B Pedersen
3 years
1/n Familial aggregation studies of diseases rely on familial information linked with health records. We investigated the completeness of familial links in the Danish Civil Registration System and described the future Danish Multi-Generation Register https://t.co/605MiK1O2c
journals.sagepub.com
Aim: Linking information on family members in the Danish Civil Registration System (CRS) with information in Danish national registers provides unique possibili...
2
6
19
@RockwoolFonden
ROCKWOOL Fonden
3 years
We have developed a new tool for anyone who wants to learn more about prosperity and inequality in Denmark. The tool is free and shows developments in income, health, and education at the national and municipal level. Dive into the numbers at https://t.co/Qb4jC6nDml #dkforsk #dkøko
0
9
16
@delliott
Desmond Elliott
3 years
📢 I am hiring a postdoc to join our project on pixel-based natural language processing. The position is based in Copenhagen 🇩🇰 for 18 months. Applications are due by March 29 https://t.co/ZvQtCoWXgH. Informal inquiries are welcome.
@delliott
Desmond Elliott
3 years
Thrilled to receive a grant from @VILLUMFONDEN to carry out blue-skies research on tokenization-free NLP https://t.co/yBRt2L3KgE I will hire Ph.Ds and Postdocs to build up the group so feel free to reach out. We're starting off with a paper at #ICLR2023 https://t.co/xwt7tpI2n6
0
20
32