Tomasz Limisiewicz Profile
Tomasz Limisiewicz

@TomLimi

Followers
551
Following
1K
Media
47
Statuses
231

Postdoctoral researcher at @meta FAIR and @uwnlp. Interested in the inner workings of neural networks, multilingualism, and fairer NLP (he/him)

Seattle
Joined September 2021
@TomLimi
Tomasz Limisiewicz
8 months
Excited to continue my research adventure as a postdoc at @uwnlp and @Meta! I’ve joined @LukeZettlemoyer's fantastic lab. Together, we plan to rethink how LLMs perceive data to unlock their capabilities in uncharted languages and, further, beyond text! [🦋posting]
5
2
120
@melaniesclar
Melanie Sclar @ NeurIPS
13 days
Carmen Sandiego is heading to #NeurIPS2025 - finally, a good use for this costume! I'm on the industry job market and organizing the agents + reasoning & planning workshop. Excited to chat about research (LLM robustness, reasoning, theory of mind), and job opportunities. DM me!
4
15
132
@hila_gonen
Hila Gonen
1 month
Considering a PhD/MSc in NLP? I’m hiring students this cycle! If you are passionate about making language models reliable and safe, eager about understanding and controlling language models, and would like to add to your research some multilingual flavor - apply to my group! 👇
16
102
737
@Yen_Ju_Lu
Yen-Ju Lu
2 months
🚀 Introducing the Latent Speech-Text Transformer (LST) — a speech-text model that organizes speech tokens into latent patches for better text→speech transfer, enabling steeper scaling laws and more efficient multimodal training ⚡️ Paper 📄 https://t.co/4nUsbC1YKF
7
16
34
@JulieKallini
Julie Kallini ✨
2 months
New paper! 🌈 In English, pie = 🥧. In Spanish, pie = 🦶. Multilingual tokenizers often share such overlapping tokens between languages. Do these “False Friends” hurt or help multilingual LMs? We find that overlap consistently improves transfer—even when it seems misleading. 🧵
1
21
100
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
3 months
🎥 Videos of our invited talks and the panel discussion are now also available on YouTube: https://t.co/fFbH7kYkpZ ▶️
youtube.com
Tokenization Workshop (TokShop) https://tokenization-workshop.github.io 1st edition co-located with ICML 2025: https://icml.cc/virtual/2025/workshop/39998
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
4 months
🎥 Videos from our Tokenization Workshop are now live! Watch invited talks, panel discussions, and the best paper presentation at https://t.co/Sc3KWHOS5r #ICML2025 #Tokenization #NLProc #LLMs
0
3
6
@TomLimi
Tomasz Limisiewicz
4 months
+ all the poster spiels recorded live.
0
0
0
@TomLimi
Tomasz Limisiewicz
4 months
TokShop videos are finally out! 🎥🤩 Check out the great talks from @yuvalpi (Join them? Beat them? Fix them?), @delliott (Pixel LM), and @AdrianLancucki (dynamic segmentation). Plus a panel with hot takes 🔥 from: @alisawuffles @_albertgu @yuvalpi @magikarp_tokens @kroscoo
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
4 months
🎥 Videos from our Tokenization Workshop are now live! Watch invited talks, panel discussions, and the best paper presentation at https://t.co/Sc3KWHOS5r #ICML2025 #Tokenization #NLProc #LLMs
1
2
12
@TomLimi
Tomasz Limisiewicz
4 months
BPE tokenization has been a safe bet for language models for almost 10 years now. 😮 So cool to see the status quo being challenged by yet another lab in recent weeks! 🔥
@Aleph__Alpha
Aleph Alpha
4 months
Introducing two new tokenizer-free LLM checkpoints from our research lab: TFree-HAT 7B. Built on our Hierarchical Autoregressive Transformer (HAT) architecture, these models achieve top-tier German and English performance while processing text on a UTF-8 byte level.
0
0
10
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
5 months
🏆 Announcing our Best Paper Awards! 🥇 Winner: "BPE Stays on SCRIPT: Structured Encoding for Robust Multilingual Pretokenization" https://t.co/m84BWBuY46 🥈 Runner-up: "One-D-Piece: Image Tokenizer Meets Quality-Controllable Compression" https://t.co/mrha4PrK4c Congrats! 🎉
0
4
17
@soldni
Luca Soldaini 🎀
5 months
most controversial statement so far from @alisawuffles: "tokenization research is not as cool" **very vocal disagreement from crowd of tokenization nerds**
@valentina__py
Valentina Pyatkin
5 months
🔥tokenization panel!
3
4
58
@TomLimi
Tomasz Limisiewicz
5 months
Panel on Future of Tokenization is happening now in Meeting 111-112. With: @alisawuffles @_albertgu @yuvalpi @magikarp_tokens @kroscoo Moderated by: @esalesky
0
2
29
@kroscoo
Kris Cao
5 months
Full house at the @tokshop2025 tokenization workshop at #ICML2025 today!
0
3
33
@TomLimi
Tomasz Limisiewicz
5 months
Check out the Byte Latent Transformer poster at @tokshop2025. It’s just a foretaste of the main presentation coming soon at @aclmeeting from @ArtidoroPagnoni!
1
7
78
@TomLimi
Tomasz Limisiewicz
5 months
Happening now in Meeting 112-113 at @icmlconf!
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
5 months
Three invited speakers will share their insights at TokShop! Hear from Yuval Pinter @yuvalpi, Desmond Elliott @delliott, and Adrian Łańcucki @AdrianLancucki on cutting-edge tokenization research. Don't miss these keynote presentations! #ICML2025 https://t.co/yAwjLwyvaV
1
0
2
@TomLimi
Tomasz Limisiewicz
5 months
Looking forward to our panel at 3:30. We’ll talk about the future of tokenization: BLT, SuperBPE @alisawuffles, H-nets @_albertgu, and further breakthroughs in tokenization @yuvalpi @magikarp_tokens @kroscoo https://t.co/0aruzPfekj
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
5 months
🎤 Meet our expert panelists! Join Albert Gu, Alisa Liu, Kris Cao, Sander Land, and Yuval Pinter as they discuss the Future of Tokenization on July 18 at 3:30 PM at TokShop at #ICML2025.
0
0
4
@TomLimi
Tomasz Limisiewicz
5 months
It’d be great to meet at the Tokenization Workshop @tokshop2025 @icmlconf tomorrow, July 18, starting at 8:45 in Meeting 112-113!
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
5 months
The TokShop schedule is now live! Join us at #ICML2025 for invited talks, poster sessions, and a panel on the future of tokenization. https://t.co/UCdWdobEgh #Tokenization #LLM #NLProc
1
1
9
@TomLimi
Tomasz Limisiewicz
5 months
I'm pleased to be in Vancouver for @icmlconf this week 🇨🇦🤖. I'll be happy to chat about multilingual, multimodal LMs and tokenization(free).
2
3
93
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
5 months
🎤 Meet our expert panelists! Join Albert Gu, Alisa Liu, Kris Cao, Sander Land, and Yuval Pinter as they discuss the Future of Tokenization on July 18 at 3:30 PM at TokShop at #ICML2025.
0
10
38
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
6 months
Got a good tokenization paper under review at COLM, but the scores were a letdown? 😬 Why bother with rebuttal when the perfect venue is right around the corner! Submit your paper to the #ICML2025 Tokenization Workshop (TokShop) by May 30! 🚀
0
6
7
@tokshop2025
Tokenization Workshop (TokShop) @ICML2025
7 months
📝 Submit papers (up to 9 pages, shorter submission) via OpenReview: https://t.co/eX4ACk7oxf 🗓️ Important dates: Deadline: May 30, 2025; Notifications: June 9, 2025; Workshop: July 18, 2025. Both archival and non-archival options available! #ICML2025 #TokShop #ML #NLProc
openreview.net
Welcome to the OpenReview homepage for ICML 2025 Workshop TokShop
0
3
3