Phillip Rust Profile
Phillip Rust

@rust_phillip

Followers: 393 · Following: 872 · Media: 7 · Statuses: 50

Research Scientist @AIatMeta (FAIR) • PhD @coastalcph

Paris, France
Joined July 2020
@rust_phillip
Phillip Rust
3 years
Happy to share that our paper on language modelling with pixels has been accepted to ICLR'23 (notable-top-5% / oral) 🎉. Big thanks and congrats to Team-PIXEL @jonasflotz @ebugliarello @esalesk @mdlhx @delliott, and looking forward to presenting in Kigali! 🌍 #ICLR2023
@ebugliarello
Emanuele Bugliarello
3 years
Tired of tokenizers/subwords? Check out PIXEL, a new language model that processes written text as images📸 “Language Modelling with Pixels” 📄 https://t.co/pmp7Yvhx9W 🧑‍💻 https://t.co/RbMemZOpub 🤖 https://t.co/J80eju62eB by @rust_phillip @jonasflotz me @esalesk @mdlhx @delliott
9
34
231
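To make the text-as-image idea concrete, here is a minimal sketch, assuming PIL and NumPy, of rendering a string onto a grayscale canvas and cutting it into the patch sequence a ViT-style encoder would consume. The canvas size, font, and 16x16 patches are illustrative choices, not PIXEL's actual renderer settings.

from PIL import Image, ImageDraw, ImageFont
import numpy as np

def render_text(text: str, height: int = 16, width: int = 528) -> np.ndarray:
    """Render a string onto a fixed-size white grayscale canvas."""
    img = Image.new("L", (width, height), color=255)
    ImageDraw.Draw(img).text((0, 0), text, fill=0, font=ImageFont.load_default())
    return np.asarray(img, dtype=np.float32) / 255.0

def patchify(img: np.ndarray, patch: int = 16) -> np.ndarray:
    """Cut the rendered image into non-overlapping square patches:
    the 'visual tokens' the encoder consumes instead of subwords."""
    h, w = img.shape
    cols = w // patch
    return img[:patch, : cols * patch].reshape(patch, cols, patch).transpose(1, 0, 2)

patches = patchify(render_text("Tired of tokenizers?"))
print(patches.shape)  # (33, 16, 16): a sequence of 33 image patches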
@yilin_sung
Yi Lin Sung
1 month
Tough week! I was also impacted less than 3 months after joining. Ironically, I had just landed some new RL infra features the day before. Life moves on. My past work spans RL, PEFT, quantization, and multimodal LLMs. If your team is working in these areas, I'd love to connect.
@cuijiaxun
Jiaxun Cui 🐿️ ✈️ NeurIPS
1 month
Meta has gone crazy on the squid game! Many new PhD NGs (new grads) were deactivated today (I am also impacted 🥲, happy to chat)
42
63
504
@awinyimgprocess
Jinpeng Wang
1 month
Humans see text — but LLMs don't. I wrote a short blog post exploring how models can perceive text visually rather than tokenize it: 🔗 https://t.co/lk5Usj1WqT From PIXEL, CLIPPO, VisInContext, VIST to DeepSeek-OCR, this is a quick story of how vision-centric modeling is…
csu-jpg.github.io
People read visually, not symbolically. Visual tokens and vision-centric MLLMs point to the next paradigm.
8
41
220
@rust_phillip
Phillip Rust
1 year
I will be presenting this work in-person at ACL 🇹🇭 this week. Drop by if you'd like to chat!
Oral: Today (Monday) 16:30
Poster: Tuesday (Tomorrow) 10:30 - 12:00
@rust_phillip
Phillip Rust
2 years
Introducing “Towards Privacy-Aware Sign Language Translation at Scale” We leverage self-supervised pretraining on anonymized videos, achieving SOTA ASL-to-English translation performance while mitigating risks arising from biometric data. 📄: https://t.co/hMY6eFo46D 🧵(1/9)
0
1
21
@rust_phillip
Phillip Rust
2 years
This project is a collaboration with my amazing peers and mentors during my internship @AIatMeta: Bowen Shi, @skylrwang, @ncihancamgoz @j_maillard. ⭐ 🧵(9/9)
0
0
5
@rust_phillip
Phillip Rust
2 years
For more experiments and all the details, check out our arXiv preprint linked above. We are working on releasing our code and data, so stay tuned! 👨‍💻 🧵(8/9)
1
0
2
@rust_phillip
Phillip Rust
2 years
We also highlight the importance of pretraining on longer video clips to learn long-range spatio-temporal dependencies 🎬➡️🧠. Even when controlling for the number of video tokens seen, we observe a large boost in performance by scaling from 16 to 128 frames 🚀. 🧵(7/9)
1
0
2
@rust_phillip
Phillip Rust
2 years
Face blurring incurs a loss of linguistic information in sign languages, leading to performance degradation. We show that such information, when lost during anonymized pretraining, can largely be recovered during finetuning. An effective privacy-performance trade-off ⚖️! 🧵(6/9)
1
0
2
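As a concrete illustration of the anonymization step, a minimal face-blurring sketch, assuming OpenCV's bundled Haar cascade; the paper's actual anonymization pipeline may differ.

import cv2

def blur_faces(frame):
    """Detect faces in a BGR video frame and Gaussian-blur each region."""
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        # Kernel size controls blur strength; it must be odd.
        frame[y:y+h, x:x+w] = cv2.GaussianBlur(frame[y:y+h, x:x+w], (51, 51), 0)
    return frame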
@rust_phillip
Phillip Rust
2 years
Our best models outperform the prior SOTA for ASL-to-English translation on How2Sign by over 3 BLEU in both the finetuned and zero-shot settings 🥇. 🧵(5/9)
1
0
2
@rust_phillip
Phillip Rust
2 years
🌐 Optionally, an intermediate language-supervised pretraining (LSP) objective can help bridge the modality gap between sign language video inputs and text outputs. 🧵(4/9)
1
0
2
@rust_phillip
Phillip Rust
2 years
Our method, SSVP-SLT, consists of:
🎥 Self-supervised video pretraining (SSVP) on anonymized, unannotated videos to learn high-quality continuous sign language representations.
🎯 Supervised finetuning on a curated SLT dataset to learn translation-specific information.
🧵(3/9)
1
0
2
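A hypothetical skeleton of that two-stage recipe; every name and body below is a placeholder for exposition, not the released SSVP-SLT code.

def anonymize(videos):
    # Placeholder: e.g. blur faces in every frame (see the sketch above).
    return videos

def pretrain_ssvp(videos):
    # Placeholder: self-supervised video pretraining on unannotated clips,
    # yielding continuous sign language representations.
    return {"encoder": "ssvp-pretrained"}

def finetune_slt(encoder, dataset):
    # Placeholder: supervised finetuning for ASL-to-English translation.
    # Optional language-supervised pretraining (LSP, tweet 4/9) would sit
    # between the two stages.
    return {"model": encoder, "task": "slt"}

unlabeled_videos, slt_dataset = [], []
encoder = pretrain_ssvp(anonymize(unlabeled_videos))  # Stage 1
model = finetune_slt(encoder, slt_dataset)            # Stage 2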
@rust_phillip
Phillip Rust
2 years
Training data scarcity and privacy risks are huge issues in sign language translation (SLT). Our approach is designed to be:
🚀 scalable (by enabling training on unlabeled data)
🎭 privacy-aware (through anonymization)
🧵(2/9)
1
0
2
@rust_phillip
Phillip Rust
2 years
Introducing “Towards Privacy-Aware Sign Language Translation at Scale” We leverage self-supervised pretraining on anonymized videos, achieving SOTA ASL-to-English translation performance while mitigating risks arising from biometric data. 📄: https://t.co/hMY6eFo46D 🧵(1/9)
1
7
20
@gaotianyu1350
Tianyu Gao
2 years
New preprint "Improving Language Understanding from Screenshots" w/ @zwcolin @AdithyaNLP @danqi_chen. We improve language understanding abilities of screenshot LMs, an emerging family of models that processes everything (including text) via visual inputs https://t.co/Qr9h8EHjUv
6
45
186
@delliott
Desmond Elliott
2 years
In PHD: Pixel-Based Language Modeling of Historical Documents with @NadavBorenstein @rust_phillip and @IAugenstein, we apply pixel language models to processing historical documents as well as to more standard NLP classification tasks. See it in Poster Session 6 on Sunday 10th.
1
5
21
@delliott
Desmond Elliott
2 years
In Text Rendering Strategies for Pixel Language Models with @jonasflotz @rust_phillip and @esalesk, we design new text renderers for visual language processing to improve performance or to squeeze the model down to just 22M parameters. See it in Poster Session 2 on Friday 8th.
1
4
15
@yoavgo
(((ل()(ل() 'yoav))))👾
2 years
anon policy survey is out: https://t.co/hMYxg6EXE0
1
32
41
@AIatMeta
AI at Meta
2 years
Introducing SeamlessM4T, the first all-in-one, multilingual multimodal translation model. This single model can perform tasks across speech-to-text, speech-to-speech, text-to-text translation & speech recognition for up to 100 languages depending on the task. Details ⬇️
54
428
2K
@delliott
Desmond Elliott
3 years
📢 I am hiring a postdoc to join our project on pixel-based natural language processing. The position is based in Copenhagen 🇩🇰 for 18 months. Applications are due by March 29 https://t.co/ZvQtCoWXgH. Informal inquiries are welcome.
@delliott
Desmond Elliott
3 years
Thrilled to receive a grant from @VILLUMFONDEN to carry out blue-skies research on tokenization-free NLP https://t.co/yBRt2L3KgE I will hire PhDs and postdocs to build up the group, so feel free to reach out. We're starting off with a paper at #ICLR2023 https://t.co/xwt7tpI2n6
0
20
32