
Ethan Gotlieb Wilcox
@weGotlieb
Followers
1K
Following
508
Media
39
Statuses
224
Asst. Prof. of Computational Linguistics @Georgetown. Formerly Postdoc @ETH, PhD @Harvard Ling, MIT Brain & Cog Sci. Language, Computers, Cognition.
Joined June 2012
Papers on pretraining, small models, cognitive modelling, etc. are all welcome at the BabyLM workshop. Submissions are due on the 15th of Aug, either direct from ARR or as full papers. Submission links: https://t.co/P1QuAbSdNt
Honored to have received a Senior Area Chair award at #ACL2025 for our Prosodic Typology paper. Huge shout out to the whole team: @CuiDing_CL, @tpimentelms, @a_stadt, @tamaregev!
It has also been nominated for best paper title based on a musical or song, although I hear competition is steep this year 🎵
Check out Xiulin's paper at #ACL2025 for new results on LMs learning "impossible" and typologically unattested languages!
🥳Thrilled to present our #ACL2025 paper with @t_aoyam, @yuekun_yao, and @weGotlieb on language modeling! Can LMs’ learning dynamics distinguish typologically attested from unattested and from truly impossible languages? We test this across 12 languages and find: yes, but…👀
To be presented at #ACL2025NLP on Wednesday, 30/7 at 11am! See Mario's great thread for more details on the paper.
What if the way we communicate is shaped by a hidden harmonic rhythm? In our #ACL2025NLP paper, we explore the hypothesis that the components of the information contour of a document vary periodically, with periods that correspond to the boundaries of structural units.
🌟🌟This paper will appear at @aclmeeting 2025! New updated version is on arXiv: https://t.co/uCYGu74nus 🌟🌟
⭐🗣️New preprint out: 🗣️⭐ “Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent” with @CuiDing_CL, Giovanni Acampa, @tpimentelms, @a_stadt, @tamaregev: https://t.co/uCYGu74nus
This project connects to some other recent papers seeking to cast typological variation in information-theoretic terms, with shout-outs to @MichaelaSocolof @postylem, @rljfutrell ( https://t.co/U2rmCNw37G) and @Julius_Steuer ( https://t.co/jfAiIjvHmA)
aclanthology.org
Julius Steuer, Johann-Mattis List, Badr M. Abdullah, Dietrich Klakow. Proceedings of the 5th Workshop on Research in Computational Linguistic Typology and Multilingual NLP. 2023.
⭐⭐This paper also makes several technical contributions to the mixed-pair mutual information estimation pipeline of Wolf et al. (https://t.co/GMQ42JHRM8). Shout out to @CuiDing_CL for all her hard work on this aspect of the paper! ⭐⭐
aclanthology.org
Lukas Wolf, Tiago Pimentel, Evelina Fedorenko, Ryan Cotterell, Alex Warstadt, Ethan Wilcox, Tamar Regev. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
✅In line with our prediction, we find that mutual information is higher in tonal languages than in non-tonal languages. BUT, the way one represents context is important. When full sentential context is taken into account (mBERT and mGPT), the distinction collapses.
🌏🌍We test this prediction by estimating mutual information in an audio dataset of 10 different languages across 6 language families. 🌏🌍
We propose a way to do so using …📡information theory.📡 In tonal languages, pitch reduces uncertainty about lexical identity; therefore, the mutual information between pitch contours and words should be higher.
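The paper itself uses a mixed-pair estimator over continuous pitch contours (per Wolf et al.), but the intuition can be sketched with a toy plug-in estimate over discretized pitch: if pitch level predicts the word, I(word; pitch) is high; if pitch is independent of the word, it is zero. The function name and the `tonal`/`nontonal` data below are purely illustrative, not from the paper.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)                 # empirical joint counts
    px = Counter(x for x, _ in pairs)      # marginal counts of X
    py = Counter(y for _, y in pairs)      # marginal counts of Y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p_xy * log2( p_xy / (p_x * p_y) ), with counts cancelled against n
        mi += p_xy * log2(p_xy * n * n / (px[x] * py[y]))
    return mi

# Toy "tonal" setting: pitch level fully determines the word.
tonal = [("ma1", "high"), ("ma2", "low")] * 50
# Toy "non-tonal" setting: pitch level is independent of the word.
nontonal = [("cat", "high"), ("cat", "low"), ("dog", "high"), ("dog", "low")] * 25

print(mutual_information(tonal))     # → 1.0 bit
print(mutual_information(nontonal))  # → 0.0 bits
```

Real pitch contours are continuous, so the paper's estimator avoids this crude discretization; the toy version only shows why higher MI is the signature of lexically contrastive tone.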
🌐But there are intermediate languages, which have lexically contrastive tone, but only sporadically, making some linguists doubt the tonal/non-tonal dichotomy. So, how can we measure how “tonal” a language is? 🧐🧐
🌏 Different languages use pitch in different ways. 🌏 Tonal languages, like Cantonese, use it to make lexical distinctions. 📖 While others, like English, use it for other functions, like marking whether or not a sentence is a question. ❓
⭐🗣️New preprint out: 🗣️⭐ “Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent” with @CuiDing_CL, Giovanni Acampa, @tpimentelms, @a_stadt, @tamaregev: https://t.co/uCYGu74nus
I’ll also use this as a way to plug human-scale language modeling in the wild: This year’s BabyLM eval pipeline was just released last week at https://t.co/iuZyPt77he. For more info on BabyLM, head to
github.com
babylm/evaluation-pipeline-2025
Couldn’t be happier to have co-authored this with a stellar team, including: @michahu8, @amuuueller, @a_stadt, @LChoshen, @ChengxuZhuang, @adinamwilliams, @ryandcotterell, @tallinzen
This version includes 😱New analyses 😱new arguments 😱 and a whole new “Looking Forward” section! If you’re interested in what a team of (psycho) computational linguists thinks the future will hold, check out our brand new Section 8!
📣Paper Update 📣It’s bigger! It’s better! Even if the language models aren’t. 🤖New version of “Bigger is not always Better: The importance of human-scale language modeling for psycholinguistics”
osf.io
Neural network language models can learn a surprising amount about language by predicting upcoming words in a corpus.
Excited to share our preprint "Using MoTR to probe agreement errors in Russian"! w/ @CuiDing_CL, @weGotlieb, Z. Fuchs. Link: https://t.co/OYbFOztWYJ 1- We provide moderate evidence that processing of agreement errors is modulated by agreement type (internal vs. external agreement).
We are expecting🫄 a 3rd BabyLM👶, as a workshop @emnlpmeeting. Kept: everything. New: an Interaction (education, agentic) track, and workshop papers. More in 🧵