
Clara Isabel Meister
@clara__meister
Followers: 2K · Following: 74 · Media: 10 · Statuses: 124
Post-doc teaching a continuing studies program at ETH Zurich. Still figuring out how Twitter works... 🤦‍♀️
Zurich, Switzerland
Joined June 2019
The first Zurich Robotics event is in 7 days! RSVP now for September 24th at the @ETH_AI_Center: Barnabas Gavin Cangan (@gavincangan, ETHZ) on why robot hands are so hard, and Caterina Caccavella (ZHAW / ETHZ) on bio-inspired active sensing. Link below.
1
4
14
Is it just me, or did Claude Code get a lot worse in the last month...
1
0
1
We hope our insights and opinions can help shape ongoing discussions about future research in generative AI! Joint work with many great authors, including @LauraManduchi @kpandey008 @StephanMandt @vincefort
0
0
1
Beyond these discussions, the paper also includes extensive pointers to relevant work—including surveys and key papers across subfields. We’ve continuously updated it to reflect the latest developments. It can thus 🤞be a valuable resource for just about anyone working in gen AI.
1
0
0
Core argument: scaling alone won’t deliver a “perfect” generative model. We highlight promising methods towards (1) broadening adaptability (robustness, causal/assumption-aware methods), (2) improving efficiency & evaluation, (3) addressing ethics (misinfo, privacy, fairness)
1
0
0
* The current landscape of generative models
* Open technical challenges and research gaps
* Implications for fairness, safety, and regulation
* Opportunities for impactful future research
1
0
0
Generative AI has made huge progress, but we still lack sufficient understanding of its capabilities, limitations and potential societal impacts. This collaborative position paper (sparked by the Dagstuhl Seminar on Challenges + Perspectives in Deep Generative Modeling) examines:
1
0
0
Exciting news! Our paper "On the Challenges and Opportunities in Generative AI" has been accepted to TMLR 2025. 📄
arxiv.org
The field of deep generative modeling has grown rapidly in the last few years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning...
1
2
10
PRs and recommendations for improvement very welcome!!
0
0
3
I've recently been fascinated by tokenization, a research area in NLP where I still think there's lots of headway to be made! In an effort to encourage research, I made a small tokenizer eval suite (intrinsic metrics) with some features I found missing elsewhere:
github.com/cimeister/tokenizer-analysis-suite
4
15
160
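Not the API of the linked repo, just a hedged illustration of the kind of intrinsic metrics such a suite can report: compression rate (tokens per character) and fertility (tokens per word), computed for any tokenizer that exposes an encode function. The function names and the toy tokenizer are illustrative.

```python
# Minimal sketch of two common intrinsic tokenizer metrics, assuming any
# tokenizer that can be wrapped as an `encode(text) -> list` callable.
# This is NOT the API of cimeister/tokenizer-analysis-suite, only an
# illustration of the kind of statistics such a suite reports.
from typing import Callable, Iterable


def compression_rate(encode: Callable[[str], list], texts: Iterable[str]) -> float:
    """Average number of tokens per character (lower = better compression)."""
    n_tokens = n_chars = 0
    for text in texts:
        n_tokens += len(encode(text))
        n_chars += len(text)
    return n_tokens / max(n_chars, 1)


def fertility(encode: Callable[[str], list], texts: Iterable[str]) -> float:
    """Average number of tokens per whitespace-separated word."""
    n_tokens = n_words = 0
    for text in texts:
        n_tokens += len(encode(text))
        n_words += len(text.split())
    return n_tokens / max(n_words, 1)


if __name__ == "__main__":
    # Toy "tokenizer": splits on whitespace, then into 3-character chunks.
    def toy_encode(text: str) -> list:
        return [w[i:i + 3] for w in text.split() for i in range(0, len(w), 3)]

    corpus = ["tokenization is surprisingly subtle", "hello world"]
    print(f"tokens/char: {compression_rate(toy_encode, corpus):.3f}")
    print(f"tokens/word: {fertility(toy_encode, corpus):.3f}")
```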
In short, Parity-aware BPE = minimal overhead + clear fairness gains. If you care about multilingual robustness, tokenization is low-hanging fruit. Joint work with @negarforoutan @DebjitPaul2 @joelniklaus @sina_ahm @ABosselut @RicoSennrich
1
0
5
What’s even more exciting: low- and medium-resource languages benefit the most. We see better vocabulary utilization and compression rates for these languages, highlighting the effectiveness of our approach in providing fairer language allocation.
1
0
7
Empirical results: the Gini coefficient of tokenizer disparity (0 means a tokenizer's compression rates are equal across languages) improves by ~83%, while global compression stays very similar. On downstream task accuracy, improvements outnumber declines across configurations.
1
0
6
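For readers unfamiliar with the metric, here is a hedged sketch of a Gini coefficient computed over per-language compression rates; the per-language rates below are invented for illustration and are not numbers from the paper.

```python
# Gini coefficient over per-language compression rates (tokens per character).
# 0 means all languages are compressed equally; larger values mean more
# disparity. The rates below are made up for illustration only.

def gini(values: list) -> float:
    """Gini coefficient via the sorted-values formulation."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = sum_i (2i - n - 1) * x_(i) / (n * sum(x)), with i = 1..n over sorted x.
    return sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs)) / (n * total)


# Hypothetical tokens-per-character rates for four languages.
standard_bpe = {"en": 0.22, "de": 0.26, "sw": 0.45, "am": 0.61}
parity_bpe = {"en": 0.24, "de": 0.27, "sw": 0.30, "am": 0.33}

for name, rates in [("standard BPE", standard_bpe), ("parity-aware BPE", parity_bpe)]:
    print(f"{name:>18}: Gini = {gini(list(rates.values())):.3f}")
```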
It’s a drop-in replacement in existing systems that introduces minimal training-time overhead: if you already use a BPE tokenizer, formats and tokenization/detokenization at inference are unchanged. You just need language-labeled multilingual corpora and a multi-parallel dev set.
1
0
5
What changes from classical BPE? Only a small part of training. We compute frequency stats per language → when choosing the next merge, we pick it from the stats of the language with the worst compression rate, rather than from global stats. Everything else stays the same!
1
0
8
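A minimal sketch of that merge-selection rule, assuming toy corpora and character-level initialization; this is not the authors' implementation of Parity-aware BPE, only an illustration of picking each merge from the worst-compressed language's pair statistics.

```python
# Illustrative-only sketch of the merge-selection rule described in the tweet
# above: keep per-language pair counts and, at each step, pick the next merge
# from the language with the currently worst compression (most tokens per
# character). Corpora and the merge budget are toy values.
from collections import Counter

corpora = {
    "en": ["the cat sat on the mat", "the dog ate the bone"],
    "sw": ["paka ameketi juu ya mkeka", "mbwa alikula mfupa"],
}

# Character-level start: each word becomes a sequence of single characters.
words = {lang: [w for s in sents for w in s.split()] for lang, sents in corpora.items()}
tokenized = {lang: [list(w) for w in ws] for lang, ws in words.items()}
n_chars = {lang: sum(len(w) for w in ws) for lang, ws in words.items()}


def pair_counts(seqs):
    """Count adjacent token pairs within words."""
    counts = Counter()
    for seq in seqs:
        counts.update(zip(seq, seq[1:]))
    return counts


def apply_merge(seqs, pair):
    """Replace every occurrence of `pair` with its concatenation."""
    merged = []
    for seq in seqs:
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(seq[i] + seq[i + 1])
                i += 2
            else:
                out.append(seq[i])
                i += 1
        merged.append(out)
    return merged


merges = []
for _ in range(20):  # toy vocabulary budget
    # Worst-off language = highest tokens-per-character ratio.
    worst = max(tokenized, key=lambda lang: sum(len(s) for s in tokenized[lang]) / n_chars[lang])
    counts = pair_counts(tokenized[worst])
    if not counts:
        break
    pair = counts.most_common(1)[0][0]  # most frequent pair in the worst-off language
    merges.append(pair)
    # As in standard BPE, the chosen merge is then applied to every language.
    tokenized = {lang: apply_merge(seqs, pair) for lang, seqs in tokenized.items()}

print(merges)
```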
🚨New Preprint! In multilingual models, the same meaning can take far more tokens in some languages, penalizing users of underrepresented languages with worse performance and higher API costs. Our Parity-aware BPE algorithm is a step toward addressing this issue: 🧵
5
30
283
A string may get 17 times less probability if tokenised as two symbols (e.g., ⟨he, llo⟩) than as one (e.g., ⟨hello⟩)—by an LM trained from scratch in each situation! Our #acl2025nlp paper proposes an observational method to estimate this causal effect! Longer thread soon!
3
24
136
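To make the comparison concrete, here is a toy sketch of the quantity in question: the probability an LM assigns to the same string under two tokenizations. The conditional log-probabilities are made up; the paper's contribution is an observational estimator of this causal effect, which the sketch does not implement.

```python
# Toy illustration of comparing the probability of one string under two
# tokenizations. The per-token conditional log-probabilities below are
# invented; in the paper, each tokenization would correspond to an LM
# trained from scratch on that segmentation.
import math

# Hypothetical conditional log-probabilities (natural log), keyed by prefix.
logp_single = {("hello",): -6.2}                    # ⟨hello⟩ as one token
logp_split = {("he",): -4.5, ("he", "llo"): -4.5}   # ⟨he, llo⟩ as two tokens


def seq_logprob(tokens, table):
    """Sum of conditional log-probs log p(t_i | t_<i) from a lookup table."""
    total = 0.0
    for i in range(len(tokens)):
        total += table[tuple(tokens[: i + 1])]
    return total


lp_one = seq_logprob(["hello"], logp_single)
lp_two = seq_logprob(["he", "llo"], logp_split)
print(f"p(⟨hello⟩) / p(⟨he, llo⟩) ≈ {math.exp(lp_one - lp_two):.1f}x")
```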
Do you want to quantify your model’s counterfactual memorisation using only observational data? Our #ACL2024NLP paper proposes an efficient method to do it :) No interventions required! You can also see how memorisation evolves across training! Check out Pietro's🧵for details :)
Happy to share our #ACL2024 paper: "Causal Estimation of Memorisation Profiles" 🎉 Drawing from econometrics, we propose a principled and efficient method to estimate memorisation using only observational data! See 🧵 +@clara__meister, Thomas Hofmann, @vlachos_nlp, @tpimentelms
0
3
35
Super excited and grateful that our paper received the best paper award at #ACL2024 🎉 Huge thanks to my fantastic co-authors — @clara__meister, Thomas Hofmann, @vlachos_nlp, and @tpimentelms — the reviewers that recommended our paper, and the award committee #ACL2024NLP
Happy to share our #ACL2024 paper: "Causal Estimation of Memorisation Profiles" 🎉 Drawing from econometrics, we propose a principled and efficient method to estimate memorisation using only observational data! See 🧵 +@clara__meister, Thomas Hofmann, @vlachos_nlp, @tpimentelms
7
7
76