maieutic
@maieuticlab
Followers: 23 · Following: 11 · Media: 0 · Statuses: 11
This is the official account of the Maieutic Lab at JHU. We broadly work on Multilingual NLP and AI. The photos are what OpenAI's DALL-E "thinks" we are...
Baltimore, MD
Joined November 2022
Super excited for the 1st Workshop on Multilingual Data Quality Signals (WMDQS), happening at #COLM2025 tomorrow. We are focused on looking all the way back at the web data that goes into all your LLMs and how we can do better with multilingual data. Stop by! @COLM_conf
Check out our new lab website! https://t.co/ejVaEe4Gcy We are always looking for new collaborations, so reach out if you like one of our projects.
Congrats to Dr. Xu @fe1ixxu on a successful thesis defense of "Minimizing Language Interference for Multilingual Models". The thesis covered 9 of his first-author papers encompassing: Modeling (https://t.co/VWXCvZyUrG, https://t.co/pOIq2jpoeg, https://t.co/9HXuUhBXXz)
3 years, and I am officially Dr. Xu now!! Big thanks to my advisors: @kentonmurray and Philipp Koehn. I couldn't have achieved this without you!
Introducing X-ALMA: a 50-language multilingual machine translation model. Its average translation performance outperforms other multilingual models (including ones focused on fewer languages), pushing against the curse of multilinguality.
Multilingual models are usually heavily skewed in favor of high-resource languages. We change this with X-ALMA: an LLM-based translator committed to ensuring top-tier performance across 50 diverse languages, regardless of their resource levels! Paper: https://t.co/O4M5LDGdAB
📢When LLMs solve tasks with a mid-to-low resource input/target language, their output quality is poor. We know that. But can we pin down what breaks inside the LLM? We introduce the 💥translation barrier hypothesis💥 for failed multilingual generation. https://t.co/VnrOWdNPr8
A nice summary of some of our work at NAACL earlier this year for a non-technical audience.
A new study from Johns Hopkins researchers @nikhilsksharma, @ZiangXiao, and @kentonmurray finds that multilingual #AI privileges dominant languages, deepening divides rather than democratizing access to information. Read more:
عسلامة (hello)! As part of IWSLT 2023, we are hosting a Tunisian-to-English Speech Translation Shared Task and would love people to participate. The evaluation campaign runs April 1st-15th. More details can be found here: https://t.co/0GbDB8cNFr Yaishek (thank you)!
iwslt.org
Home of the IWSLT conference and SIGSLT.
How to fine-tune a model when you have close to no labeled data? Check out our new paper "Language Agnostic Code-Mixing Data Augmentation by Predicting Linguistic Patterns" which introduces a zero-cost code-mixing generation method for sentiment analysis: https://t.co/FC9goOJcRI
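The idea of generating synthetic code-mixed text can be illustrated with a minimal sketch. This is not the paper's method (which predicts linguistic patterns to decide where to mix); it is a naive baseline that swaps tokens via a toy bilingual lexicon, with the lexicon entries being invented for illustration:

```python
import random

# Hypothetical toy English -> Spanish lexicon (illustrative only, not from the paper).
LEXICON = {"movie": "película", "good": "buena", "very": "muy"}

def code_mix(tokens, lexicon, p=0.5, seed=0):
    """Randomly replace tokens that have a lexicon translation with
    probability p, yielding synthetic code-mixed training sentences."""
    rng = random.Random(seed)
    return [lexicon[t] if t in lexicon and rng.random() < p else t
            for t in tokens]

print(code_mix("the movie was very good".split(), LEXICON, p=1.0))
# -> ['the', 'película', 'was', 'muy', 'buena']
```

The augmented sentences can then be mixed into the original labeled data when fine-tuning a sentiment classifier; the labels carry over unchanged since only surface forms are swapped.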
Check out our #EMNLP2022 paper on improving the isomorphism of word embedding spaces for bilingual lexicon induction! Led by the awesome @cheeesio
The ability to extract accurate translation dictionaries from monolingual embedding spaces depends critically on their geometric similarity, or "degree of isomorphism." We address this root cause of faulty X-lingual mapping with ✨IsoVec✨ https://t.co/kOJ6DOEYU4
#EMNLP2022 🧵1/N
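For context on why isomorphism matters: the standard orthogonal Procrustes step maps one embedding space onto another and only succeeds when the two spaces are near-isometric (IsoVec's contribution is training the embeddings themselves to be so; the sketch below is the generic mapping step, not the paper's code):

```python
import numpy as np

def procrustes(X, Y):
    """Orthogonal W minimizing ||XW - Y||_F, the classic solution for
    mapping source embeddings X onto target embeddings Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

rng = np.random.default_rng(0)
Y = rng.normal(size=(100, 4))                   # toy "target-language" embeddings
R = np.linalg.qr(rng.normal(size=(4, 4)))[0]    # hidden rotation
X = Y @ R.T                                     # "source" space = rotated target
W = procrustes(X, Y)
print(np.allclose(X @ W, Y))                    # exact recovery here, since the
                                                # spaces are perfectly isomorphic
```

When the two spaces are not geometrically similar, no orthogonal W can align them well, which is exactly the failure mode the tweet describes.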
How easy is it to get big gains for your model? Just pass inputs through the model multiple times and minimize the difference between the passes! The secret is more balanced parameter contribution. Check out our #EMNLP2022 paper "The Importance of Being Parameters: An Intra-Distillation Method for Serious Gains"!
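The "multiple passes, minimize their difference" idea can be sketched in a few lines. This is a toy illustration under my own assumptions, not the paper's exact objective: a tiny linear "model" with weight dropout, so repeated passes disagree, and a loss that penalizes that disagreement:

```python
import random

def forward(x, w, p_drop, rng):
    """Toy linear model with weight dropout, so stochastic passes differ."""
    kept = [wi if rng.random() > p_drop else 0.0 for wi in w]
    return sum(k * xi for k, xi in zip(kept, x)) / (1.0 - p_drop)

def intra_distillation_loss(x, w, k=3, p_drop=0.5, seed=0):
    """Run K stochastic passes and penalize their variance. Minimizing
    this pushes every parameter to carry its share of the prediction,
    since the output must survive any of them being dropped."""
    rng = random.Random(seed)
    outs = [forward(x, w, p_drop, rng) for _ in range(k)]
    mean = sum(outs) / k
    return sum((o - mean) ** 2 for o in outs) / k
```

In training, this term would be added to the usual task loss; with dropout disabled the passes coincide and the term vanishes.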
Our profile picture is DALL-E's interpretation of "Multilingual Artificial Intelligence" and the banner image is "Scientists researching speech and language at Johns Hopkins University in the style of El Greco"