Constanza Fierro

@constanzafierro

500 Followers · 2K Following · 24 Media · 418 Statuses

PhD fellow @coastalcph doing NLP things. Ex SWE @Google 🇫🇷🥖 and student @dccuchile 🇨🇱. I also like sports, beer, reading, and photography.

Joined April 2010
@constanzafierro
Constanza Fierro
3 years
@coastalcph on the coast 🇮🇪
0 replies · 2 reposts · 16 likes
@constanzafierro
Constanza Fierro
9 months
Heading to Miami to present this work 👇🌴 See you at the poster session on Tuesday 12th, 11-12:30, if you want to chat! #EMNLP2024
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 7/🧵 Finally, our findings suggest that while the relation and subject representations in these models are multilingual and can generalize across languages 🗺, the object extraction phase appears to be language-specific 📍.
0 replies · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 6/🧵 And what about language encoding? 🔍 Our activation patching experiments reveal two phases in XGLM: first, the relation flows to the last token, followed by the language flow. But in mT5, relation and language flow together through similar layers.
1 reply · 0 reposts · 2 likes
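For readers who haven't seen activation patching before, here is a minimal sketch of the idea (gpt2 stands in for XGLM, which exposes its decoder blocks similarly; the prompts, layer, and position are illustrative assumptions, not the paper's setup): cache a hidden state from a clean run, splice it into a run on a corrupted prompt, and measure how much of the original prediction it carries.

```python
# Minimal activation-patching sketch; illustrative, not the paper's code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

clean = tok("The capital of France is", return_tensors="pt")
corrupt = tok("The capital of Poland is", return_tensors="pt")
paris = tok(" Paris", add_special_tokens=False)["input_ids"][0]

LAYER, POS = 6, -1                 # patch this block's output at the last token
block = model.transformer.h[LAYER]
cache = {}

def save(module, inputs, output):
    cache["h"] = output[0][:, POS].detach()   # cache the clean hidden state

def patch(module, inputs, output):
    output[0][:, POS] = cache["h"]            # splice it into the corrupt run
    return output

handle = block.register_forward_hook(save)
with torch.no_grad():
    model(**clean)                            # clean run fills the cache
handle.remove()

handle = block.register_forward_hook(patch)
with torch.no_grad():
    logits = model(**corrupt).logits          # corrupt run with the patch
handle.remove()

print("P(' Paris') after patch:", logits[0, -1].softmax(-1)[paris].item())
```

Sweeping LAYER and POS over all layers and token positions is what yields the kind of two-phase flow picture described above.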
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 5/🧵 At the final prediction stage (when the object is extracted), XGLM uses both feed-forward and attention sub-layers, unlike English models, where attention dominates. In mT5, cross-attention plays a central role, but later in the process.
1 reply · 0 reposts · 0 likes
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 4/🧵 We also tracked the flow of information. As in English-only decoder models, the subject token propagates to the last token's representation at the end, while non-subject tokens are attended to throughout all the layers.
1 reply · 0 reposts · 0 likes
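A cheap way to eyeball this kind of flow is to read attention weights directly (the paper's analysis is interventional; this sketch, with gpt2 and a made-up prompt, only shows the descriptive version):

```python
# Sketch: how much does the last token attend to the subject at each layer?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The Eiffel Tower is located in the city of", return_tensors="pt")
subj = tok(" Eiffel Tower", add_special_tokens=False)["input_ids"]

# Locate the subject's token positions inside the prompt.
toks = ids["input_ids"][0].tolist()
start = next(i for i in range(len(toks)) if toks[i:i + len(subj)] == subj)
positions = list(range(start, start + len(subj)))

with torch.no_grad():
    out = model(**ids, output_attentions=True)

for layer, attn in enumerate(out.attentions):   # (batch, heads, query, key)
    mass = attn[0, :, -1, positions].sum(-1).mean().item()
    print(f"layer {layer:2d}: last-token attention on subject = {mass:.3f}")
```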
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 3/🧵 Remember the “early site” found in GPT (Meng et al., 2022) and Mamba (Sharma et al., 2024)? Interestingly, we don't find it in XGLM, and in mT5 we observe a causal effect of the MLP layers across all encoder layers.
1 reply · 0 reposts · 0 likes
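In the spirit of that result, a toy per-layer probe of the MLP sub-layers (gpt2 again as a stand-in; we corrupt the subject in the prompt text rather than noising its embeddings as Meng et al. do, and the prompts are made up):

```python
# Patch each layer's clean MLP output (last token) into a corrupted-subject
# run and watch the probability of the correct object. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

clean = tok("The Space Needle is located in the city of", return_tensors="pt")
corrupt = tok("The Space Tower is located in the city of", return_tensors="pt")
seattle = tok(" Seattle", add_special_tokens=False)["input_ids"][0]

store = {}

def save(module, inputs, output):
    store["mlp"] = output[:, -1].detach()   # clean MLP output, last token

def patch(module, inputs, output):
    output[:, -1] = store["mlp"]            # overwrite with the clean value
    return output

for layer, block in enumerate(model.transformer.h):
    h = block.mlp.register_forward_hook(save)
    with torch.no_grad():
        model(**clean)
    h.remove()
    h = block.mlp.register_forward_hook(patch)
    with torch.no_grad():
        logits = model(**corrupt).logits
    h.remove()
    p = logits[0, -1].softmax(-1)[seattle].item()
    print(f"layer {layer:2d}: P(' Seattle') with clean MLP = {p:.4f}")
```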
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 2/🧵 We investigated two multilingual LLMs, XGLM (decoder-only) and mT5 (encoder-decoder), across 10 diverse languages. Spoiler alert: we found notable architecture-dependent differences 🧐.
1 reply · 0 reposts · 0 likes
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 1/🧵 Previous studies have focused on how English LLMs encode and recall knowledge. But does this process change in a multilingual LLM?
1 reply · 0 reposts · 0 likes
@constanzafierro
Constanza Fierro
10 months
How do multilingual models store and retrieve factual knowledge in different languages? And do these mechanisms vary across languages? 🗺️ In our newly released paper, we explore these questions! 📄 🤗 @negarforoutan @delliott 🧵👇
2 replies · 8 reposts · 82 likes
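The kind of query behind all of this is easy to reproduce at home; a minimal sketch (facebook/xglm-564M is the smallest public XGLM checkpoint, and the prompts are made up, not the paper's data):

```python
# Sketch: ask a multilingual LM the same fact in two languages and compare.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/xglm-564M")
model = AutoModelForCausalLM.from_pretrained("facebook/xglm-564M")

prompts = {
    "en": "The capital of Chile is",
    "es": "La capital de Chile es",
}
for lang, prompt in prompts.items():
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=4, do_sample=False)
    completion = tok.decode(out[0][ids["input_ids"].shape[1]:],
                            skip_special_tokens=True)
    print(lang, "->", completion)
```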
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 6/🧵 We hope that the connection we provide to epistemology can inform and motivate better evaluations and claims regarding knowledge in LLMs 🚀.
0 replies · 0 reposts · 2 likes
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 5/🧵 We also ask philosophers and computer scientists for their opinion on this matter: can we even say that an LLM knows a fact? We find some intriguing discrepancies between the two groups 🌗.
1 reply · 0 reposts · 2 likes
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 4/🧵 Using the definitions, we identify gaps in how current NLP research conceptualizes knowledge compared to epistemology 🔍. We also highlight open questions and new research opportunities to improve knowledge evaluations.
1 reply · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 3/🧵 Philosophers have thought about and debated “knowledge” for a long time. So we look to epistemological research for definitions of knowledge 🧐 and interpret them in the context of LLMs 📝.
1 reply · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 2/🧵 However, inconsistencies may arise with paraphrases or with logically related facts 💥. Would it really know that “Lionel Messi plays for Inter Miami” if it fails to predict where Lionel Messi resides? ⚽️.
1 reply · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 1/🧵 LLMs seem to know a lot of facts about the world: GPT-4 correctly completes “The capital of Germany is ___” with “Berlin”, so it must know that, right?
1 reply · 0 reposts · 1 like
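This probe, and the consistency check from 2/, fit in a few lines; a sketch with gpt2 standing in for GPT-4 (whose weights are not public; the prompts are illustrative):

```python
# Fill-in-the-blank knowledge probe plus a paraphrase-consistency check.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def complete(prompt, n=3):
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=n, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0][ids["input_ids"].shape[1]:]).strip()

# The bare probe: does the model fill in the fact?
print(complete("The capital of Germany is"))

# The consistency check: a model that knows a fact should also answer
# paraphrases and logically related queries correctly.
a = complete("The capital of Germany is")
b = complete("Berlin is the capital of")
print("consistent:", "Berlin" in a and "Germany" in b)
```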
@constanzafierro
Constanza Fierro
11 months
Does GPT-4 truly “know” that the Earth is round? 🤔 First, some definitions 💡 “Defining Knowledge: Bridging Epistemology and Large Language Models” 📄🗣️ #EMNLP2024, w/ @eclecticruchira @filippoSt @nGarneau
2 replies · 4 reposts · 13 likes
@constanzafierro
Constanza Fierro
1 year
And thanks to all my co-authors who could not attend in person: Reinald Kim Amplayo, @nicola_decao, @maynez_joshua, Shashi Narayan, and Mirella Lapata.
0 replies · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
1 year
Happy to have presented at #ACL2024 🇹🇭, along with @fantinehuot, our work on generating text with citations! Check the paper here 👉🏼
2 replies · 5 reposts · 43 likes
@constanzafierro
Constanza Fierro
1 year
RT @JIAANGLI: Do Vision and Language Models Share Concepts? 🤔👀🧠 We present an empirical evaluation and find that language models partially…
0 replies · 4 reposts · 0 likes
@constanzafierro
Constanza Fierro
1 year
Thanks to my coauthors ❤️ @YovaKem_v2 @ebugliarello, and for the feedback from @coastalcph.
0 replies · 0 reposts · 2 likes