Constanza Fierro

@constanzafierro

500 Followers · 2K Following · 24 Media · 418 Statuses

PhD fellow @coastalcph doing NLP things. Ex SWE @Google 🇫🇷🥖 and student @dccuchile 🇨🇱. I also like sports, beer, reading, and photography.

Joined April 2010
@constanzafierro
Constanza Fierro
3 years
@coastalcph on the coast 🇮🇪
0 replies · 2 reposts · 16 likes
@constanzafierro
Constanza Fierro
9 months
Heading to Miami to present this work 👇🌴 See you at the poster session on Tuesday 12th, 11-12:30, if you want to chat! #EMNLP2024
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 7/🧵 Finally, our findings suggest that while the relation and subject representations in these models are multilingual and can generalize across languages 🗺, the object extraction phase appears to be language-specific 📍.
0 replies · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 6/🧵 And what about language encoding? 🔍 Our activation patching experiments reveal two phases in XGLM: first, the relation flows to the last token, followed by the language flow. But in mT5, relation and language flow together through similar layers.
1 reply · 0 reposts · 2 likes
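For readers who haven't seen activation patching before, here is a minimal sketch of the idea (gpt2 stands in for XGLM, which exposes its decoder blocks similarly; the prompts, layer, and position are illustrative assumptions, not the paper's setup): cache a hidden state from a clean run, splice it into a run on a corrupted prompt, and measure how much of the original prediction it carries.

```python
# Minimal activation-patching sketch; illustrative, not the paper's code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

clean = tok("The capital of France is", return_tensors="pt")
corrupt = tok("The capital of Poland is", return_tensors="pt")
paris = tok(" Paris", add_special_tokens=False)["input_ids"][0]

LAYER, POS = 6, -1                 # patch this block's output at the last token
block = model.transformer.h[LAYER]
cache = {}

def save(module, inputs, output):
    cache["h"] = output[0][:, POS].detach()   # cache the clean hidden state

def patch(module, inputs, output):
    output[0][:, POS] = cache["h"]            # splice it into the corrupt run
    return output

handle = block.register_forward_hook(save)
with torch.no_grad():
    model(**clean)                            # clean run fills the cache
handle.remove()

handle = block.register_forward_hook(patch)
with torch.no_grad():
    logits = model(**corrupt).logits          # corrupt run with the patch
handle.remove()

print("P(' Paris') after patch:", logits[0, -1].softmax(-1)[paris].item())
```

Sweeping LAYER and POS over all layers and token positions is what yields the kind of two-phase flow picture described above.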
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 5/🧵 At the final prediction stage (when the object is extracted), XGLM uses both feed-forward and attention sub-layers, unlike English models, where attention dominates. In mT5, cross-attention plays a central role, but later in the process.
1 reply · 0 reposts · 0 likes
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 4/🧵 We also tracked the flow of information. As in English-only decoder models, the subject token propagates to the last token's representation at the end, while non-subject tokens are attended to throughout all the layers.
1 reply · 0 reposts · 0 likes
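A cheap way to eyeball this kind of flow is to read attention weights directly (the paper's analysis is interventional; this sketch, with gpt2 and a made-up prompt, only shows the descriptive version):

```python
# Sketch: how much does the last token attend to the subject at each layer?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("The Eiffel Tower is located in the city of", return_tensors="pt")
subj = tok(" Eiffel Tower", add_special_tokens=False)["input_ids"]

# Locate the subject's token positions inside the prompt.
toks = ids["input_ids"][0].tolist()
start = next(i for i in range(len(toks)) if toks[i:i + len(subj)] == subj)
positions = list(range(start, start + len(subj)))

with torch.no_grad():
    out = model(**ids, output_attentions=True)

for layer, attn in enumerate(out.attentions):   # (batch, heads, query, key)
    mass = attn[0, :, -1, positions].sum(-1).mean().item()
    print(f"layer {layer:2d}: last-token attention on subject = {mass:.3f}")
```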
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 3/🧵 Remember the “early site” found in GPT (Meng et al., 2022) and Mamba (Sharma et al., 2024)? Interestingly, we don't find it in XGLM, and in mT5 we observe a causal effect of the MLP layers across all encoder layers.
1 reply · 0 reposts · 0 likes
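In the spirit of that result, a toy per-layer probe of the MLP sub-layers (gpt2 again as a stand-in; we corrupt the subject in the prompt text rather than noising its embeddings as Meng et al. do, and the prompts are made up):

```python
# Patch each layer's clean MLP output (last token) into a corrupted-subject
# run and watch the probability of the correct object. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

clean = tok("The Space Needle is located in the city of", return_tensors="pt")
corrupt = tok("The Space Tower is located in the city of", return_tensors="pt")
seattle = tok(" Seattle", add_special_tokens=False)["input_ids"][0]

store = {}

def save(module, inputs, output):
    store["mlp"] = output[:, -1].detach()   # clean MLP output, last token

def patch(module, inputs, output):
    output[:, -1] = store["mlp"]            # overwrite with the clean value
    return output

for layer, block in enumerate(model.transformer.h):
    h = block.mlp.register_forward_hook(save)
    with torch.no_grad():
        model(**clean)
    h.remove()
    h = block.mlp.register_forward_hook(patch)
    with torch.no_grad():
        logits = model(**corrupt).logits
    h.remove()
    p = logits[0, -1].softmax(-1)[seattle].item()
    print(f"layer {layer:2d}: P(' Seattle') with clean MLP = {p:.4f}")
```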
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 2/🧵 We investigated two multilingual LLMs, XGLM (decoder-only) and mT5 (encoder-decoder), across 10 diverse languages. Spoiler alert: we found notable architecture-dependent differences 🧐.
1 reply · 0 reposts · 0 likes
@constanzafierro
Constanza Fierro
10 months
@negarforoutan @delliott 1/🧵 Previous studies have focused on how English LLMs encode and recall knowledge. But does this process change in a multilingual LLM?
1 reply · 0 reposts · 0 likes
@constanzafierro
Constanza Fierro
10 months
How do multilingual models store and retrieve factual knowledge in different languages? And do these mechanisms vary across languages? 🗺️ In our newly released paper, we explore these questions! 📄 🤗 @negarforoutan @delliott 🧵👇
2 replies · 8 reposts · 82 likes
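The kind of query behind all of this is easy to reproduce at home; a minimal sketch (facebook/xglm-564M is the smallest public XGLM checkpoint, and the prompts are made up, not the paper's data):

```python
# Sketch: ask a multilingual LM the same fact in two languages and compare.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("facebook/xglm-564M")
model = AutoModelForCausalLM.from_pretrained("facebook/xglm-564M")

prompts = {
    "en": "The capital of Chile is",
    "es": "La capital de Chile es",
}
for lang, prompt in prompts.items():
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=4, do_sample=False)
    completion = tok.decode(out[0][ids["input_ids"].shape[1]:],
                            skip_special_tokens=True)
    print(lang, "->", completion)
```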
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 6/🧵 We hope that the connection we provide to epistemology can inform and motivate better evaluations and claims regarding knowledge in LLMs 🚀.
0 replies · 0 reposts · 2 likes
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 5/🧵 We also ask philosophers and computer scientists for their opinion on this matter: can we even say that an LLM knows a fact? We find some intriguing discrepancies between the two groups 🌗.
1 reply · 0 reposts · 2 likes
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 4/🧵 Using the definitions, we identify gaps in how current NLP research conceptualizes knowledge compared to epistemology 🔍. We also highlight open questions and new research opportunities to improve knowledge evaluations.
1 reply · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 3/🧵 Philosophers have thought about and debated “knowledge” for a long time. So we look to epistemological research for definitions of knowledge 🧐 and interpret them in the context of LLMs 📝.
1 reply · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 2/🧵 However, inconsistencies may arise with paraphrases or with logically related facts 💥. Would it really know that “Lionel Messi plays for Inter Miami” if it fails to predict where Lionel Messi resides? ⚽️.
1 reply · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
11 months
@eclecticruchira @filippoSt @nGarneau 1/🧵 LLMs seem to know a lot of facts about the world: GPT-4 correctly completes “The capital of Germany is ___” with “Berlin”, so it must know that, right?
1 reply · 0 reposts · 1 like
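This probe, and the consistency check from 2/, fit in a few lines; a sketch with gpt2 standing in for GPT-4 (whose weights are not public; the prompts are illustrative):

```python
# Fill-in-the-blank knowledge probe plus a paraphrase-consistency check.
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def complete(prompt, n=3):
    ids = tok(prompt, return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=n, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    return tok.decode(out[0][ids["input_ids"].shape[1]:]).strip()

# The bare probe: does the model fill in the fact?
print(complete("The capital of Germany is"))

# The consistency check: a model that knows a fact should also answer
# paraphrases and logically related queries correctly.
a = complete("The capital of Germany is")
b = complete("Berlin is the capital of")
print("consistent:", "Berlin" in a and "Germany" in b)
```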
@constanzafierro
Constanza Fierro
11 months
Does GPT-4 truly “know” that the Earth is round? 🤔 First, some definitions 💡 “Defining Knowledge: Bridging Epistemology and Large Language Models” 📄🗣️ #EMNLP2024, w/ @eclecticruchira @filippoSt @nGarneau
2 replies · 4 reposts · 13 likes
@constanzafierro
Constanza Fierro
1 year
And thanks to all my co-authors who could not attend in person: Reinald Kim Amplayo, @nicola_decao, @maynez_joshua, Shashi Narayan, and Mirella Lapata.
0 replies · 0 reposts · 1 like
@constanzafierro
Constanza Fierro
1 year
Happy to have presented at #ACL2024 🇹🇭, along with @fantinehuot, our work on generating text with citations! Check the paper here 👉🏼
2 replies · 5 reposts · 43 likes
@constanzafierro
Constanza Fierro
1 year
RT @JIAANGLI: Do Vision and Language Models Share Concepts? 🤔👀🧠 We present an empirical evaluation and find that language models partially…
0 replies · 4 reposts · 0 likes
@constanzafierro
Constanza Fierro
1 year
Thanks to my coauthors ❤️ @YovaKem_v2 @ebugliarello, and for the feedback from @coastalcph.
0 replies · 0 reposts · 2 likes