
Manuel Fernández
@manuFernandezBu
Followers: 51 · Following: 33 · Media: 5 · Statuses: 17
New preprint from @manuFernandezBu and team: What Large Language Models Know About Plant Molecular Biology https://t.co/nScN223kp2 Manuel Fernandez Burda, Lucia Ferrero, Nicolás Gaggion, Camille Fonouni-Farde, MoBiPlant Consortium, Martín Crespi, Federico Ariel, Enzo Ferrante
biorxiv.org
Large language models (LLMs) are rapidly permeating scientific research, yet their capabilities in plant molecular biology remain largely uncharacterized. Here, we present MOBIPLANT, the first...
1
3
8
🧬🌱New preprint! MoBiPlant is a benchmark built with 112 experts to test how LLMs understand plant molecular biology. Great collaboration driven by @manuFernandezBu @enzoferrante @arg_epilab. Happy to have brought a small contribution! Preprint and detailed thread below:
Large language models are reshaping the way we do science, but how well do they actually understand plant molecular biology? ➡️We created MoBiPlant to answer this. 📝Preprint: https://t.co/EhQf4YofLw 💾Dataset: https://t.co/D1B5lBj0UR (Thread below)
0
3
11
🌱🧬New preprint: What do LLMs know about Plant Molecular Biology? Take a look at MoBiPlant, our new benchmark to measure LLM performance, built by more than 100 plant scientists from 19 countries! Amazing collaboration! 🚀 More info in this thread 👇 https://t.co/xujvRjUFj0
3
41
112
🪴 📷🤖New preprint: What do LLMs know about Plant Molecular Biology? Introducing MoBiPlant, our new benchmark to measure LLM performance, built by more than 100 plant scientists from 19 countries! 👇
0
2
4
🪴 🤖 New preprint: What do LLMs know about Plant Molecular Biology? Take a look at MoBiPlant, our new benchmark to measure LLM performance, built by more than 100 plant scientists from 19 countries! More info in this thread 👇 https://t.co/DlRYZCX00x
1
12
45
Grateful to be part of this collective effort — huge thanks to everyone involved! 🌱✨ 🙌 Special thanks to Enzo @enzoferrante and Fede @arg_epilab for guiding the work, and to Nico @ngaggion and Luci @luviferrero for building this side by side.
0
0
7
🤔LLMs tend to choose the first option. Popular LLMs hit 75%+ on our MCQs, but many default to option A when unsure, consistent with previous literature. We quantify this with shuffled answer-order permutations and show that, although some models are more robust, most exhibit option bias.
1
0
3
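The preprint's evaluation code isn't quoted here, so the following is only a minimal Python sketch of what such a shuffle test can look like; option_bias_test, ask_model, the toy question, and the reported metrics are illustrative assumptions rather than MoBiPlant's actual pipeline.

import random
from collections import Counter

def option_bias_test(ask_model, question, options, answer, n_shuffles=20, seed=0):
    """Re-ask the same MCQ under shuffled option orders and record
    (a) how often the model picks the first position ("A") and
    (b) how often it answers correctly.
    `ask_model(question, options) -> int` must return the index of the chosen option."""
    rng = random.Random(seed)
    picks = Counter()
    correct = 0
    for _ in range(n_shuffles):
        shuffled = options[:]
        rng.shuffle(shuffled)
        choice = ask_model(question, shuffled)      # position the model picked (0 == "A")
        picks[choice] += 1
        correct += shuffled[choice] == answer       # correctness is order-invariant
    return {
        "share_of_A_picks": picks[0] / n_shuffles,  # ~1/len(options) for an unbiased model
        "accuracy_under_shuffles": correct / n_shuffles,
    }

# Toy "always answer A" model: the test should report share_of_A_picks == 1.0
always_a = lambda question, options: 0
print(option_bias_test(
    always_a,
    "Which phytohormone mediates stomatal closure under drought stress?",
    ["Abscisic acid", "Auxin", "Gibberellin", "Cytokinin"],
    answer="Abscisic acid",
))

A more robust model would keep accuracy_under_shuffles roughly constant across orderings while its share_of_A_picks stays near chance level.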
⚠️❗Expert reviews expose critical failure modes despite high MCQ scores. We uncover moderate factual alignment, frequent omissions, hallucinations, and low self-awareness. We catalog concrete pitfalls (species confusion, cross-domain bias, outdated knowledge, wrong references).
1
0
4
👀🔍LLM strength tracks the canon, not the frontier. Model performance rises when the facts being asked about come from highly cited sources, and questions drawn from review articles outscore those from research articles by ~10–15 pts.
1
0
4
🌱🤖 MoBiPlant is the first LLM benchmark for plant molecular biology. - Built by 112 experts across 19 countries. - Packs 565 expert-curated MCQs + 1,075 synthetic items. - Tests LLM knowledge spanning gene regulation to plant-environment interactions.
1
0
4
Large language models are reshaping the way we do science, but how well do they actually understand plant molecular biology? ➡️We created MoBiPlant to answer this. 📝Preprint: https://t.co/EhQf4YofLw 💾Dataset: https://t.co/D1B5lBj0UR (Thread below)
3
19
28
Very proud to introduce Kaleidoscope ✨🌿 🌍 18 languages (Bengali → Spanish) 📚 14 subjects (Humanities → STEM) 📸 55% requiring image understanding! A very important open-science collaboration that extends in-language evaluation of vision models to many more languages.
🚀 We are excited to introduce Kaleidoscope, the largest culturally-authentic exam benchmark. 📌 Most VLM benchmarks are English-centric or rely on translations—missing linguistic & cultural nuance. Kaleidoscope expands in-language multilingual 🌎 & multimodal 👀 VLM evaluation
4
30
132
Kaleidoscope is out 🌈! An in-language, multilingual, multimodal exam dataset created to evaluate VLM capabilities. This is the result of a great multi-institutional collaboration led by @CohereForAI. Special congrats to @manuFernandezBu for this first publication!
5
8
38
Many thanks to everyone involved in creating this benchmark, especially to those who carefully extracted the data. This work really brings us closer to building inclusive and culturally representative AI.
0
0
11
Check it out!👇 arXiv: https://t.co/FteIEiFdWY HF: https://t.co/R8F4zt5wKf Website (navigate through the data!):
huggingface.co
1
1
13
I'm excited to announce the release of Kaleidoscope! A multimodal multilingual benchmark composed of 20,911 real-world questions: 🗣️ 18 languages 📚 14 subjects 📸 55% multimodal questions
7
15
31