Manuel Fernández Profile
Manuel Fernández

@manuFernandezBu

Followers
51
Following
36
Media
5
Statuses
18

Buenos Aires, Argentina
Joined April 2025
@sci_plant
Plant Science
2 months
What Large Language Models Know About Plant Molecular Biology https://t.co/pjvuC9fTgi ♻️
0
4
17
@liaa_icc
Laboratorio de Inteligencia Artificial Aplicada
2 months
New preprint by @manuFernandezBu and team: What Large Language Models Know About Plant Molecular Biology https://t.co/nScN223kp2 Manuel Fernandez Burda, Lucia Ferrero, Nicolás Gaggion, Camille Fonouni-Farde, MoBiPlant Consortium, Martín Crespi, Federico Ariel, Enzo Ferrante
1
3
9
@Matt_Ben_
Matthias Benoit
2 months
🧬🌱New preprint! MoBiPlant is a benchmark built with 112 experts to test how well LLMs understand plant molecular biology. Great collaboration driven by @manuFernandezBu @enzoferrante @arg_epilab. Happy to have made a small contribution! Preprint and detailed thread below:
@manuFernandezBu
Manuel Fernández
2 months
Large language models are reshaping the way we do science, but how well do they actually understand plant molecular biology? ➡️We created MoBiPlant to answer this. 📝Preprint: https://t.co/EhQf4YofLw 💾Dataset: https://t.co/D1B5lBj0UR (Thread below)
0
5
17
@arg_epilab
Federico Ariel
2 months
🌱🧬New preprint: What do LLMs know about Plant Molecular Biology? Take a look at MoBiPlant, our new benchmark for measuring LLM performance, built by more than 100 plant scientists from 19 countries! Amazing collaboration! 🚀 More info in this thread 👇 https://t.co/xujvRjUFj0
3
39
112
@liaa_icc
Laboratorio de Inteligencia Artificial Aplicada
2 months
🪴 📷🤖New preprint: What do LLMs know about Plant Molecular Biology? We present MoBiPlant, our new benchmark for measuring LLM performance, created by more than 100 plant scientists from 19 countries! 👇
0
2
3
@enzoferrante
Enzo Ferrante
2 months
🪴 🤖 New preprint: What do LLMs know about Plant Molecular Biology? Take a look at MoBiPlant, our new benchmark for measuring LLM performance, built by more than 100 plant scientists from 19 countries! More info in this thread 👇 https://t.co/DlRYZCX00x
1
12
46
@manuFernandezBu
Manuel Fernández
2 months
Grateful to be part of this collective effort — huge thanks to everyone involved! 🌱✨ 🙌 Special thanks to Enzo @enzoferrante and Fede @arg_epilab for guiding the work, and to Nico @ngaggion and Luci @luviferrero for building this side by side.
0
0
7
@manuFernandezBu
Manuel Fernández
2 months
🤔LLMs tend to choose the first option. Popular LLMs hit 75%+ on our MCQs, but many default to option A when unsure, consistent with previous literature. We quantify this by shuffling the order of the answer options and show that, although some models are more robust, most exhibit option bias.
1
0
3
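One way to picture the shuffled-permutation check described above is the minimal sketch below. It is an illustration under stated assumptions, not the procedure used in the preprint: the toy questions, the `position_pick_rates` function, and the two stand-in "models" (`always_first`, `oracle`) are all hypothetical. The idea is to present each question many times with its options in random order and count how often each position is picked; a model with no position bias should spread its picks roughly evenly, while a biased one piles them on position 0 (option A).

```python
import random
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Question = Tuple[Sequence[str], str]    # (option texts, correct option text)
Chooser = Callable[[Sequence[str]], int]  # returns the index of the picked option

# Hypothetical toy items for illustration only; not taken from MoBiPlant.
TOY_QUESTIONS: List[Question] = [
    (("alpha", "beta", "gamma", "delta"), "gamma"),
    (("red", "green", "blue", "yellow"), "red"),
    (("north", "south", "east", "west"), "west"),
]


def position_pick_rates(
    questions: Sequence[Question],
    choose: Chooser,
    n_shuffles: int = 20,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Ask each question under random option orders.

    Returns (pick rate per position, accuracy over all presentations).
    """
    rng = random.Random(seed)
    picks: Counter = Counter()
    correct = 0
    total = 0
    for options, answer in questions:
        for _ in range(n_shuffles):
            order = list(options)
            rng.shuffle(order)          # new option order for this presentation
            idx = choose(order)
            picks[idx] += 1
            correct += int(order[idx] == answer)
            total += 1
    n_positions = max(len(options) for options, _ in questions)
    rates = [picks[i] / total for i in range(n_positions)]
    return rates, correct / total


def always_first(options: Sequence[str]) -> int:
    """Stand-in for a fully position-biased model: always picks option A."""
    return 0


_ANSWERS = {answer for _, answer in TOY_QUESTIONS}


def oracle(options: Sequence[str]) -> int:
    """Stand-in for a position-robust model: finds the correct text wherever it is."""
    return next(i for i, text in enumerate(options) if text in _ANSWERS)


if __name__ == "__main__":
    for name, model in [("always_first", always_first), ("oracle", oracle)]:
        rates, accuracy = position_pick_rates(TOY_QUESTIONS, model)
        print(name, [round(r, 2) for r in rates], round(accuracy, 2))
```

In a real evaluation, `choose` would wrap a call to the model under test and parse its answer letter. Comparing the per-position pick rates against a uniform spread, alongside accuracy under shuffling, gives a rough indication of how much a score depends on option order.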
@manuFernandezBu
Manuel Fernández
2 months
⚠️❗Expert reviews expose critical failure modes despite high MCQ scores. We uncover moderate factual alignment, frequent omissions, hallucinations, and low self-awareness. We catalog concrete pitfalls (species confusion, cross-domain bias, outdated knowledge, wrong references).
1
0
4
@manuFernandezBu
Manuel Fernández
2 months
👀🔍LLM strength tracks canon, not the frontier. We found that model performance rises when the facts being asked about come from highly cited sources, and questions drawn from review articles outscore those from research articles by ~10–15 pts.
1
0
4
@manuFernandezBu
Manuel Fernández
2 months
🌱🤖 MoBiPlant is the first benchmark of LLMs for plant molecular biology. - Built by 112 experts across 19 countries. - Packs 565 expert-curated MCQs + 1,075 synthetic items. - Tests LLM knowledge spanning gene regulation to plant-environment interactions.
1
0
4
@manuFernandezBu
Manuel Fernández
2 months
Large language models are reshaping the way we do science, but how well do they actually understand plant molecular biology? ➡️We created MoBiPlant to answer this. 📝Preprint: https://t.co/EhQf4YofLw 💾Dataset: https://t.co/D1B5lBj0UR (Thread below)
3
20
30
@sarahookr
Sara Hooker
7 months
Very proud to introduce Kaleidoscope ✨🌿 🌍 18 languages (Bengali → Spanish) 📚 14 subjects (Humanities → STEM) 📸 55% requiring image understanding! A very important open science collaboration — which extends in-language evaluation for vision models to many more languages.
@Cohere_Labs
Cohere Labs
7 months
🚀 We are excited to introduce Kaleidoscope, the largest culturally-authentic exam benchmark. 📌 Most VLM benchmarks are English-centric or rely on translations, missing linguistic & cultural nuance. Kaleidoscope expands in-language multilingual 🌎 & multimodal 👀 VLM evaluation.
4
30
132
@enzoferrante
Enzo Ferrante
7 months
Kaleidoscope is out 🌈! An in-language, multimodal, multilingual exam dataset created to evaluate VLM capabilities. This is the result of a great multi-institutional collaboration led by @CohereForAI. Special congrats to @manuFernandezBu for this first publication!
5
8
39
@manuFernandezBu
Manuel Fernández
7 months
Many thanks to everyone involved in creating this benchmark, especially to those who carefully extracted the data. This work really brings us closer to building inclusive and culturally representative AI.
0
0
11
@manuFernandezBu
Manuel Fernández
7 months
Check it out!👇 Arxiv: https://t.co/FteIEiFdWY HF: https://t.co/R8F4zt5wKf Website (navigate through the data!):
1
1
13
@manuFernandezBu
Manuel Fernández
7 months
I'm excited to announce the release of Kaleidoscope! A multimodal multilingual benchmark composed of 20,911 real-world questions: 🗣️ 18 languages 📚 14 subjects 📸 55% multimodal questions
7
15
31