Manuel Fernández Profile
Manuel Fernández

@manuFernandezBu

Followers
51
Following
33
Media
5
Statuses
17

Buenos Aires, Argentina
Joined April 2025
@liaa_icc
Laboratorio de Inteligencia Artificial Aplicada
3 days
New preprint from @manuFernandezBu and team: What Large Language Models Know About Plant Molecular Biology https://t.co/nScN223kp2 Manuel Fernandez Burda, Lucia Ferrero, Nicolás Gaggion, Camille Fonouni-Farde, MoBiPlant Consortium, Martín Crespi, Federico Ariel, Enzo Ferrante
biorxiv.org
Large language models (LLMs) are rapidly permeating scientific research, yet their capabilities in plant molecular biology remain largely uncharacterized. Here, we present MOBIPLANT, the first...
1
3
8
@Matt_Ben_
Matthias Benoit
5 days
🧬🌱New preprint! MoBiPlant is a benchmark built with 112 experts to test how LLMs understand plant molecular biology. Great collaboration driven by @manuFernandezBu @enzoferrante @arg_epilab. Happy to have made a small contribution! Preprint and detailed thread below:
@manuFernandezBu
Manuel Fernández
6 days
Large language models are reshaping the way we do science, but how well do they actually understand plant molecular biology? ➡️We created MoBiPlant to answer this. 📝Preprint: https://t.co/EhQf4YofLw 💾Dataset: https://t.co/D1B5lBj0UR (Thread below)
0
3
11
@arg_epilab
Federico Ariel
6 days
🌱🧬New preprint: What do LLMs know about Plant Molecular Biology? Take a look at MoBiPlant, our new benchmark to measure LLM performance, built by more than 100 plant scientists from 19 countries! Amazing collaboration! 🚀 More info in this thread 👇 https://t.co/xujvRjUFj0
3
41
112
@liaa_icc
Laboratorio de Inteligencia Artificial Aplicada
6 days
🪴 📷🤖New preprint: What do LLMs know about Plant Molecular Biology? We present MoBiPlant, our new benchmark for measuring LLM performance, created by more than 100 plant scientists from 19 countries! 👇
0
2
4
@enzoferrante
Enzo Ferrante
6 days
🪴 🤖 New preprint: What do LLMs know about Plant Molecular Biology? Take a look at MoBiPlant, our new benchmark to measure LLM performance, built by more than 100 plant scientists from 19 countries! More info in this thread 👇 https://t.co/DlRYZCX00x
1
12
45
@manuFernandezBu
Manuel Fernández
6 days
Grateful to be part of this collective effort — huge thanks to everyone involved! 🌱✨ 🙌 Special thanks to Enzo @enzoferrante and Fede @arg_epilab for guiding the work, and to Nico @ngaggion and Luci @luviferrero for building this side by side.
0
0
7
@manuFernandezBu
Manuel Fernández
6 days
🤔LLMs tend to choose the first option. Popular LLMs hit 75%+ on our MCQs, but many default to option A when unsure, consistent with previous literature. We quantify this by shuffling the answer options across permutations and show that, although some models are more robust, most exhibit option bias.
1
0
3
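One way the shuffled-permutation check described in the post above could look, as a minimal sketch. The function name `position_bias`, the `ask_model` callable, and the toy model and questions are all hypothetical stand-ins for illustration; none of this is the preprint's actual evaluation code.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def position_bias(
    items: Sequence[tuple[str, list[str], int]],
    ask_model: Callable[[str, list[str]], int],
    n_shuffles: int = 20,
    seed: int = 0,
) -> tuple[float, dict[int, float]]:
    """Return (accuracy, share of picks per option position) over shuffled options.

    Each item is (question, options, index of the correct option).
    `ask_model` is an assumed interface: it returns the position it picks.
    """
    rng = random.Random(seed)
    picks: Counter[int] = Counter()
    correct = 0
    total = 0
    for question, options, answer_idx in items:
        for _ in range(n_shuffles):
            # Shuffle which option appears at which position.
            order = list(range(len(options)))
            rng.shuffle(order)
            shuffled = [options[i] for i in order]
            choice = ask_model(question, shuffled)
            picks[choice] += 1
            # Map the chosen position back to the original option index.
            correct += order[choice] == answer_idx
            total += 1
    shares = {pos: count / total for pos, count in sorted(picks.items())}
    return correct / total, shares


def toy_model(question: str, options: list[str]) -> int:
    """Hypothetical model: answers one known question, else defaults to position 0."""
    known = {"Which molecule carries genetic information?": "DNA"}
    target = known.get(question)
    return options.index(target) if target in options else 0


items = [
    ("Which molecule carries genetic information?", ["DNA", "ATP", "Lignin", "Auxin"], 0),
    ("Hypothetical question the toy model cannot answer", ["w", "x", "y", "z"], 2),
]
accuracy, shares = position_bias(items, toy_model, n_shuffles=200)
print(f"accuracy={accuracy:.2f}", shares)
```

Because the correct answer lands on each position about equally often after shuffling, a share of picks at position 0 well above 1/4 (for four options) suggests the model is falling back on option A rather than on the content.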
@manuFernandezBu
Manuel Fernández
6 days
⚠️❗Expert reviews expose critical failure modes despite high MCQ scores. We uncover moderate factual alignment, frequent omissions, hallucinations, and low self-awareness. We catalog concrete pitfalls (species confusion, cross-domain bias, outdated knowledge, wrong references).
1
0
4
@manuFernandezBu
Manuel Fernández
6 days
👀🔍LLM strength tracks canon, not the frontier. We found that model performance rises when the facts being asked about come from highly cited sources, and questions drawn from review articles outscore those from research articles by ~10–15 pts.
1
0
4
@manuFernandezBu
Manuel Fernández
6 days
🌱🤖 MoBiPlant is the first benchmark of LLMs for plant molecular biology.
- Built by 112 experts across 19 countries.
- Packs 565 expert-curated MCQs + 1,075 synthetic items.
- Tests LLM knowledge spanning gene regulation to plant-environment interactions.
1
0
4
@manuFernandezBu
Manuel Fernández
6 days
Large language models are reshaping the way we do science, but how well do they actually understand plant molecular biology? ➡️We created MoBiPlant to answer this. 📝Preprint: https://t.co/EhQf4YofLw 💾Dataset: https://t.co/D1B5lBj0UR (Thread below)
3
19
28
@sarahookr
Sara Hooker
5 months
Very proud to introduce Kaleidoscope ✨🌿 🌍 18 languages (Bengali → Spanish) 📚 14 subjects (Humanities → STEM) 📸 55% requiring image understanding! A very important open science collaboration — which extends in-language evaluation for vision models to many more languages.
@Cohere_Labs
Cohere Labs
5 months
🚀 We are excited to introduce Kaleidoscope, the largest culturally-authentic exam benchmark. 📌 Most VLM benchmarks are English-centric or rely on translations, missing linguistic & cultural nuance. Kaleidoscope expands in-language multilingual 🌎 & multimodal 👀 VLM evaluation.
4
30
132
@enzoferrante
Enzo Ferrante
5 months
Kaleidoscope is out 🌈! An in-language multimodal multilingual exam dataset created to evaluate VLM capabilities. This is the result of a great multi-institutional collaboration led by @CohereForAI. Special congrats to @manuFernandezBu for this first publication!
5
8
38
@manuFernandezBu
Manuel Fernández
5 months
Many thanks to everyone involved in creating this benchmark, especially to those who carefully extracted the data. This work really brings us closer to building inclusive and culturally representative AI.
0
0
11
@manuFernandezBu
Manuel Fernández
5 months
Check it out!👇 Arxiv: https://t.co/FteIEiFdWY HF: https://t.co/R8F4zt5wKf Website (navigate through the data!):
huggingface.co
1
1
13
@manuFernandezBu
Manuel Fernández
5 months
I'm excited to announce the release of Kaleidoscope! A multimodal multilingual benchmark composed of 20,911 real-world questions: 🗣️ 18 languages 📚 14 subjects 📸 55% multimodal questions
7
15
31