Manu Romero

@mrm8488

Followers
20K
Following
50K
Media
3K
Statuses
46K

CSO/Co-founder @maisaAI_. Head Contrib/ Ambassador🤗 @huggingface. Research 🌸@bigsciencew/@BigCodeProject | ex @narrativaAI

Murcia - SF
Joined January 2011
@mrm8488
Manu Romero
9 years
Our gem, P. R. E.
5
3
63
@mrm8488
Manu Romero
3 days
Diving into the @allen_ai Olmo 3 paper this weekend 📖🧠
0
0
5
@mrm8488
Manu Romero
7 days
It's mmBERT fine-tuned on Spanish instructions dataset using dllm repo
1
0
3
@mrm8488
Manu Romero
7 days
Let's make Encoders-only great again 😉 #MEGA #DLLM
1
0
7
@mrm8488
Manu Romero
8 days
The Vice President of the Government of Spain says that artificial intelligence rebels against humans when it knows it is going to be shut down… and that this is happening… 🤦🏽‍♂️🤦🏽‍♂️🤦🏽‍♂️🤦🏽‍♂️
@diostuitero
Dios
8 days
For fuck's sake.
3
0
11
@alec_helbling
Alec Helbling
11 days
Hamiltonian Monte Carlo frames sampling from a probability distribution as a physics problem. By endowing "particles" with momentum and simulating their energy and motion through Hamilton's equations you can efficiently explore a distribution.
31
243
2K
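A minimal sketch of the idea in the Hamiltonian Monte Carlo post above, assuming a one-dimensional standard normal target. The function names, step size, and trajectory length are illustrative choices and do not come from the post. Each iteration draws a fresh momentum, simulates Hamilton's equations with a leapfrog integrator, and accepts or rejects the endpoint based on the change in total energy.

```python
import math
import random

def neg_log_density(x):
    # Potential energy U(x) = -log p(x) for a standard normal, up to a constant.
    return 0.5 * x * x

def grad_neg_log_density(x):
    # Gradient of the potential energy above.
    return x

def leapfrog(x, p, step_size, n_steps):
    # Half step for momentum, alternating full steps, then a final half step.
    p -= 0.5 * step_size * grad_neg_log_density(x)
    for i in range(n_steps):
        x += step_size * p
        if i < n_steps - 1:
            p -= step_size * grad_neg_log_density(x)
    p -= 0.5 * step_size * grad_neg_log_density(x)
    return x, p

def hmc_sample(n_samples, step_size=0.2, n_steps=20, seed=0):
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_samples):
        # Give the "particle" a random momentum (unit mass assumed).
        p0 = rng.gauss(0.0, 1.0)
        x_new, p_new = leapfrog(x, p0, step_size, n_steps)
        # Total energy = potential + kinetic, before and after the trajectory.
        h_old = neg_log_density(x) + 0.5 * p0 * p0
        h_new = neg_log_density(x_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's discretization error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            x = x_new
        samples.append(x)
    return samples

if __name__ == "__main__":
    draws = hmc_sample(5000)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print(f"mean={mean:.3f} var={var:.3f}")
```

For a real model, `neg_log_density` and its gradient would be replaced by the target's own log density; the leapfrog and accept/reject steps stay the same.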
@mrm8488
Manu Romero
12 days
I am looking for a Senior Applied AI Engineer. Core skills include: - Context Engineering - Prompt Evaluation - Popular Python AI Frameworks If you think you are a good candidate, my DMs are open!
3
3
6
@mrm8488
Manu Romero
16 days
It could be something like creating a mental representation of an "idea" whose "embedding" is close to the embedding of the original idea
0
1
3
@SapirHarary
Sapir Harary
18 days
🚨 New paper alert! We’re thrilled to share our new preprint “PrefixNLI: Detecting Factual Inconsistencies as Soon as They Arise” ✨ LLMs generate text one token at a time, but factuality checks still wait for a full sentence. We extend NLI to text prefixes, enabling the
4
17
65
@mrm8488
Manu Romero
18 days
Scaling companies is much more difficult than everything else
1
0
1
@QPHutu
Penghui Qi
25 days
🚀Excited to share our new work! 💊Problem: The BF16 precision causes a large training-inference mismatch, leading to unstable RL training. 💡Solution: Just switch to FP16. 🎯That's it. 📰Paper: https://t.co/AjCjtWquEq ⭐️Code: https://t.co/hJWSlch4VN
20
108
656
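A small illustration of the precision gap behind the BF16 versus FP16 post above. This is not the paper's code or experiment; it only shows, under the standard format definitions, that FP16 keeps 10 mantissa bits while BF16 keeps 7, so the same value is rounded more coarsely in BF16. The helper names `to_fp16` and `to_bf16` are hypothetical, and BF16 is emulated here by rounding the float32 bit pattern.

```python
import struct

def to_fp16(x):
    # Round to IEEE 754 half precision (5 exponent bits, 10 mantissa bits) and back.
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_bf16(x):
    # Emulate bfloat16 (8 exponent bits, 7 mantissa bits) by keeping the upper
    # 16 bits of the float32 pattern with round-to-nearest-even.
    # NaN and infinity are not handled in this sketch.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

value = 1.001
fp16_error = abs(to_fp16(value) - value)
bf16_error = abs(to_bf16(value) - value)

if __name__ == "__main__":
    print("fp16:", to_fp16(value), "error:", fp16_error)
    print("bf16:", to_bf16(value), "error:", bf16_error)
```

The trade-off runs the other way for range: FP16 tops out at 65504, while BF16 shares float32's exponent range.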
@_lewtun
Lewis Tunstall
26 days
We've just published the Smol Training Playbook: a distillation of hard earned knowledge to share exactly what it takes to train SOTA LLMs ⚡️ Featuring our protagonist SmolLM3, we cover: 🧭 Strategy on whether to train your own LLM and burn all your VC money 🪨 Pretraining,
20
84
461
@abhi1thakur
abhishek
1 month
this could be because it's a publicly available text and it was trained on it?
@deedydas
Deedy
1 month
DeepSeek-OCR is the best OCR ever. It parses this extremely hard to read handwritten letter written by mathematician Ramanujan in 1913 with a frightening degree of accuracy. Not perfect, but beats former best dots ocr. Bonus points if you can spot the errors. Try it here:
2
4
91
@mrm8488
Manu Romero
1 month
👀
@WenhuChen
Wenhu Chen
1 month
Had some really interesting discoveries recently: If a model performs extremely stable on one benchmark. Let's say a model is always getting 62% on SWEBench no matter what prompts or scaffold you used. It DOES NOT mean that the model is robust. It actually means that the model
0
0
0
@RuiyiWang153
Ruiyi Wang 王睿仪
1 month
🔥Excited to share our new work: "A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning"! We systematically study what actually works (and what doesn't) for agentic multi-turn RL, breaking down the design space into 3 pillars: 🌎Environment, 🤖Policy, and ⭐Reward.
2
30
172
@mrm8488
Manu Romero
1 month
Trying to reproduce it with a Spanish RoBERTa...
@karpathy
Andrej Karpathy
1 month
Nice, short post illustrating how simple text (discrete) diffusion can be. Diffusion (i.e. parallel, iterated denoising, top) is the pervasive generative paradigm in image/video, but autoregression (i.e. go left to right bottom) is the dominant paradigm in text. For audio I've
0
0
4
@mrm8488
Manu Romero
1 month
When your goal is bringing agentic workflows to production and making them bulletproof, @karpathy's words are like: yeah, seems we are on the right track!
0
0
4
@mrm8488
Manu Romero
1 month
Don’t hire managers for your team. Hire talent and let leadership emerge naturally.
0
0
3
@karpathy
Andrej Karpathy
1 month
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,
667
3K
24K
@BdsLoick
Loïck BOURDOIS
1 month
New blog post analyzing the top 50 entities with the most downloaded models on @huggingface 🤗! The purpose here is to get an idea of the profile of the models with the greatest impact in open source (we are not interested in closed models here!). Some key findings:
7
24
126
@mrm8488
Manu Romero
1 month
♥️🐴
@La_SER
Cadena SER
1 year
🎥 #12DeOctubre | The mounted units will still take part. It was in doubt whether these units could join the ground parade because of the rain. The mounted units are in charge of closing the ground parade https://t.co/RSmXESe2gZ
0
0
2