Walter Hugo Lopez Pinaya 🍍

@Warvito

Followers 1K · Following 5K · Media 49 · Statuses 3K

Senior Research Engineer @synthesiaIO | Ex-Research Fellow @KingsCollegeLon | Text-to-Video | Generative Models | Medical Imaging

London, UK
Joined October 2009
@akshay_pachaar
Akshay 🚀
2 days
Google just dropped "Attention is all you need (V2)". This paper could solve AI's biggest problem: catastrophic forgetting. When AI models learn something new, they tend to forget what they previously learned. Humans don't work this way, and now Google Research has a solution.
247
985
6K
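Below is a toy illustration of the failure mode the tweet describes, not Google's proposed fix: a small network fit sequentially on two conflicting tasks loses the first one. Architecture, targets, and step counts are all made up for the demo.

```python
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

x = torch.linspace(-1, 1, 128).unsqueeze(1)
task_a, task_b = torch.sin(3 * x), torch.cos(3 * x)  # two conflicting targets

def fit(target, steps=500):
    for _ in range(steps):
        opt.zero_grad()
        ((net(x) - target) ** 2).mean().backward()
        opt.step()

fit(task_a)
err_before = ((net(x) - task_a) ** 2).mean().item()
fit(task_b)  # sequential training on task B overwrites task A
err_after = ((net(x) - task_a) ** 2).mean().item()
print(f"task A error: {err_before:.4f} -> {err_after:.4f}")  # error grows sharply
```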
@HuggingPapers
DailyPapers
2 days
MIT introduces "Back to Basics: Let Denoising Generative Models Denoise". Shows that simple, large-patch Transformers on pixels, dubbed JiTs (Just Image Transformers), can be strong generative models.
3
81
539
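A minimal sketch of that recipe as I read the abstract (not the official JiT code): cut raw pixels into large patches, run them through a plain Transformer, and regress clean pixels per patch.

```python
import torch, torch.nn as nn

class TinyJiT(nn.Module):
    def __init__(self, img=256, patch=32, dim=512, depth=4, heads=8):
        super().__init__()
        self.patch = patch
        n_tokens = (img // patch) ** 2
        self.embed = nn.Linear(3 * patch * patch, dim)      # raw pixels in, no tokenizer
        self.pos = nn.Parameter(torch.zeros(1, n_tokens, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, depth)
        self.out = nn.Linear(dim, 3 * patch * patch)        # predict clean pixels per patch

    def forward(self, noisy):                               # noisy: (B, 3, H, W)
        p = self.patch
        x = noisy.unfold(2, p, p).unfold(3, p, p)           # (B, 3, H/p, W/p, p, p)
        x = x.permute(0, 2, 3, 1, 4, 5).flatten(3).flatten(1, 2)  # (B, N, 3*p*p)
        x = self.blocks(self.embed(x) + self.pos)
        return self.out(x)                                  # predicted clean patches

print(TinyJiT()(torch.randn(2, 3, 256, 256)).shape)         # (2, 64, 3072)
```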
@TencentHunyuan
Hunyuan
4 days
We are excited to unveil HunyuanVideo 1.5, the strongest open-source video generation model. Built upon the DiT architecture, it redefines the open-source SOTA for accessibility and performance. 🚀🚀🚀 HunyuanVideo 1.5 delivers state-of-the-art visual quality and motion coherence
39
181
1K
@IntuitMachine
Carlos E. Perez
6 days
I finally have a better understanding of Yann LeCun's JEPA approach and why he may have quit Meta! I think it might fix one of the most annoying, hacky parts of training foundation models. What if 90% of the tricks we use to train big AI models are just complicated workarounds
21
47
335
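For reference, a hedged sketch of the core JEPA idea as it is usually summarized: predict the embeddings of a target view in representation space against a frozen EMA target encoder, never reconstructing pixels. This is a simplification, not LeCun's exact architecture.

```python
import copy
import torch, torch.nn as nn

enc = nn.Sequential(nn.Linear(256, 512), nn.GELU(), nn.Linear(512, 128))    # context encoder
predictor = nn.Sequential(nn.Linear(128, 128), nn.GELU(), nn.Linear(128, 128))
target_enc = copy.deepcopy(enc)                    # EMA "teacher", never backpropped
for p in target_enc.parameters():
    p.requires_grad_(False)

x_ctx, x_tgt = torch.randn(8, 256), torch.randn(8, 256)   # two views of one sample
with torch.no_grad():
    tgt = target_enc(x_tgt)                        # targets are embeddings, not pixels
loss = (predictor(enc(x_ctx)) - tgt).pow(2).mean() # latent-space prediction loss
loss.backward()

with torch.no_grad():                              # EMA update of the target encoder
    for p, q in zip(enc.parameters(), target_enc.parameters()):
        q.mul_(0.996).add_(p, alpha=0.004)
```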
@SeunghyunSEO7
Seunghyun Seo
5 days
When using Muon, you should be careful:
1. What LR scaling rule should I use for Muon?
2. How does the scale of Muon differ from AdamW's? (it depends on 1...)
3. What's the optimal LR scale for Muon?
Understanding the spectral condition will help. https://t.co/OVeBR1NGSv https://t.co/oanCF3G4ce
@Jianlin_S
jianlin.su
5 days
Muon Optimizer Guide: Quick Start & Key Details https://t.co/n41aqjCFJU
5
16
198
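For context, a minimal sketch of Muon's core step: Newton-Schulz orthogonalization of the momentum matrix. The quintic coefficients follow the commonly circulated implementation, and the shape-dependent LR factor at the end is one proposed scaling rule, not the thread's conclusion.

```python
import torch

def newton_schulz(G, steps=5, eps=1e-7):
    # approximately orthogonalize G via a quintic Newton-Schulz iteration
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)
    transposed = G.size(0) > G.size(1)
    if transposed:
        X = X.T                                 # iterate on the wide orientation
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

W = torch.randn(512, 256)                       # a weight matrix
momentum = torch.randn_like(W)                  # its momentum buffer
lr = 0.02
# the orthogonalized update has a different RMS than AdamW's, hence the
# shape-dependent rescaling question the thread is about:
W -= lr * newton_schulz(momentum) * max(1.0, W.size(0) / W.size(1)) ** 0.5
```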
@kwangmoo_yi
Kwang Moo Yi
6 days
Vecchio et al., "Φeat: Physically-Grounded Feature Representation". A foundational backbone: a finetuned DINOv3 trained on synthetic renders of materials, with EMA student-teacher training and multiple losses.
6
48
429
@francoisfleuret
François Fleuret
6 days
I went to bed with three runs where I removed the only hacky part in my model, making it mathematically correct. This is what the Gods told me during the night to continue my journey.
6
2
138
@jiqizhixin
机器之心 JIQIZHIXIN
7 days
Huge! @TianhongLi6 & Kaiming He (inventor of ResNet) just introduced JiT (Just Image Transformers)! JiTs are simple large-patch Transformers that operate on raw pixels, with no tokenizer, pre-training, or extra losses needed. By predicting clean data on the natural-data manifold,
8
118
759
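The "predicting clean data" part can be sketched as x-prediction under a simple interpolation path; timestep conditioning is omitted and the schedule is illustrative, not the paper's exact setup.

```python
import torch

def x_pred_loss(model, x0):                     # x0: clean images (B, C, H, W)
    t = torch.rand(x0.size(0), 1, 1, 1)         # random noise level per sample
    xt = (1 - t) * x0 + t * torch.randn_like(x0)  # linear interpolation to noise
    return (model(xt) - x0).pow(2).mean()       # regress the clean image directly

model = torch.nn.Conv2d(3, 3, 3, padding=1)     # stand-in denoiser
loss = x_pred_loss(model, torch.randn(4, 3, 32, 32))
loss.backward()
```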
@LingYang_PU
Ling Yang
7 days
Thanks @_akhaliq for introducing our MMaDA-Parallel ( https://t.co/IbsuR9m4tH), Parallel Multimodal Large Diffusion Language Models for Thinking-Aware Image Editing and Generation Paper: https://t.co/IbsuR9m4tH Code:
github.com · Official Implementation of "MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation" - tyfeld/MMaDA-Parallel
@_akhaliq
AK
7 days
MMaDA-Parallel Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation
2
29
173
@NielsRogge
Niels Rogge
8 days
This is a phenomenal video by @jbhuang0604 explaining seminal papers in computer vision, including CLIP, SimCLR, and DINO v1/v2/v3, in 15 minutes. DINO is actually a brilliant idea; I found the decision to use 65k neurons in the output head pretty interesting.
14
125
1K
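For reference, a hedged sketch of the DINO-style objective the video covers: the student matches an EMA teacher's distribution over a large prototype head (the ~65k neurons mentioned), with teacher centering and sharpening to avoid collapse. Shapes and temperatures are illustrative.

```python
import torch
import torch.nn.functional as F

K = 65536                                       # size of the prototype output head
student_out = torch.randn(8, K)                 # student head logits
teacher_out = torch.randn(8, K)                 # EMA teacher head logits (no grad)
center = torch.zeros(K)                         # running center of teacher logits

t_student, t_teacher = 0.1, 0.04                # teacher is sharper than student
p_teacher = F.softmax((teacher_out - center) / t_teacher, dim=-1)
log_p_student = F.log_softmax(student_out / t_student, dim=-1)
loss = -(p_teacher * log_p_student).sum(dim=-1).mean()   # cross-entropy distillation

center = 0.9 * center + 0.1 * teacher_out.mean(dim=0)    # update the running center
```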
@SeanMcleish
Sean McLeish ✈️ NeurIPS
13 days
Looped latent reasoning models like TRM, HRM, Ouro and Huginn are great for reasoning, but they're inefficient to train at larger scales. We fix this by post-training regular language models into looped models, achieving higher accuracy on a per-training-FLOP basis. 📜1/7
10
65
385
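A hedged sketch of what "looping" means here, as I read the thread: apply one weight-tied block repeatedly to refine a latent state, buying extra compute per token without extra parameters. Not the authors' exact post-training recipe.

```python
import torch, torch.nn as nn

class LoopedBlock(nn.Module):
    def __init__(self, dim=256, heads=4, loops=4):
        super().__init__()
        self.loops = loops
        self.block = nn.TransformerEncoderLayer(dim, heads, 4 * dim, batch_first=True)

    def forward(self, h):                       # h: (batch, seq, dim)
        for _ in range(self.loops):             # reuse one set of weights
            h = self.block(h)                   # each pass = one refinement step
        return h

h = torch.randn(2, 16, 256)
print(LoopedBlock()(h).shape)                   # same shape, 4x the compute
```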
@jacobbamb
Jacob Bamberger
13 days
Flow Matching models often struggle to balance memorization and generalization. 😱 We set out to fix this — by using the geometry of the data manifold. Introducing Carré du Champ Flow Matching (CDCFM)🧑‍🎨🥖 — improving generalization without sacrificing sample quality.
11
63
436
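For context, the baseline conditional flow matching loss that such work builds on; the Carré du Champ geometry term itself is not reproduced here.

```python
import torch, torch.nn as nn

def cfm_loss(v, x0, x1):
    t = torch.rand(x0.size(0), 1)               # random time in [0, 1]
    xt = (1 - t) * x0 + t * x1                  # straight-line interpolation
    return (v(xt, t) - (x1 - x0)).pow(2).mean() # target: constant velocity x1 - x0

net = nn.Sequential(nn.Linear(3, 64), nn.SiLU(), nn.Linear(64, 2))
v = lambda x, t: net(torch.cat([x, t], dim=1))  # velocity field v(x, t)
loss = cfm_loss(v, torch.randn(16, 2), torch.randn(16, 2))
loss.backward()
```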
@leonklein26
Leon Klein
15 days
(1/n) Can diffusion models simulate molecular dynamics instead of generating independent samples? In our NeurIPS2025 paper, we train energy-based diffusion models that can do both: - Generate independent samples - Learn the underlying potential 𝑼 🧵👇 https://t.co/TSurVY3YEl
12
140
838
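A sketch of the energy-based parameterization the thread describes: a single scalar energy network U yields both a score (-∇U) for diffusion-style sampling and the learned potential itself. Time conditioning is omitted and all names are illustrative.

```python
import torch, torch.nn as nn

U = nn.Sequential(nn.Linear(3, 128), nn.SiLU(), nn.Linear(128, 1))  # scalar energy

def score(x):                                   # score(x) = -grad_x U(x)
    x = x.requires_grad_(True)
    return -torch.autograd.grad(U(x).sum(), x, create_graph=True)[0]

x = torch.randn(16, 3)                          # e.g. particle coordinates
print(score(x).shape, U(x).shape)               # (16, 3) and (16, 1)
```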
@rohanpaul_ai
Rohan Paul
18 days
TabTune makes tabular AI models easy to try, compare, and trust. It hides messy prep and gives one simple fit, predict, evaluate flow. Work on tables is messy because every model wants different preprocessing, training modes, and metrics. This paper's technique supports 7
5
4
14
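The appeal is the single fit/predict/evaluate loop. Here is that pattern in plain scikit-learn for comparison; TabTune's own API is not shown here and may differ.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = RandomForestClassifier().fit(X_tr, y_tr)   # fit
preds = model.predict(X_te)                        # predict
print(accuracy_score(y_te, preds))                 # evaluate
```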
@shumpeiMaxwell
Shumpei Takezaki
20 days
Diffusion Transformers with Representation Autoencoders https://t.co/tg1XG46YoI
speakerdeck.com · https://arxiv.org/abs/2510.11690
0
37
248
@ModelScope2022
ModelScope
21 days
🚀 Training 64K+ context LLMs on consumer GPUs? Now possible with Ulysses + Ring Attention! We’ve fused two sequence parallelism techniques in ModelScope SWIFT: ✅ Ulysses: Low-comm, head-split (but limited by # of attention heads) ✅ Ring Attention: Scales beyond head count
4
28
136
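A single-process sketch of the head-split idea behind Ulysses: after an all-to-all, each rank holds the full sequence but only a slice of the attention heads, which is why the parallel degree is capped by the head count. Purely illustrative; no real distributed collectives here.

```python
import torch

B, S, H, D = 2, 1024, 8, 64                     # batch, seq, heads, head dim
ranks = 4                                       # Ulysses degree; must divide H
q = torch.randn(B, S, H, D)

shards = q.chunk(ranks, dim=1)                  # each rank starts with S/ranks tokens
# simulate the all-to-all: gather the full sequence, keep H/ranks heads per rank
per_rank = [torch.cat([s.chunk(ranks, dim=2)[r] for s in shards], dim=1)
            for r in range(ranks)]
print(per_rank[0].shape)                        # (2, 1024, 2, 64): all tokens, fewer heads
```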
@Hesamation
ℏεsam
25 days
holy shit... Hugging Face cooked again! 🔥 they just dropped a free blog (BOOK) that covers the no-bs reality of building SOTA models. i haven't seen any lab/researcher go into the real decisions behind LLM research and its nuances. this is literally a gem. Syllabus: →
25
206
2K
@goyal__pramod
Pramod Goyal
26 days
If you are interested in diffusion models and wish to understand them in depth, this might be the best resource out there!
@JCJesseLai
Chieh-Hsin (Jesse) Lai ✈️ NeurIPS
27 days
Tired of going back to the original papers again and again? Our monograph is a systematic and fundamental recipe you can rely on! 📘 We're excited to release 《The Principles of Diffusion Models》— with @DrYangSong, @gimdong58085414, @mittu1204, and @StefanoErmon. It traces the core
7
52
681
@fly51fly
fly51fly
1 month
[CV] Accelerating Vision Transformers with Adaptive Patch Sizes R Choudhury, J Kim, J Park, E Yang... [CMU & KAIST] (2025) https://t.co/zsX7D5B30G
0
36
252
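A quick sketch of why adaptive patch sizes pay off: token count, and with it quadratic attention cost, scales as 1/p² in the patch size p. Illustrative only; the paper's adaptive selection mechanism is not shown.

```python
import torch, torch.nn as nn

img = torch.randn(1, 3, 224, 224)
for p in (8, 16, 32):                           # candidate patch sizes
    patchify = nn.Conv2d(3, 384, kernel_size=p, stride=p)
    tokens = patchify(img).flatten(2).transpose(1, 2)   # (1, N, 384)
    n = tokens.size(1)
    print(f"patch {p:2d}: {n:4d} tokens, attention cost ~ {n * n:,}")
```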