Prithiv Sakthi
@prithivMLmods
Followers: 348 · Following: 12K · Media: 244 · Statuses: 7K
computer vision | multimodal | @huggingface fellow 🤗
Alpha Centauri
Joined October 2022
LTX-2 Camera-Control with Dolly-in/out and Dolly-left/right LoRA demo is now available on Hugging Face, paired with ltx-2-19b-distilled-lora for fast inference. 🤗Try it now on @huggingface: https://t.co/stTJIoQcGd 🎥Dolly-In: https://t.co/F6RXq3YJSX
@Lightricks
6
9
96
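A minimal sketch of the LoRA-stacking pattern the demo above describes (a camera-control LoRA paired with ltx-2-19b-distilled-lora for few-step inference), assuming a diffusers-style LTX pipeline. The base checkpoint and both adapter repo ids below are placeholders, not the demo's actual weights, and prompt-based camera steering is an assumption.

```python
# Hedged sketch: stacking a camera-control LoRA with a distilled LoRA on an
# LTX pipeline in diffusers. Repo ids are placeholders; the real LTX-2 demo
# may use a different checkpoint or pipeline class.
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",  # assumed base; swap in the LTX-2 checkpoint you use
    torch_dtype=torch.bfloat16,
).to("cuda")

# Load both adapters and activate them together.
pipe.load_lora_weights("<camera-control-lora-repo>", adapter_name="camera")
pipe.load_lora_weights("<ltx-2-19b-distilled-lora-repo>", adapter_name="distilled")
pipe.set_adapters(["camera", "distilled"], adapter_weights=[1.0, 1.0])

# Camera motion is steered through the prompt (dolly-in here, as an assumption).
video = pipe(
    prompt="dolly-in shot of a lighthouse at dusk, waves crashing",
    num_frames=121,                  # LTX expects 8k+1 frames
    num_inference_steps=8,           # few steps, leaning on the distilled LoRA
).frames[0]
export_to_video(video, "dolly_in.mp4", fps=24)
```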
We’re excited to introduce Pocket TTS: a 100M-parameter text-to-speech model with high-quality voice cloning that runs on your laptop—no GPU required. Open-source, lightweight, and incredibly fast. 🧵👇
55
276
2K
High-res restoration shouldn't be a struggle. @prithivMLmods nailed it with this new adapter for Qwen-Image-Edit-2511. It unblurs and upscales with insane visual consistency across edges and colors, all in just 4 steps. Appreciate the heavy lifting!
5
15
194
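A hedged sketch of what a 4-step restoration edit with a Qwen-Image-Edit LoRA could look like in diffusers. The adapter repo id is a hypothetical placeholder (the post does not name it), the base checkpoint shown is the known Qwen-Image-Edit release, and the 2511 variant may require a different pipeline class.

```python
# Hedged sketch: 4-step unblur/upscale edit with a restoration LoRA on
# Qwen-Image-Edit. The adapter repo id is a placeholder, not the real one.
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit",          # assumed base; the 2511 checkpoint may differ
    torch_dtype=torch.bfloat16,
).to("cuda")
pipe.load_lora_weights("<qwen-image-edit-2511-restoration-lora>")  # placeholder id

image = load_image("blurry_photo.png")
restored = pipe(
    image=image,
    prompt="restore this photo: remove blur, sharpen details, high resolution",
    num_inference_steps=4,           # the adapter targets 4-step inference
    true_cfg_scale=1.0,              # effectively no CFG for few-step editing
).images[0]
restored.save("restored.png")
```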
Introducing Cowork: Claude Code for the rest of your work. Cowork lets you complete non-technical tasks much like how developers use Claude Code.
2K
8K
80K
Beautiful app!
LTX-2 Camera-Control with Dolly-in/out and Dolly-left/right LoRA demo is now available on Hugging Face, paired with ltx-2-19b-distilled-lora for fast inference. 🤗Try it now on @huggingface: https://t.co/stTJIoQcGd 🎥Dolly-In: https://t.co/F6RXq3YJSX
@Lightricks
0
16
142
Tencent just released HY-Video-PRFL on Hugging Face. It turns video generation models into latent reward models for efficient preference optimization: 1.4x faster training, a 56% motion-quality boost, and 67 GB VRAM for 14B models.
2
9
45
prithivMLmods/Qwen-Image-Edit-2511-Polaroid-Photo https://t.co/jQoCM4XtC5 (see the link for prompts)
0
3
5
yes, we extended VL to VL embedding. we believe that in the future embedding models should be multimodal by default. practical stuff, worth a try!
🚀 Introducing Qwen3-VL-Embedding and Qwen3-VL-Reranker – advancing the state of the art in multimodal retrieval and cross-modal understanding! ✨ Highlights: ✅ Built upon the robust Qwen3-VL foundation model ✅ Processes text, images, screenshots, videos, and mixed modality
10
30
351
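For context, a toy sketch of the embed, retrieve, then rerank pattern these models target. The encoder and the "reranker" below are random stand-ins, not the actual Qwen3-VL-Embedding / Qwen3-VL-Reranker loading code; see the model cards for that.

```python
# Toy embed -> retrieve -> rerank pipeline with stand-in components.
import numpy as np

def encode(items: list[str]) -> np.ndarray:
    # Stand-in encoder: in practice this is the multimodal embedding model
    # applied to text, images, screenshots, or video frames.
    rng = np.random.default_rng(sum(map(len, items)))
    vecs = rng.normal(size=(len(items), 8))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

docs = ["architecture diagram (image)", "API reference page (screenshot)", "demo video"]
query = "where is the retry logic configured?"

# Stage 1: dense retrieval by cosine similarity over normalized embeddings.
doc_vecs, query_vec = encode(docs), encode([query])[0]
scores = doc_vecs @ query_vec
top_k = np.argsort(-scores)[:2]

# Stage 2: a reranker rescores only the shortlisted candidates; here the
# placeholder just reuses the same similarity.
reranked = sorted(top_k, key=lambda i: -float(doc_vecs[i] @ query_vec))
print([docs[i] for i in reranked])
```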
We’re officially public. (HKEX: 02513) To everyone who has supported GLM, built with it, tested it, or simply followed along. Thank you.❤️ This moment belongs to our community as much as it belongs to us. To celebrate, we’re opening a 48-hour community challenge.❤️🔥❤️🔥❤️🔥
48
392
325
3K
🚀 Introducing Qwen3-VL-Embedding and Qwen3-VL-Reranker – advancing the state of the art in multimodal retrieval and cross-modal understanding! ✨ Highlights: ✅ Built upon the robust Qwen3-VL foundation model ✅ Processes text, images, screenshots, videos, and mixed modality
45
300
2K
LTX-2 is trending! 🔥 Users are raving about its incredible motion consistency and prompt sensitivity. To keep the momentum going, we’ve just launched the LTX-2 Distilled demo on ModelScope! 🚀 https://t.co/E9qQxYJPcb Experience pro-level video generation with optimized speed.
0
6
36
nvidia is using deepseek and kimi k2 thinking to showcase the performance of their new chips, great taste 💚
15
16
289