Prithiv Sakthi
@prithivMLmods
Followers: 348 · Following: 12K · Media: 244 · Statuses: 7K
computer vision | multimodal | @huggingface fellow 🤗
Alpha Centauri
Joined October 2022
LTX-2 Camera-Control with Dolly-in/out and Dolly-left/right LoRA demo is now available on Hugging Face, paired with ltx-2-19b-distilled-lora for fast inference. 🤗Try it now on @huggingface: https://t.co/stTJIoQcGd 🎥Dolly-In: https://t.co/F6RXq3YJSX
@Lightricks
6
9
96
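A minimal sketch of the LoRA-stacking pattern the demo above describes (a camera-control LoRA paired with ltx-2-19b-distilled-lora for few-step inference), assuming a diffusers-style LTX pipeline. The base checkpoint and both adapter repo ids below are placeholders, not the demo's actual weights, and prompt-based camera steering is an assumption.

```python
# Hedged sketch: stacking a camera-control LoRA with a distilled LoRA on an
# LTX pipeline in diffusers. Repo ids are placeholders; the real LTX-2 demo
# may use a different checkpoint or pipeline class.
import torch
from diffusers import LTXPipeline
from diffusers.utils import export_to_video

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",  # assumed base; swap in the LTX-2 checkpoint you use
    torch_dtype=torch.bfloat16,
).to("cuda")

# Load both adapters and activate them together.
pipe.load_lora_weights("<camera-control-lora-repo>", adapter_name="camera")
pipe.load_lora_weights("<ltx-2-19b-distilled-lora-repo>", adapter_name="distilled")
pipe.set_adapters(["camera", "distilled"], adapter_weights=[1.0, 1.0])

# Camera motion is steered through the prompt (dolly-in here, as an assumption).
video = pipe(
    prompt="dolly-in shot of a lighthouse at dusk, waves crashing",
    num_frames=121,                  # LTX expects 8k+1 frames
    num_inference_steps=8,           # few steps, leaning on the distilled LoRA
).frames[0]
export_to_video(video, "dolly_in.mp4", fps=24)
```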
We’re excited to introduce Pocket TTS: a 100M-parameter text-to-speech model with high-quality voice cloning that runs on your laptop—no GPU required. Open-source, lightweight, and incredibly fast. 🧵👇
55
276
2K
High-res restoration shouldn't be a struggle. @prithivMLmods nailed it with this new adapter for Qwen-Image-Edit-2511. It unblurs and upscales with insane visual consistency across edges and colors, all in just 4 steps. Appreciate the heavy lifting!
5
15
194
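A hedged sketch of what a 4-step restoration edit with a Qwen-Image-Edit LoRA could look like in diffusers. The adapter repo id is a hypothetical placeholder (the post does not name it), the base checkpoint shown is the known Qwen-Image-Edit release, and the 2511 variant may require a different pipeline class.

```python
# Hedged sketch: 4-step unblur/upscale edit with a restoration LoRA on
# Qwen-Image-Edit. The adapter repo id is a placeholder, not the real one.
import torch
from diffusers import QwenImageEditPipeline
from diffusers.utils import load_image

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit",          # assumed base; the 2511 checkpoint may differ
    torch_dtype=torch.bfloat16,
).to("cuda")
pipe.load_lora_weights("<qwen-image-edit-2511-restoration-lora>")  # placeholder id

image = load_image("blurry_photo.png")
restored = pipe(
    image=image,
    prompt="restore this photo: remove blur, sharpen details, high resolution",
    num_inference_steps=4,           # the adapter targets 4-step inference
    true_cfg_scale=1.0,              # effectively no CFG for few-step editing
).images[0]
restored.save("restored.png")
```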
Introducing Cowork: Claude Code for the rest of your work. Cowork lets you complete non-technical tasks much like how developers use Claude Code.
2K
8K
80K
Beautiful app!
LTX-2 Camera-Control with Dolly-in/out and Dolly-left/right LoRA demo is now available on Hugging Face, paired with ltx-2-19b-distilled-lora for fast inference. 🤗Try it now on @huggingface: https://t.co/stTJIoQcGd 🎥Dolly-In: https://t.co/F6RXq3YJSX
@Lightricks
0
16
142
Tencent just released HY-Video-PRFL on Hugging Face. It turns video generation models into latent reward models for efficient preference optimization: 1.4x faster training, a 56% motion-quality boost, and 67 GB VRAM for 14B models.
2
9
45
prithivMLmods/Qwen-Image-Edit-2511-Polaroid-Photo https://t.co/jQoCM4XtC5 (see the link for prompts)
0
3
5
yes, we extended VL to VL embedding. we believe that in the future embedding models should be multimodal by default. practical stuff, worth a try!
🚀 Introducing Qwen3-VL-Embedding and Qwen3-VL-Reranker – advancing the state of the art in multimodal retrieval and cross-modal understanding! ✨ Highlights: ✅ Built upon the robust Qwen3-VL foundation model ✅ Processes text, images, screenshots, videos, and mixed modality
10
30
351
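For context, a toy sketch of the embed, retrieve, then rerank pattern these models target. The encoder and the "reranker" below are random stand-ins, not the actual Qwen3-VL-Embedding / Qwen3-VL-Reranker loading code; see the model cards for that.

```python
# Toy embed -> retrieve -> rerank pipeline with stand-in components.
import numpy as np

def encode(items: list[str]) -> np.ndarray:
    # Stand-in encoder: in practice this is the multimodal embedding model
    # applied to text, images, screenshots, or video frames.
    rng = np.random.default_rng(sum(map(len, items)))
    vecs = rng.normal(size=(len(items), 8))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

docs = ["architecture diagram (image)", "API reference page (screenshot)", "demo video"]
query = "where is the retry logic configured?"

# Stage 1: dense retrieval by cosine similarity over normalized embeddings.
doc_vecs, query_vec = encode(docs), encode([query])[0]
scores = doc_vecs @ query_vec
top_k = np.argsort(-scores)[:2]

# Stage 2: a reranker rescores only the shortlisted candidates; here the
# placeholder just reuses the same similarity.
reranked = sorted(top_k, key=lambda i: -float(doc_vecs[i] @ query_vec))
print([docs[i] for i in reranked])
```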
We’re officially public. (HKEX: 02513) To everyone who has supported GLM, built with it, tested it, or simply followed along. Thank you.❤️ This moment belongs to our community as much as it belongs to us. To celebrate, we’re opening a 48-hour community challenge.❤️🔥❤️🔥❤️🔥
48
392
325
3K
🚀 Introducing Qwen3-VL-Embedding and Qwen3-VL-Reranker – advancing the state of the art in multimodal retrieval and cross-modal understanding! ✨ Highlights: ✅ Built upon the robust Qwen3-VL foundation model ✅ Processes text, images, screenshots, videos, and mixed modality
45
300
2K
LTX-2 is trending! 🔥 Users are raving about its incredible motion consistency and prompt sensitivity. To keep the momentum going, we’ve just launched the LTX-2 Distilled demo on ModelScope! 🚀 https://t.co/E9qQxYJPcb Experience pro-level video generation with optimized speed.
0
6
36
nvidia is using deepseek and kimi k2 thinking to showcase the performance of their new chips, great taste 💚
15
16
289