#multimodal X Hashtag

Explore tweets tagged as #multimodal

Akshay 🚀

@akshay_pachaar

6 days

if you're looking for a comprehensive guide to LLM finetuning, check this! a free 115-page book on arxiv, covering:
> fundamentals of LLM
> peft (lora, qlora, dora, hft)
> alignment methods (ppo, dpo, grpo)
> mixture of experts (MoE)
> 7-stage fine-tuning pipeline
> multimodal

38

214

953

Sumanth

@Sumanth_077

8 days

All-in-One RAG System! RAG-Anything is a unified framework with a multi-stage multimodal pipeline that extends traditional RAG architectures. 100% Open Source

20

244

1K

Lukias.

@lukias0

4 days

🧠 Google just teased Gemini 3.0
🔹 Enhanced reasoning with “Deep Think”
🔹 Multimodal: video, 3D, geospatial data
🔹 Larger context windows for longer conversations
🔹 Smarter code generation
Release expected Nov 12, 2025
#GoogleAI #Gemini3 #AI

5

1

18

Ansong Ni

@AnsongNi

5 days

Our team at FAIR is hiring PhD research interns for 2026 on the topics of multimodal multi-agent learning. If you are interested, feel free to DM me or directly apply using the link below! https://t.co/JrHoDAPDnP

2

32

187

Alex Prompter

@alex_prompter

13 hours

🔥 Holy shit... Apple just did something nobody saw coming. They just dropped Pico-Banana-400K, a 400,000-image dataset for text-guided image editing that might redefine multimodal training itself. Here’s the wild part: Unlike most “open” datasets that rely on synthetic

36

107

650

Ray Fernando

@RayFernando1337

6 days

This is the JPEG moment for AI. Optical compression doesn't just make context cheaper. It makes AI memory architectures viable. Training data bottlenecks? Solved.
- 200k pages/day on ONE GPU
- 33M pages/day on 20 nodes
- Every multimodal model is data-constrained. Not anymore.

111

724

6K

Rohan Paul

@rohanpaul_ai

7 days

👨‍🔧 GitHub: RAG-Anything: All-in-One RAG Framework
7.6k Stars ⭐️
All-in-One Multimodal Document Processing RAG system built on LightRAG. You can query documents containing interleaved text, visual diagrams, structured tables, and mathematical formulations through one interface.

15

137

968

Sidhartha

@SidharthaGarg

2 days

🚀 Absolutely thrilled to share that our team, royal_recruits, ranked 4th among ~86,000 registrations in the Amazon ML Challenge. The task: predict product prices from just text and images. This was a deep dive into multimodal learning. Here's a thread on how we built it. 🧵

12

2

43

Haider.

@slow_developer

8 days

Andrej Karpathy says today's agents aren't ready to work like real coworkers or interns. They lack intelligence, can't use computers, aren't multimodal, lack continual learning, and forget what you tell them. Fixing these gaps will take about a decade.

200

302

3K

PJ

@Prithvir12

6 days

If this Karpathy interview doesn't pop the AI bubble, nothing will. 10 brutal quotes:
1. LLMs don’t work yet. They don’t have enough intelligence, they’re not multimodal enough, they can’t use computers, and they don’t remember what you tell them. They’re cognitively lacking.

316

837

6K

Bhargavi Paranjape

@bvp22294

5 days

📢 PhD Students in GenAI/RL! Our team at FAIR is hiring a Research Intern for Summer 2026 to push the boundaries of multimodal multi-agent social interaction. Learn more and apply: https://t.co/7P66mnEY97

7

48

316

Asli Celikyilmaz

@real_asli

9 days

🚀 Exciting opportunity! We are hiring research interns (current PhD students) at @Meta FAIR to advance multi-agent, multimodal AI! Work on text, audio, images & more, collaborate with top mentors, and help shape the future of AI at scale. Apply:

2

43

220

Manish Kumar Shah

@manishkumar_dev

12 hours

Dreamina 4.0 — the next-generation multimodal AI design model. From text-to-image creation and smart editing to large-scale content generation — Dreamina brings every idea to life through natural language. Your creativity, amplified. ⚡ Prompt: Turn the figure in the image

76

54

154

Hirofumi Inaguma

@HirofumiInaguma

2 days

I was impacted by the FAIR layoff this time. I'm looking for a new position in speech, multimodal, 3D human motion, and social behavior modeling. Happy to chat about more details :)

16

42

247

Amey Sankhe

@Amey___14

5 days

🌍✈️ Meet my Multimodal Travel Assistant - an AI agent that makes trip planning smarter! 🚀
🗺️ Creates custom travel plans
🎙️ Talks with audio replies
🎨 Generates pop-art travel images
💻 Built with GPT & Gradio
#AI #MultimodalAI #GenerativeAI #TravelTech #GPT #Gradio

0

2

Alex Tong

@AlexanderTong7

5 days

#AITHYRA, Vienna's new Biomedical AI institute, is hiring Postdocs! Come work with us. Openings in:
🔹 Generative AI
🔹 Multimodal ML
🔹 Virology
🔹 Enzyme Function
Apply by Nov 20: https://t.co/8jNpkhdw1x
#PostDoc #AI #ML #Vienna #ScienceJobs

1

14

55

Koki Nagano

@luminohope

5 days

Our team at #NVIDIA Research is hiring summer 2026 interns in areas including Video Generative models, Controllable/Physically-grounded (3D/4D) GenAI, and human-robot/agent interaction (e.g., multimodal LLM). Please email me with a CV if interested.

12

38

374

Mamoona

@Content_VA

9 days

Top 10 ChatGPT Alternatives (2025)
1. Claude (Anthropic) – Smart, safe, great for long docs.
2. Google Gemini – Strong search + multimodal power.
3. Microsoft Copilot – Best for Office, writing & workflow.
4. Perplexity AI – Research + chat with real sources.
5. DeepSeek –

0

4

valeo.ai

@valeoai

8 days

Our recent research will be presented at #ICCV2025 @ICCVConference! We’ll present 5 papers about:
💡 self-supervised & representation learning
🌍 3D occupancy & multi-sensor perception
🧩 open-vocabulary segmentation
🧠 multimodal LLMs & explainability
https://t.co/Tg0Vx3oS94

1

7

19