
renjie pi
@RenjiePi
Followers: 215
Following: 562
Media: 15
Statuses: 98
PhD candidate at HKUST | LLM, MLLM, Data-centric AI | Apple Scholar 2024
Joined May 2020
🚀 Introducing Personalized Visual Instruction Tuning (PVIT)! Can your MLLM recognize you? We propose a novel formulation and a data construction framework to create MLLMs that conduct personalized dialogues. 📄 Paper: 💻 Code:
2
29
97
RT @JiachengYe15: 📢 Update: Announcing Dream's next-phase development. - Dream-Coder 7B: A fully open diffusion LLM for code delivering s…
0
22
0
RT @_zhihuixie: 🚀 Thrilled to announce Dream-Coder 7B — the most powerful open diffusion code LLM to date.
0
34
0
RT @SterZhang: Excited to witness a new breakthrough in linking cues across multi-image, which shows performance boost in our VLM2-Bench!…
0
3
0
RT @SterZhang: My earlier work #VLM2Bench called for clearer principles on *when* language aids vision. Our new work **MindCube**: *First M…
0
4
0
RT @RuiYang70669025: Excited to share that EmbodiedBench was selected for an Oral at ICML 2025! We recently added results for new models (…
0
22
0
RT @YinyaHuang: 🤖⚛️ Can AI truly see Physics? Test your model with the newly released SeePhys Benchmark! 🚀 🖼️ Covering 2,000 vision-text mul…
0
16
0
RT @shizhediao: Thrilled to share my first project at NVIDIA! ✨ Today’s language models are pre-trained on vast and chaotic Internet texts…
0
55
0
RT @shujin_wu: 🐇 Introducing Alice, our most recent work on advancing weak-to-strong generalization! Instead of students passively absorbing…
0
32
0
RT @JiachengYe15: 🚀 Excited to announce Dream 7B (Diffusion reasoning model): the most powerful open diffusion large language model to date.…
0
205
0
RT @RickyRDWang: 🚀 Introducing MA-LoT Theorem Framework: An open-source multi-agent framework utilizing the Long Chain-of-Thought to boost…
0
9
0
RT @ZhijiangG: 🚀 Exciting to see how recent advancements like OpenAI’s O1/O3 & DeepSeek’s R1 are pushing the boundaries! Check out our late…
0
62
0
RT @SterZhang: 🚀 Introducing VLM²-Bench! A simple yet essential ability that we use in daily life. But when tackling vision-centric tasks…
0
47
0
RT @lockonlvange: Introducing CodeI/O, a systematic way to condense diverse reasoning patterns via code input-out…
0
54
0
RT @rui4research: 😆 Excited to share our latest work on LLM Pruning 🔥 🚀 Surpass llama-3.2-1B in MMLU with 1000x less cost. ✅ Enable flexible mo…
0
8
0
Very interesting work. We did an early exploration that is quite relevant: a project called DetGPT, which also uses an agentic workflow to enable reasoning-based detection. At the time, I did not realize it was a type of agentic workflow.
Introducing Agentic Object Detection! Given a text prompt like “unripe strawberries” or “Kellogg’s branded cereal” and an image, we use an agentic workflow to reason at length and detect the specified objects. No need to label any training data. Watch the video for details.
0
1
6
RT @li_chengzu: Forget just thinking in words. 🚀 New Era of Multimodal Reasoning 🚨 🔍 Imagine While Reasoning in Space with MVoT. Multimodal…
0
168
0
RT @qiushi_sun: 🎉 Introducing our latest work on GUI Agents: "OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synt…
0
42
0
RT @HaokunLin: 🔥 Welcome everyone to our Oral Presentation 'DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized…
0
9
0