Jiaqi Wang

@wjqdev

Followers: 777 · Following: 75 · Media: 11 · Statuses: 51

Research Scientist at Shanghai AI Laboratory

Hong Kong
Joined June 2020
@wjqdev
Jiaqi Wang
2 months
RT @HuggingPapers: Nvidia's got something new. UnifiedReward-Think is here: a multimodal CoT reward model for both visual understanding and…
0
41
0
@wjqdev
Jiaqi Wang
7 months
RT @ZhibingLi_6626: 🎉 Excited to introduce IDArb! 🎉 Our method can predict plausible and consistent geometry and PBR material for any num…
0
25
0
@wjqdev
Jiaqi Wang
7 months
Thanks so much for tweeting our work!
@_akhaliq
AK
7 months
InternLM-XComposer2.5-OmniLive. A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
0
1
20
@wjqdev
Jiaqi Wang
7 months
Many thanks for tweeting 👍
@AdinaYakup
Adina Yakup
7 months
InternLM-XComposer-2.5-OmniLive 🔥 a specialized generalist multimodal system for streaming video and audio interactions by @intern_lm. Model: ✨ Apache 2.0, but a form is required for a commercial license.
0
0
4
@wjqdev
Jiaqi Wang
7 months
🚀 We're excited to announce the release of InternLM-XComposer2.5-OmniLive (IXC2.5-OL), a comprehensive multimodal system designed for long-term streaming video and audio interactions. This fully open-sourced project delivers functionality similar to Gemini 2.0 Live Streaming and…
3
35
129
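Since the tweet above describes the release only at a high level, here is a minimal, self-contained Python sketch of what a "long-term streaming interaction" loop looks like in general. Every name here is hypothetical illustration, not the IXC2.5-OL API: the point is that perception runs continuously, observations are distilled into a persistent memory, and questions are answered from that memory rather than from one fixed-length context.

```python
import time
from typing import Dict, List

def summarize(frame: str, audio: str) -> str:
    """Toy stand-in for a vision/audio encoder. A real system would emit
    compact features; here we just join the raw descriptions."""
    return f"saw '{frame}', heard '{audio}'"

def retrieve(memory: List[Dict], question: str, k: int = 3) -> List[Dict]:
    """Toy retrieval: rank stored summaries by word overlap with the question."""
    words = set(question.lower().split())
    return sorted(
        memory,
        key=lambda m: -len(words & set(m["summary"].lower().split())),
    )[:k]

class StreamingMultimodalSystem:
    """Hypothetical sketch of a streaming loop (NOT the IXC2.5-OL API).
    Perception runs continuously and distills observations into a
    persistent memory; questions are answered from that memory instead
    of by re-encoding the entire raw stream."""

    def __init__(self) -> None:
        self.memory: List[Dict] = []

    def perceive(self, frame: str, audio: str) -> None:
        self.memory.append({"t": time.time(), "summary": summarize(frame, audio)})

    def answer(self, question: str) -> str:
        hits = retrieve(self.memory, question)
        context = "; ".join(m["summary"] for m in hits)
        return f"Based on memory [{context}]: {question}"

# Usage: feed a (mock) stream, then ask about something seen earlier.
system = StreamingMultimodalSystem()
system.perceive("a person opens a laptop", "keyboard clicks")
system.perceive("the person pours coffee", "liquid pouring")
print(system.answer("what did the person pour?"))
```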
@wjqdev
Jiaqi Wang
7 months
RT @liuziwei7: 😻 Fine-Grained Visual Attributes for GenAI 😻 #NeurIPS2024 FiVA is a fine-grained visual attributes dataset and a framework…
0
36
0
@wjqdev
Jiaqi Wang
9 months
We have released SAM2Long, a training-free enhancement to SAM 2 for long-term video segmentation. 🔥 Less error accumulation when facing occlusion and reappearance. ⚡️ A training-free memory tree maintains dynamic segmentation pathways, boosting resilience efficiently. 🤯 Significant improvements…
1
39
167
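The tweet names the core mechanism: a training-free memory tree over candidate segmentation pathways. The sketch below is a deliberately simplified illustration of that idea using beam-search-style pruning over toy scores; the function names and the scoring rule are assumptions for illustration, not SAM2Long's actual code.

```python
import random
from typing import List, Tuple

random.seed(0)

def propose_masks(state: float, n: int = 3) -> List[Tuple[float, float]]:
    """Toy stand-in for per-frame mask proposals. Returns (quality, new_state)
    pairs; a real tracker would run SAM 2 conditioned on this pathway's
    memory bank and score each candidate mask."""
    return [(random.random(), state + random.random()) for _ in range(n)]

def memory_tree_tracking(num_frames: int, beam_width: int = 3) -> float:
    """Simplified memory-tree idea: keep `beam_width` hypothesis pathways,
    expand each with candidate masks every frame, and prune back to the
    branches with the best cumulative score. beam_width=1 degenerates to
    the greedy single-pathway baseline that accumulates errors."""
    pathways = [(0.0, 0.0)]  # (cumulative_score, pathway_state)
    for _ in range(num_frames):
        candidates = [
            (cum + quality, new_state)
            for cum, state in pathways
            for quality, new_state in propose_masks(state)
        ]
        # Prune to the best branches: this pruned hypothesis set is the
        # "memory tree" that keeps recovery options alive after occlusion.
        pathways = sorted(candidates, reverse=True)[:beam_width]
    return pathways[0][0]

print(f"best pathway score over 50 frames: {memory_tree_tracking(50):.2f}")
```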
@wjqdev
Jiaqi Wang
1 year
RT @LyxTg: 🚀 Check out VideoVista, our comprehensive video-LMM evaluation benchmark! We've assessed 33 Video-LMMs across 27 tasks. Hi…
0
13
0
@wjqdev
Jiaqi Wang
1 year
RT @mayubo2333: Large Vision-Language Models (LVLMs) perform well on understanding single-page documents like DocVQA and ChartQA…
0
16
0
@wjqdev
Jiaqi Wang
1 year
RT @Norod78: Gave InternLM-XComposer a go; it recognized the activity correctly but could not count nor tell me what…
0
8
0
@wjqdev
Jiaqi Wang
1 year
RT @wjqdev: Thanks @iScienceLuvr for tweeting our work! 🚀 We have released InternLM-XComposer-2.5 (IXC-2.5), a versatile Large Vision Langu…
0
7
0
@wjqdev
Jiaqi Wang
1 year
Thanks @_akhaliq for tweeting our work! 🚀 We have released InternLM-XComposer-2.5 (IXC-2.5) on Hugging Face @huggingface, a versatile Large Vision Language Model (LVLM) supporting long-contextual input and output. 🌊 Supports 24K interleaved image-text contexts. 🛠️ Versatile…
@_akhaliq
AK
1 year
InternLM-XComposer-2.5. A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output. We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large vision-language model that supports long-contextual input and output. IXC-2.5 excels in various text-image…
1
18
75
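For readers who want to try the open-sourced model, a typical Hugging Face remote-code loading pattern looks like the sketch below. The repo id and the exact chat signature are assumptions based on common conventions for InternLM releases (flagged again in the comments); the official model card is the authoritative reference.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# NOTE: the repo id and the .chat() call below are assumptions based on
# common conventions for InternLM remote-code releases; treat the model
# card on Hugging Face as authoritative.
ckpt = "internlm/internlm-xcomposer2d5-7b"  # assumed repo id

model = AutoModel.from_pretrained(
    ckpt,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # the repo ships custom modeling code
).cuda().eval()
tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)

# Interleaved image-text input: the long-context design is what lets many
# images plus their surrounding text share a single 24K-token window.
query = "Compare the two frames and describe what changed."
images = ["frame_001.png", "frame_002.png"]  # placeholder local paths

with torch.no_grad():
    # Assumed signature: remote-code chat helpers typically take the
    # tokenizer, a query, and a list of image paths.
    response, _ = model.chat(tokenizer, query, images, do_sample=False)
print(response)
```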
@wjqdev
Jiaqi Wang
1 year
Thanks @iScienceLuvr for tweeting our work! 🚀 We have released InternLM-XComposer-2.5 (IXC-2.5), a versatile Large Vision Language Model (LVLM) supporting long-contextual input and output. 🌊 Supports 24K interleaved image-text contexts. 🛠️ Versatile applications: 🎞️ Video…
@iScienceLuvr
Tanishq Abraham is at ICML
1 year
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output. abs: model: New open vision-language model with long-context capabilities and ultra-high-resolution understanding…
0
7
33
@wjqdev
Jiaqi Wang
1 year
Thanks @arankomatsuzaki for tweeting our work! 🚀 We have released InternLM-XComposer-2.5 (IXC-2.5), a versatile Large Vision Language Model (LVLM) supporting long-contextual input and output. 🌊 Supports 24K interleaved image-text contexts. 🛠️ Versatile applications: 🎞️ Video…
@arankomatsuzaki
Aran Komatsuzaki
1 year
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output. - Excels in various text-image tasks with GPT-4V-level capabilities using merely a 7B LLM backend. - Open-sourced.
0
3
14