
Jiaqi Wang (@wjqdev)
Research Scientist at Shanghai AI Laboratory · Hong Kong · Joined June 2020
777 Followers · 75 Following · 11 Media · 51 Statuses
RT @HuggingPapers: Nvidia's got something new. UnifiedReward-Think is here: a multimodal CoT reward model for both visual understanding and…
0 replies · 41 retweets · 0 likes
RT @ZhibingLi_6626: Excited to introduce IDArb! Our method can predict plausible and consistent geometry and PBR material for any num…
0 replies · 25 retweets · 0 likes
Many thanks for your tweet!
InternLM-XComposer-2.5-OmniLive: a specialized generalist multimodal system for streaming video and audio interactions by @intern_lm. Model: Apache 2.0, but a form is required for a commercial license.
0 replies · 0 retweets · 4 likes
RT @liuziwei7: Fine-Grained Visual Attributes for GenAI. #NeurIPS2024: FiVA is a fine-grained visual attributes dataset and a framework…
0 replies · 36 retweets · 0 likes
RT @mayubo2333: Large Vision-Language Models (LVLMs) perform well on understanding single-page documents like DocVQA and ChartQA. …
0 replies · 16 retweets · 0 likes
RT @wjqdev: Thanks @iScienceLuvr for tweeting our work! We have released InternLM-XComposer-2.5 (IXC-2.5), a versatile Large Vision Langu…
0 replies · 7 retweets · 0 likes
Thanks @_akhaliq for tweeting our work! We have released InternLM-XComposer-2.5 (IXC-2.5) on Hugging Face @huggingface, a versatile Large Vision Language Model (LVLM) supporting long-contextual input and output. Supports 24K interleaved image-text contexts. Versatile…
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output. We present InternLM-XComposer-2.5 (IXC-2.5), a versatile large vision-language model that supports long-contextual input and output. IXC-2.5 excels in various text-image…
1 reply · 18 retweets · 75 likes
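The tweet above announces the IXC-2.5 checkpoint on Hugging Face. Below is a minimal sketch of how such a release is typically loaded with the transformers library, assuming the repo id internlm/internlm-xcomposer2d5-7b and that the repo ships custom modeling code; both are assumptions inferred from the announcement, not details it confirms.

```python
# Minimal sketch: loading an IXC-2.5 checkpoint from Hugging Face.
# The repo id below is an assumption based on the announcement above;
# check the model card for the exact id and usage before relying on this.
import torch
from transformers import AutoModel, AutoTokenizer

repo = "internlm/internlm-xcomposer2d5-7b"  # assumed repo id

# Repos that ship custom modeling code require trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # half precision to fit the 7B backend on one GPU
    trust_remote_code=True,
).eval()
```

The long-context multimodal chat interface (24K interleaved image-text contexts) would live in the repo's custom code rather than a standard transformers pipeline, so the model card is the reference for its exact call signature.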
Thanks @iScienceLuvr for tweeting our work! We have released InternLM-XComposer-2.5 (IXC-2.5), a versatile Large Vision Language Model (LVLM) supporting long-contextual input and output. Supports 24K interleaved image-text contexts. Versatile applications: video…
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output. abs: model: New open vision-language model with long-context capabilities, ultra-high-resolution understanding, …
0 replies · 7 retweets · 33 likes
Thanks @arankomatsuzaki for tweeting our work! We have released InternLM-XComposer-2.5 (IXC-2.5), a versatile Large Vision Language Model (LVLM) supporting long-contextual input and output. Supports 24K interleaved image-text contexts. Versatile applications: video…
InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output. Excels in various text-image tasks with GPT-4V-level capabilities using merely a 7B LLM backend. Open-sourced.
0 replies · 3 retweets · 14 likes