Zineng Tang Profile
Zineng Tang

@ZinengTang

Followers: 1K · Following: 141 · Media: 24 · Statuses: 131

PhD in @Berkeley_ai and @BerkeleyNLP. Previously @UNCNLP and @MSFTResearch.

Chapel Hill, NC
Joined February 2019
@ZinengTang
Zineng Tang
1 month
Excited to share our new work! DOVE 🕊️: a dynamic vision encoder that adapts token count to image complexity. Fewer tokens, same fidelity, and it outperforms fixed-length AE tokenizers on classification & VLM tasks! Arxiv: Web: #AI #CV
1 reply · 21 reposts · 113 likes
@ZinengTang
Zineng Tang
1 month
RT @ZinengTang: Excited to share our new work! DOVE 🕊️: a dynamic vision encoder that adapts token count to image complexity. Fewer tokens,…
0 replies · 21 reposts · 0 likes
@ZinengTang
Zineng Tang
1 month
Big thanks to my undergrad intern Lingjun for delivering such impressive work, and to Rudy for the thoughtful co-advising!
0 replies · 0 reposts · 2 likes
@ZinengTang
Zineng Tang
1 month
🔥 DOVE uses 68% fewer tokens yet achieves better FID than VQGAN/TiTok and gains of 10–12 points on VQA/ImageNet/CIFAR. It performs significantly better on classification, probing, and VLM tasks. DOVE also shows emerging properties: PCA heatmaps reveal sharper segmentation.
1 reply · 0 reposts · 2 likes
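The DOVE tweets above describe a vision encoder that allocates a variable number of tokens per image depending on how complex the image is. As a rough illustration only (not the DOVE implementation; the module names, the linear complexity head, and the norm-based token selection are all assumptions), a dynamic tokenizer of that flavor could look like this:

```python
import torch
import torch.nn as nn

class DynamicVisionTokenizer(nn.Module):
    """Toy tokenizer: more tokens for complex images, fewer for simple ones."""

    def __init__(self, dim=768, min_tokens=16, max_tokens=256):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=16, stride=16)
        self.complexity_head = nn.Linear(dim, 1)  # predicts how "busy" the image is
        self.min_tokens, self.max_tokens = min_tokens, max_tokens

    def forward(self, images):
        # images: (B, 3, H, W) -> patch features: (B, N, dim)
        feats = self.patch_embed(images).flatten(2).transpose(1, 2)
        # One complexity score per image, squashed to [0, 1]
        score = torch.sigmoid(self.complexity_head(feats.mean(dim=1)))  # (B, 1)
        # Map complexity to a per-image token budget
        budget = (self.min_tokens
                  + (self.max_tokens - self.min_tokens) * score).round().long()
        tokens = []
        for f, k in zip(feats, budget.squeeze(1).tolist()):
            k = min(k, f.size(0))                 # never ask for more patches than exist
            idx = f.norm(dim=-1).topk(k).indices  # keep the highest-energy patches
            tokens.append(f[idx])
        return tokens  # list of (k_i, dim) tensors, one per image

# Example (hypothetical): two images, each getting its own token count.
# tok = DynamicVisionTokenizer()
# outs = tok(torch.randn(2, 3, 256, 256))
```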
@ZinengTang
Zineng Tang
4 months
Also thanks to @LongTonyLian, Seun Eisape, @XDWang101, @roeiherzig, @Yalatweets, @alsuhr, and @trevordarrell for their great efforts!
1 reply · 0 reposts · 5 likes
@ZinengTang
Zineng Tang
4 months
TULIP achieves state-of-the-art performance across multiple vision and vision-language benchmarks. It significantly improves zero-shot classification on ImageNet-1K, enhances fine-grained object recognition, and boosts multimodal reasoning scores. Compared to existing methods,
1 reply · 1 repost · 5 likes
@ZinengTang
Zineng Tang
4 months
Despite the success of contrastive image-text models like CLIP, they struggle with vision-centric tasks requiring high-fidelity understanding. We introduce TULIP, a novel model integrating generative data augmentation, enhanced contrastive learning, and reconstruction.
1 reply · 0 reposts · 7 likes
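The TULIP tweet above describes combining contrastive image-text learning with reconstruction. A minimal sketch of how such a joint objective can be wired, assuming a CLIP-style symmetric InfoNCE term plus a pixel-level MSE reconstruction term (the function name, weighting, and exact losses are illustrative assumptions, not TULIP's actual objective):

```python
import torch
import torch.nn.functional as F

def contrastive_plus_recon_loss(img_emb, txt_emb, recon, images,
                                temperature=0.07, recon_weight=1.0):
    """CLIP-style symmetric InfoNCE plus a pixel reconstruction term (sketch)."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature              # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    contrastive = (F.cross_entropy(logits, targets)
                   + F.cross_entropy(logits.t(), targets)) / 2
    recon_loss = F.mse_loss(recon, images)                    # high-fidelity pixel term
    return contrastive + recon_weight * recon_loss
```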
@ZinengTang
Zineng Tang
4 months
We are thrilled to announce TULIP! 🌷 A state-of-the-art vision-language encoder coupled with a generative model for stronger representation learning.
7 replies · 68 reposts · 302 likes
@ZinengTang
Zineng Tang
10 months
RT @ChengleiSi: Automating AI research is exciting! But can LLMs actually produce novel, expert-level research ideas? After a year-long st…
0 replies · 770 reposts · 0 likes
@ZinengTang
Zineng Tang
11 months
RT @yzy_ai: We announced Phi 3.5 series today! 1️⃣ Multilingual Mini 3.8B: 2️⃣ MoE 16x3.8B (active 6.6B): https://…
0 replies · 6 reposts · 0 likes
@ZinengTang
Zineng Tang
1 year
CoDi-2 is selected as a #CVPR2024 Highlight. Come join us in today's poster session, Arch 4A-E #314, from 5 pm to 6:30 pm! @yzy_ai @nlpyang @ChenguangZhu2 @mohitban47
@ZinengTang
Zineng Tang
2 years
🔥 Excited to introduce CoDi-2! It follows complex multimodal-interleaved in-context instructions to generate any modality (text, vision, audio) in a zero/few-shot interactive way! @yzy_ai @nlpyang @ChenguangZhu2 @mohitban47 🧵👇
0 replies · 10 reposts · 18 likes
@ZinengTang
Zineng Tang
1 year
RT @mohitban47: Honored and grateful to be named as one of the @UNC permanent distinguished professorships 🙏 100% of the credit goes to m…
0 replies · 24 reposts · 0 likes
@ZinengTang
Zineng Tang
1 year
Excited to share that CoDi-2 is accepted to @CVPR. In this work, we show that alignment of multimodal inputs to language unlocks ICL and few-shot prompting ability for multimodal generation. #CVPR2024 @yzy_ai @nlpyang @ChenguangZhu2 @mohitban47
@ZinengTang
Zineng Tang
2 years
🔥 Excited to introduce CoDi-2! It follows complex multimodal-interleaved in-context instructions to generate any modality (text, vision, audio) in a zero/few-shot interactive way! @yzy_ai @nlpyang @ChenguangZhu2 @mohitban47 🧵👇
2 replies · 10 reposts · 54 likes
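The CoDi-2 tweets above describe following interleaved multimodal in-context instructions. Purely as an illustration of what such an interleaved prompt could look like (the dict schema and the `model.generate` call below are hypothetical, not CoDi-2's actual API):

```python
# Interleaved multimodal prompt: text, image, and audio segments in one sequence.
prompt = [
    {"type": "text",  "content": "Here is a photo of a cat:"},
    {"type": "image", "content": "cat.png"},
    {"type": "text",  "content": "and the sound it makes:"},
    {"type": "audio", "content": "meow.wav"},
    {"type": "text",  "content": "Now produce a dog photo and its matching sound."},
]
# outputs = model.generate(prompt, output_modalities=["image", "audio"])  # hypothetical API
```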
@ZinengTang
Zineng Tang
2 years
RT @mohitban47: Had a great time speaking at @indoml_sym & meeting students+faculty+researchers (overall, kudos to organizers+volunteers on….
0 replies · 11 reposts · 0 likes
@ZinengTang
Zineng Tang
2 years
RT @pan_jiayipan: Generative model that operates across versatile modalities is where future is, which makes this work super interesting.
0 replies · 5 reposts · 0 likes
@ZinengTang
Zineng Tang
2 years
RT @mohitban47: 🚨 CoDi-2: In-Context, Interleaved, Interactive Any-to-Any Generation🚨. Interactive MLLM that can follow "multimodal-interle….
0 replies · 14 reposts · 0 likes
@ZinengTang
Zineng Tang
2 years
RT @jmin__cho: CoDi-2 presents multimodal 'generation' with interleaved in-context learning! Another exciting work by @ZinengTang @yzy_ai…
0 replies · 8 reposts · 0 likes
@ZinengTang
Zineng Tang
2 years
RT @yzy_ai: Recall CoDi any-to-any generation? Now CoDi-2 🔥 any-to-any MLLM is here! It is (1) 1st ever vision-language-audio zero/few-shot g…
0 replies · 6 reposts · 0 likes
@ZinengTang
Zineng Tang
2 years
RT @DrJimFan: Codi-2: an interleaved, multimodal LLM that supports arbitrary audio, image, and video mixture in I/O. You can give instructi….
0 replies · 130 reposts · 0 likes