Jiaang Li
@jiaangli
Followers
115
Following
197
Media
11
Statuses
56
Ph.D. student at @BelongieLab @AiCentreDK @ELLISforEurope @DIKU_Institut | Natural Language Processing - Computer Vision
Denmark
Joined November 2021
New Preprint 🎉: "Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory". We review recent works on culture in VLMs and argue for deeper grounding in cultural theory to enable more inclusive evaluations. Paper 🔗: https://t.co/9AoRHTFG58
1
10
88
Great collaboration with @yfyuan775 @Wenyan62 @maliannejadi @daniel_hers, Anders Søgaard, @licwu @Wenxuan__Zhang @pliang279 @ydeng_dandy @SergeBelongie! 🔗 More here:
Project Page: https://t.co/kKxjqX83Vs
Code: https://t.co/wFB9N5gz92
Dataset:
huggingface.co
0
0
0
📊Our experiments demonstrate that even lightweight VLMs, when augmented with culturally relevant retrievals, outperform their non-augmented counterparts and even surpass the next larger model tier, achieving at least a 3.2% improvement in cVQA and 6.2% in cIC.
1
0
0
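For readers curious what "augmented with culturally relevant retrievals" can look like in practice, here is a minimal sketch of a single retrieval-augmented cVQA step. It is illustrative only: `embed_image` and `vlm_answer` are hypothetical callables, and the prompt format is an assumption rather than the paper's pipeline.

```python
# Illustrative only: a minimal retrieval-augmented cVQA step.
# `embed_image` and `vlm_answer` are hypothetical callables, not part of the
# RAVENEA codebase.
import numpy as np

def retrieve_culture_docs(image_emb: np.ndarray, doc_embs: np.ndarray,
                          docs: list[str], k: int = 3) -> list[str]:
    """Return the k Wikipedia passages most similar to the image embedding."""
    sims = doc_embs @ image_emb / (
        np.linalg.norm(doc_embs, axis=1) * np.linalg.norm(image_emb) + 1e-8
    )
    return [docs[i] for i in np.argsort(-sims)[:k]]

def answer_with_retrieval(image, question: str, docs: list[str],
                          doc_embs: np.ndarray, embed_image, vlm_answer) -> str:
    """Prepend culturally relevant context to the question before querying the VLM."""
    context = "\n".join(retrieve_culture_docs(embed_image(image), doc_embs, docs))
    prompt = (f"Context (from Wikipedia):\n{context}\n\n"
              f"Question: {question}\nAnswer briefly:")
    return vlm_answer(image, prompt)
```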
🛠 Culture-Aware Contrastive Learning
We propose Culture-Aware Contrastive Learning (CAC), a supervised learning framework compatible with both CLIP and SigLIP architectures. Fine-tuning with CAC can help models better capture culturally significant content.
1
0
0
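The tweet does not spell out the CAC objective, so the snippet below only sketches one plausible shape for a culture-aware contrastive loss: a CLIP-style symmetric InfoNCE loss with per-pair weights that up-weight culturally salient image-text pairs. The weighting scheme and all names are assumptions, not the paper's formulation.

```python
# Hypothetical culture-weighted CLIP-style contrastive loss; the actual CAC
# objective may differ.
import torch
import torch.nn.functional as F

def culture_weighted_clip_loss(img_emb: torch.Tensor,
                               txt_emb: torch.Tensor,
                               culture_weight: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """img_emb, txt_emb: (B, D) L2-normalized embeddings.
    culture_weight: (B,) per-pair weights, e.g. > 1 for culturally salient pairs."""
    logits = img_emb @ txt_emb.t() / temperature               # (B, B) similarities
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    loss_i2t = F.cross_entropy(logits, targets, reduction="none")
    loss_t2i = F.cross_entropy(logits.t(), targets, reduction="none")
    per_pair = 0.5 * (loss_i2t + loss_t2i)
    return (culture_weight * per_pair).mean()                  # up-weight cultural pairs
```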
📚 Dataset Construction
RAVENEA integrates 1,800+ images, 2,000+ culture-related questions, 500+ human captions, and 10,000+ human-ranked Wikipedia documents to support two key tasks:
🎯 Culture-focused Visual Question Answering (cVQA)
📝 Culture-informed Image Captioning (cIC)
1
0
0
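Purely as an illustration of how a record in a benchmark like this might be organized (field names are hypothetical; the released schema may differ), one image can carry the material for both tasks:

```python
# Hypothetical record layout for a culture-grounded benchmark; field names are
# illustrative, not the released RAVENEA schema.
from dataclasses import dataclass, field

@dataclass
class CultureRecord:
    image_path: str                        # image of a culturally significant scene
    questions: list[dict]                  # cVQA: [{"question": ..., "answer": ...}]
    captions: list[str]                    # cIC: human-written, culture-informed captions
    ranked_docs: list[str] = field(default_factory=list)  # Wikipedia docs, best first

record = CultureRecord(
    image_path="images/example.jpg",
    questions=[{"question": "Which festival is depicted?", "answer": "…"}],
    captions=["A procession during a local harvest festival."],
    ranked_docs=["Wikipedia: Harvest festival …"],
)
```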
🚀New Preprint Alert 🚀 Can Multimodal Retrieval Enhance Cultural Awareness in Vision-Language Models? Excited to introduce RAVENEA, a new benchmark aimed at evaluating cultural understanding in VLMs through RAG.
1
5
8
🙋🏻‍♂️ I'm actively recruiting for multiple fully-funded positions in my group at SUTD!
Topic: NLP / LLM / Multimodal
Openings:
1 x Postdoc: start date is flexible
1 x PhD: earliest batch is Spring 2026
2-3 x Visiting students: fully funded for 6-12 months. Open to Bachelor /
2
18
91
Postdoc opening for research on Vision-Language Modeling at MBZUAI with me and @thamar_solorio. 1 year position with possibility for extension. Please share and get in touch if you're interested!
0
3
15
SeaLLMs-Audio: Large Audio-Language Models for Southeast Asia 🎧 📢 Excited to share SeaLLMs-Audio, the multimodal (audio) extension of the SeaLLMs family. Following the release of SeaLLMs-v3 last year, we've focused on expanding to audio/speech capabilities, addressing the
0
13
34
I have an opening for a fully-funded 6-month visiting PhD student at SMU.
Time: August 2025 - March 2026
Eligibility: PhD students from universities in Europe, North/South America, South-East Asia.
Topic: NLP/LLM
Email me for more details if you are interested~
4
28
176
Forget just thinking in words. 🚀 New Era of Multimodal Reasoning🚨 🔍 Imagine While Reasoning in Space with MVoT Multimodal Visualization-of-Thought (MVoT) revolutionizes reasoning by generating visual "thoughts" that transform how AI thinks, reasons, and explains itself.
14
169
747
I will present "Do Vision and Language Models Share Concepts? A Vector Space Alignment Study" in person at #EMNLP2024 😋 Feel free to drop by!
⏰ Nov 12th (Tue), 16:00, In-Person Oral Session C
1
7
52
🍗🍗 I will present FoodieQA in person at #EMNLP2024 😋😋 Looking forward to meeting old and new friends! Feel free to drop by! (and have some snacks)
⏰ Nov 13th (Wed), 16:00, In-Person Poster Session E (Riverfront Hall)
I'm also on the job market and would be happy to chat :)
2
12
63
📢📢The article is now in print: https://t.co/s1ILCb77mK
direct.mit.edu
Abstract. Large-scale pretrained language models (LMs) are said to “lack the ability to connect utterances to the world” (Bender and Koller, 2020), because they do not have “mental models of the...
Do Vision and Language Models Share Concepts? 🤔👀🧠 We present an empirical evaluation and find that language models partially converge towards representations isomorphic to those of vision models. 📝: https://t.co/y6UDT9O0MP 🧑💻: https://t.co/Tm6dVKxaY5 🧵(1/8)
0
2
17
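As a rough illustration of the kind of alignment analysis behind that finding (not necessarily the paper's exact protocol), one can fit a linear map from language-model concept embeddings to vision-model embeddings of the same concepts and check nearest-neighbour retrieval on held-out pairs:

```python
# Illustrative vector-space alignment check over paired concept embeddings:
# lm_emb[i] comes from a language model, vis_emb[i] from a vision model.
import numpy as np

def alignment_retrieval_accuracy(lm_emb: np.ndarray, vis_emb: np.ndarray,
                                 n_train: int) -> float:
    """Fit a least-squares linear map on the first n_train pairs, then report
    top-1 retrieval accuracy of mapped LM vectors against held-out vision vectors."""
    W, *_ = np.linalg.lstsq(lm_emb[:n_train], vis_emb[:n_train], rcond=None)
    mapped = lm_emb[n_train:] @ W                      # project LM space into vision space
    targets = vis_emb[n_train:]
    mapped /= np.linalg.norm(mapped, axis=1, keepdims=True) + 1e-8
    targets = targets / (np.linalg.norm(targets, axis=1, keepdims=True) + 1e-8)
    sims = mapped @ targets.T                          # cosine similarity matrix
    return float((sims.argmax(axis=1) == np.arange(len(mapped))).mean())
```

Better-than-chance accuracy on held-out concepts is one way to operationalize the "partial convergence" the paper reports.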
We welcome applications for a PhD position in CV/ML (fine-grained analysis of multimodal data, 2D/3D generative models, misinformation detection, self-supervised learning) @DIKU_Institut @AiCentreDK
Apply through the ELLIS portal 💻 Deadline: 15-Nov-2024 🗓️
The #ELLISPhD application portal is now open! Apply to top #AI labs & supervisors in Europe with a single application, and choose from different areas & tracks. The call for applications: https://t.co/og6xzr14tj Deadline: 15 November 2024 #PhD #PhDProgram #MachineLearning #ML
0
6
28
📢📢 Happy to share that our paper 🍜🍤"FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture" is accepted to EMNLP main! Dataset and code are all released! 🍗🍗🍗 https://t.co/zUraNFYOtN!
https://t.co/g4x5kuKnCD Kudos to all my lovely coauthors!
huggingface.co
8
15
133
LoQT accepted at @NeurIPSConf #NeurIPS 🎉 Thanks to @mabeto5p @SergeBelongie @mjkastoryano and @vesteinns for the collaboration! Links to preprint and code ⬇️
Ever wanted to train your own 13B Llama2 model from scratch on a 24GB GPU? Or fine-tune one without compromising performance compared to full training? 🦙 You now can, with LoQT: Low Rank Adapters for Quantized Training! https://t.co/doCJybAxP1 1/4
2
4
24
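A heavily simplified sketch of the core idea as described in the thread: keep the base weights quantized and frozen, and route the trainable parameters through low-rank factors so a large model fits in modest GPU memory. Everything below is a placeholder illustration, not the released LoQT implementation, and the actual method involves additional machinery described in the preprint.

```python
# Simplified illustration of low-rank adapters over quantized, frozen base
# weights; not the actual LoQT implementation.
import torch
import torch.nn as nn

class QuantizedLowRankLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int = 16):
        super().__init__()
        w = torch.randn(out_features, in_features) * 0.02
        scale = w.abs().max() / 127.0
        # frozen base weight stored as int8 with a single scale factor
        self.register_buffer("w_q", torch.round(w / scale).to(torch.int8))
        self.register_buffer("scale", scale)
        # only these low-rank factors receive gradients
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.w_q.float() * self.scale          # dequantize on the fly
        return x @ (w + self.B @ self.A).t()       # frozen base + low-rank update
```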
📢📣Happy to share our new benchmark paper: ‘Unlocking Markets: A Multilingual Benchmark to Cross-Market Question Answering’ accepted to #EMNLP main! Thanks to my amazing collaborators @ydeng_dandy, Anders Søgaard, @maliannejadi ❤️ Looking forward to presenting in Miami🏖️🏝️
2
10
64
Our paper on understanding variability in text-to-image models was accepted at #EMNLP2024 main track! Lots of thanks to my collaborators @crystina_z @yaolu_nlp @Wenyan62 @Ulienida and mentors @lintool Pontus @ferhanture. Check out
w1kp.com
Project page for the W1KP paper.
1
14
25
Will be presenting "Understanding Retrieval Robustness for Retrieval-Augmented Image Captioning" at:
Poster: In-Person Session 2, Aug 12, 2pm
Oral: Aug 13, multimodal session, 4:45pm
Feel free to drop by 👋 if you are interested!
1
1
12