Jinghan Zhang
@jinghan23
118 Followers · 153 Following · 5 Media · 53 Statuses
CSE PhD student @hkust in her second year, advised by @junxian_he. Machine learning, NLP. Bluesky: https://t.co/ECxlKtKTxz
Hong Kong
Joined August 2022
Thrilled to introduce our latest work "Compression Represents Intelligence Linearly" 📜✨ on the intriguing relationship between compression efficiency and intelligence in large language models (LLMs) 🤖. Diving deep into 30 public LLMs from various organizations 🌍, we examined
5
4
21
Check out our new work on how models benefit from self-play and world modeling to explore better!
Want to get an LLM agent to succeed in an OOD environment? We tackle the hardest case with SPA (Self-Play Agent). No extra data, tools, or stronger models. Pure self-play. We first internalize a world model via Self-Play, then we learn how to win by RL. Like a child playing
0
0
1
🌟 Honored to share our #ICML25 paper as a co-first author! Excited to continue exploring model merging in cross-modality scenarios through the lens of interpretability.
Sharing another #ICML25 paper: “Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging”! (1/5) We use model merging to enhance VLMs' reasoning by integrating math-focused LLMs—bringing textual reasoning into multi-modal models. Surprisingly, this
0
0
6
Excited to share our upcoming ICML25 paper on understanding spatial reasoning difficulties in VLMs! A nice collaboration with an amazing team. Check out the great insights below from the lead authors! Thread here 🧵⬇️
🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:
1
0
6
[Long Tweet Ahead] Faculty Interview Tips & Common Questions: 🧘♀️0. Firstly, do not be nervous - Almost everything can be prepared in advance:) - Be grateful for everyone's time. - Think of it as an opportunity to share your research with others -- exciting, right? - Technical
14
76
508
🔔🎄Christmas Gift for Multimodal Reasoning: Introducing M-STaR 🎁 (1/6) How can we dive deeper to help Large Multimodal Models (LMMs) evolve into better reasoners? Announcing M-STaR (Project Page: https://t.co/bvn5qatH1N): a self-evolving training framework for multimodal
3
38
97
Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡
181
803
4K
Arriving at #EMNLP2024 🏝️ Come check out our poster on November 12, 2024 at 11 AM in the Riverfront Hall! Would love to chat about NLP4Science🧬, drugs and proteins💊, and LLM agents and reasoning🤖 !
[1/4] RSA is accepted by #EMNLP2024 main track 🥳 - Enhance any protein understanding model with lightning-fast retrieval. - 373x faster than MSA, with on-the-fly computation and comparable performance. Preprint link: https://t.co/DxE9IjqRfE Code: https://t.co/PD1ozjleMQ
0
4
20
In Philadelphia for #COLM2024! Excited to chat about long-context, multimodal, reasoning, and everything related to LMs! Come check out our work on Wednesday morning, session 5, # 17. Also open to visiting opportunities and 2025 summer internships anywhere in the world!
0
6
21
OpenAI Strawberry (o1) is out! We are finally seeing the paradigm of inference-time scaling popularized and deployed in production. As Sutton said in the Bitter Lesson, there're only 2 techniques that scale indefinitely with compute: learning & search. It's time to shift focus to
136
1K
6K
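Sutton's two indefinitely-scaling techniques, learning and search, can be illustrated with a toy best-of-N search: spend more inference-time compute by sampling N candidates and keeping the highest-scoring one. Everything here (the generator, the scores) is a hypothetical stand-in for illustration, not o1's actual method.

```python
import random

def generate(prompt: str, rng: random.Random) -> tuple[str, float]:
    """Stand-in for one LLM sample: returns a candidate and a quality score."""
    quality = rng.random()
    return f"{prompt} -> candidate(q={quality:.2f})", quality

def best_of_n(prompt: str, n: int, seed: int = 0) -> tuple[str, float]:
    """Inference-time scaling via search: sample n candidates, keep the best."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: c[1])

# With a fixed seed, the n=64 pool contains the n=1 candidate,
# so the best score can only improve as n grows.
_, q1 = best_of_n("2+2=?", n=1)
_, q64 = best_of_n("2+2=?", n=64)
```

The monotone improvement with N is the essence of "search scales with compute": no new learning, just more samples and a selection rule.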
Thank you @AdapterHub for implementing our #NeurIPS method ( https://t.co/hW3Sn4IAVF) in your latest update! 🎉 Great to see our work being applied for practical advancements. Check out their work! #MachineLearning #AdapterMerging #ModelMerging
arxiv.org
As an efficient alternative to conventional full finetuning, parameter-efficient finetuning (PEFT) is becoming the prevailing method to adapt pretrained language models. In PEFT, a lightweight...
🎉Adapters 1.0 is here!🚀 Our open-source library for modular and parameter-efficient fine-tuning got a major upgrade! v1.0 is packed with new features (ReFT, Adapter Merging, QLoRA, ...), new models & improvements! Blog: https://t.co/Evp8kQG1je Highlights in the thread! 🧵👇
0
2
11
LLM Merging Competition🚨 very cool! Check out our work ( https://t.co/hW3Sn4IAVF) on parameter-efficient module merging for insights! We effectively perform detoxification via negation on Alpaca, based on Llama-7b, in the last experiment of this paper.

🚨 Model Merging competition @NeurIPSConf!🚀 Can you revolutionize model selection and merging?Let's create the best LLMs!🧠✨ 💻Come for science 💰Stay for $8K 💬Discord: https://t.co/eGgyBifqeq 🔗Sign up: https://t.co/afTxLA1jvi Sponsors: @huggingface @SakanaAILabs @arcee_ai
0
0
6
Interested in these scaling laws 🥳
📢 Excited to finally be releasing my NeurIPS 2024 submission! Is Chinchilla universal? No! We find that: 1. language model scaling laws depend on data complexity 2. gzip effectively predicts scaling properties from training data As compressibility 📉, data preference 📈. 🧵⬇️
0
0
2
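The gzip finding above can be reproduced in miniature: measure a corpus's compressibility as compressed size over raw size. Repetitive text compresses far better than random bytes, and that gap is the signal the thread says predicts scaling properties. A minimal sketch:

```python
import gzip
import os

def gzip_ratio(data: bytes) -> float:
    """Compressed size over raw size; lower means more compressible."""
    return len(gzip.compress(data)) / len(data)

repetitive = b"the cat sat on the mat. " * 200   # highly redundant "corpus"
random_ish = os.urandom(len(repetitive))          # incompressible baseline
low = gzip_ratio(repetitive)    # well under 0.1
high = gzip_ratio(random_ish)   # near (or above) 1.0, gzip adds overhead
```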
Refer to our leaderboard for more details. https://t.co/t20oy6zSiV
Downstream scores can be noisy. If you wonder about Llama 3's compression perf in this figure, we have tested the BPC: Llama3 8B: 0.427, best at its size, comparable to Yi-34B Llama3 70B: 0.359, way ahead of all the models here Details at
0
0
1
github.com
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] - hkust-nlp/llm-compression-intelligence
Compression Represents Intelligence Linearly LLMs' intelligence – reflected by average benchmark scores – almost linearly correlates with their ability to compress external text corpora repo: https://t.co/mncXEQFGaT abs: https://t.co/ZBzskGZZmZ
0
10
46
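The BPC numbers quoted in this thread (e.g. 0.427 for Llama3 8B) are bits-per-character: the model's total negative log-likelihood of a held-out corpus, converted from nats to bits, divided by the corpus's character count. A toy computation from hypothetical per-token log-probabilities:

```python
import math

def bits_per_character(token_logprobs: list[float], num_chars: int) -> float:
    """BPC = total code length in bits / characters.
    token_logprobs are natural-log probabilities, so divide by ln(2)
    to convert nats to bits."""
    total_nats = -sum(token_logprobs)
    return total_nats / (math.log(2) * num_chars)

# hypothetical per-token log-probs for a 20-character string
logprobs = [-1.2, -0.4, -2.0, -0.7, -1.1]
bpc = bits_per_character(logprobs, num_chars=20)  # ≈ 0.39
```

Lower BPC means the model assigns higher probability to the text, i.e. compresses it into fewer bits per character, which is the quantity the paper correlates with benchmark scores.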