Jinghan Zhang
@jinghan23
118 Followers · 153 Following · 5 Media · 53 Statuses
CSE PhD student @hkust in her second year, advised by @junxian_he. Machine learning, NLP. Bluesky: https://t.co/ECxlKtKTxz
Hong Kong
Joined August 2022
Thrilled to introduce our latest work "Compression Represents Intelligence Linearly" 📜✨ on the intriguing relationship between compression efficiency and intelligence in large language models (LLMs) 🤖. Diving deep into 30 public LLMs from various organizations 🌍, we examined
5
4
21
Check out our new work on how models benefit from self-play and world modeling to explore better!
Want to get an LLM agent to succeed in an OOD environment? We tackle the hardest case with SPA (Self-Play Agent). No extra data, tools, or stronger models. Pure self-play. We first internalize a world model via Self-Play, then we learn how to win by RL. Like a child playing
0
0
1
🌟 Honored to share our #ICML25 paper as a co-first author! Excited to continue exploring model merging in cross-modality scenarios through the lens of interpretability.
Sharing another #ICML25 paper: “Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging”! (1/5) We use model merging to enhance VLMs' reasoning by integrating math-focused LLMs—bringing textual reasoning into multi-modal models. Surprisingly, this
0
0
6
Excited to share our upcoming ICML25 paper on understanding spatial reasoning difficulties in VLMs! A nice collaboration with an amazing team. Check out the great insights below from the lead authors! Thread here 🧵⬇️
🚀🔥 Thrilled to announce our ICML25 paper: "Why Is Spatial Reasoning Hard for VLMs? An Attention Mechanism Perspective on Focus Areas"! We dive into the core reasons behind spatial reasoning difficulties for Vision-Language Models from an attention mechanism view. 🌍🔍 Paper:
1
0
6
[Long Tweet Ahead] Faculty Interview Tips & Common Questions: 🧘♀️0. Firstly, do not be nervous - Almost everything can be prepared in advance:) - Be grateful for everyone's time. - Think of it as an opportunity to share your research with others -- exciting, right? - Technical
14
76
508
🔔🎄Christmas Gift for Multimodal Reasoning: Introducing M-STaR 🎁 (1/6) How can we dive deeper to help Large Multimodal Models (LMMs) evolve into better reasoners? Announcing M-STaR (Project Page: https://t.co/bvn5qatH1N): a self-evolving training framework for multimodal
3
38
97
Mitigating racial bias from LLMs is a lot easier than removing it from humans! Can’t believe this happened at the best AI conference @NeurIPSConf We have ethical reviews for authors, but missed it for invited speakers? 😡
181
803
4K
Arriving at #EMNLP2024 🏝️ Come check out our poster on November 12, 2024 at 11 AM in the Riverfront Hall! Would love to chat about NLP4Science🧬, drugs and proteins💊, and LLM agents and reasoning🤖 !
[1/4] RSA is accepted by #EMNLP2024 main track 🥳 - Enhance any protein understanding model with lightning-fast retrieval. - 373x faster than MSA, with on-the-fly computation and comparable performance. Preprint link: https://t.co/DxE9IjqRfE Code: https://t.co/PD1ozjleMQ
0
4
20
In Philadelphia for #COLM2024! Excited to chat about long-context, multimodal, reasoning, and everything related to LMs! Come check out our work on Wednesday morning, session 5, # 17. Also open to visiting opportunities and 2025 summer internships anywhere in the world!
0
6
21
OpenAI Strawberry (o1) is out! We are finally seeing the paradigm of inference-time scaling popularized and deployed in production. As Sutton said in the Bitter Lesson, there're only 2 techniques that scale indefinitely with compute: learning & search. It's time to shift focus to
136
1K
6K
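Sutton's two indefinitely-scaling techniques, learning and search, can be illustrated with a toy best-of-N search: spend more inference-time compute by sampling N candidates and keeping the highest-scoring one. Everything here (the generator, the scores) is a hypothetical stand-in for illustration, not o1's actual method.

```python
import random

def generate(prompt: str, rng: random.Random) -> tuple[str, float]:
    """Stand-in for one LLM sample: returns a candidate and a quality score."""
    quality = rng.random()
    return f"{prompt} -> candidate(q={quality:.2f})", quality

def best_of_n(prompt: str, n: int, seed: int = 0) -> tuple[str, float]:
    """Inference-time scaling via search: sample n candidates, keep the best."""
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=lambda c: c[1])

# With a fixed seed, the n=64 pool contains the n=1 candidate,
# so the best score can only improve as n grows.
_, q1 = best_of_n("2+2=?", n=1)
_, q64 = best_of_n("2+2=?", n=64)
```

The monotone improvement with N is the essence of "search scales with compute": no new learning, just more samples and a selection rule.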
Thank you @AdapterHub for implementing our #NeurIPS method ( https://t.co/hW3Sn4IAVF) in your latest update! 🎉 Great to see our work being applied for practical advancements. Check out their work! #MachineLearning #AdapterMerging #ModelMerging
arxiv.org
As an efficient alternative to conventional full finetuning, parameter-efficient finetuning (PEFT) is becoming the prevailing method to adapt pretrained language models. In PEFT, a lightweight...
🎉Adapters 1.0 is here!🚀 Our open-source library for modular and parameter-efficient fine-tuning got a major upgrade! v1.0 is packed with new features (ReFT, Adapter Merging, QLoRA, ...), new models & improvements! Blog: https://t.co/Evp8kQG1je Highlights in the thread! 🧵👇
0
2
11
LLM Merging Competition🚨 very cool! Check out our work ( https://t.co/hW3Sn4IAVF) on parameter-efficient module merging for insights! We effectively perform detoxification via negation on Alpaca, based on Llama-7b, in the last experiment of this paper.

🚨 Model Merging competition @NeurIPSConf!🚀 Can you revolutionize model selection and merging?Let's create the best LLMs!🧠✨ 💻Come for science 💰Stay for $8K 💬Discord: https://t.co/eGgyBifqeq 🔗Sign up: https://t.co/afTxLA1jvi Sponsors: @huggingface @SakanaAILabs @arcee_ai
0
0
6
Interested in these scaling laws 🥳
📢 Excited to finally be releasing my NeurIPS 2024 submission! Is Chinchilla universal? No! We find that: 1. language model scaling laws depend on data complexity 2. gzip effectively predicts scaling properties from training data As compressibility 📉, data preference 📈. 🧵⬇️
0
0
2
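The gzip finding above can be reproduced in miniature: measure a corpus's compressibility as compressed size over raw size. Repetitive text compresses far better than random bytes, and that gap is the signal the thread says predicts scaling properties. A minimal sketch:

```python
import gzip
import os

def gzip_ratio(data: bytes) -> float:
    """Compressed size over raw size; lower means more compressible."""
    return len(gzip.compress(data)) / len(data)

repetitive = b"the cat sat on the mat. " * 200   # highly redundant "corpus"
random_ish = os.urandom(len(repetitive))          # incompressible baseline
low = gzip_ratio(repetitive)    # well under 0.1
high = gzip_ratio(random_ish)   # near (or above) 1.0, gzip adds overhead
```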
Refer to our leaderboard for more details. https://t.co/t20oy6zSiV
Downstream scores can be noisy. If you wonder about Llama 3's compression perf in this figure, we have tested the BPC: Llama3 8B: 0.427, best at its size, comparable to Yi-34B Llama3 70B: 0.359, way ahead of all the models here Details at
0
0
1
github.com
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] - hkust-nlp/llm-compression-intelligence
Compression Represents Intelligence Linearly LLMs' intelligence – reflected by average benchmark scores – almost linearly correlates with their ability to compress external text corpora repo: https://t.co/mncXEQFGaT abs: https://t.co/ZBzskGZZmZ
0
10
46
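The BPC numbers quoted in this thread (e.g. 0.427 for Llama3 8B) are bits-per-character: the model's total negative log-likelihood of a held-out corpus, converted from nats to bits, divided by the corpus's character count. A toy computation from hypothetical per-token log-probabilities:

```python
import math

def bits_per_character(token_logprobs: list[float], num_chars: int) -> float:
    """BPC = total code length in bits / characters.
    token_logprobs are natural-log probabilities, so divide by ln(2)
    to convert nats to bits."""
    total_nats = -sum(token_logprobs)
    return total_nats / (math.log(2) * num_chars)

# hypothetical per-token log-probs for a 20-character string
logprobs = [-1.2, -0.4, -2.0, -0.7, -1.1]
bpc = bits_per_character(logprobs, num_chars=20)  # ≈ 0.39
```

Lower BPC means the model assigns higher probability to the text, i.e. compresses it into fewer bits per character, which is the quantity the paper correlates with benchmark scores.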