
Yingshan Chang
@_Yingshan
224 Followers · 425 Following · 6 Media · 67 Statuses
Joined September 2021
Can a Transformer count inductively? ▶️ Yes, but different schemas for positional embeddings are required for different forms of counting. Can we treat counting as a primitive operation of Transformer computation? ▶️ No, because it requires a non-trivial computation budget and…
RT @anilkseth: 1/3 @geoffreyhinton once said that the future depends on some graduate student being suspicious of everything he says (via @…
RT @jimin__sun: Just landed in Miami to attend #EMNLP2024 🐊 I’ll be presenting the poster of our “Tools fail” paper on Wednesday Nov 13th…
RT @SRDewan18: Our method breaks down the Mutual Information into the Redundancy (R), Synergy (S), and Uniqueness (U) of the conditioning t…
RT @SRDewan18: Diffusion models have advanced significantly, but how well do we understand their workings? How do textual tokens impact ou…
RT @SoYeonTiffMin: We introduce Situated Instruction Following (SIF), to appear in ECCV 2024! There is inherent underspecification in instr…
RT @aran_nayebi: I’m thrilled to be joining @CarnegieMellon’s Machine Learning Department (@mldcmu) as an Assistant Professor this Fall!…
RT @XiaochuangHan: ❓Are there any unique advantages of diffusion-based LMs over autoregressive LMs? ❓Can we scale and instruction-tune diff…
RT @PPezeshkpour: LLMs excel in math. Introducing a new benchmark, we observe: They struggle with creative and many-step questions (even wi…
RT @percyliang: We should call models like Llama 3, Mixtral, etc. “open-weight models”, not “open-source models”. For a model to be open-so…
RT @NaihaoDeng: So excited to see this fascinating work by my labmate Artem🤩 This is an inspiration for everyone who loves animals 🤩 https…
arxiv.org
Similar to humans, animals make extensive use of verbal and non-verbal forms of communication, including a large range of audio signals. In this paper, we address dog vocalizations and explore the...
RT @pratyushmaini: 1/We've nailed a framework to reliably detect if an LLM was trained on your dataset: LLM Dataset Inference. After over…
RT @fly51fly: [LG] How Far Can Transformers Reason? The Locality Barrier and Inductive Scratchpad. E Abbe, S Bengio, A Lotfi, C Sandon, O Sa…
RT @R_Graph_Gallery: Looking for the best color palette? 😔 Check the tool we just created with @joseph_barbier: 🎨 2500+ palettes 🐍 Pytho…
RT @ecekt2: We are looking for more reviewers for the Cognitive Modeling and Computational Linguistics Workshop (CMCL @ ACL 2024). The dead…
cmclorg.github.io
The 2025 Cognitive Modeling and Computational Linguistics workshop.
RT @zicokolter: I'm thrilled to share that I will become the next Director of the Machine Learning Department at Carnegie Mellon. MLD is a…
cs.cmu.edu
Zico Kolter will head the Machine Learning Department, effective June 15.
RT @rohanpaul_ai: 📌 This paper investigates the dramatic breakdown of state-of-the-art LLMs' reasoning capabilities when confronted with a…
RT @NaihaoDeng: Excited to share that our paper "Tables as Texts or Images: Evaluating the Table Reasoning Ability of LLMs and MLLMs" has b…