wing.nus

@wing_nus

Followers 594 · Following 400 · Media 115 · Statuses 528

Web IR / NLP Group at the National University of Singapore

Singapore
Joined July 2012
@wing_nus
wing.nus
14 days
πŸ“’ Excited to share our accepted EMNLP 2025 papers from the NUS WING group! πŸŽ‰ See you in Suzhou! #EMNLP2025
0
0
7
@wing_nus
wing.nus
15 days
High-quality knowledge can be "distilled"! We used GPT-4o to generate a knowledge base for a smaller Llama3.1-8B. This "distillation" significantly boosted its performance, enabling efficient, high-quality narration. 🧡[4/n]
1
0
0
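A minimal sketch of the distillation setup described in the post above, assuming hypothetical prompts and helper names; the OpenAI and Hugging Face calls are standard, but the knowledge-base format is not the paper's.

```python
# Sketch of the knowledge "distillation" step: a strong teacher model (GPT-4o)
# writes down reusable domain knowledge once, and the smaller Llama-3.1-8B
# student conditions on it at inference time. Prompts and the knowledge-base
# format here are illustrative, not the paper's.
from openai import OpenAI
from transformers import pipeline

teacher = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def build_knowledge_base(domain: str) -> str:
    """Ask the teacher model to write down domain knowledge as plain text."""
    resp = teacher.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": f"List key rules of thumb for interpreting {domain} data, one per line.",
        }],
    )
    return resp.choices[0].message.content

def narrate_with_knowledge(knowledge: str, table_summary: str) -> str:
    """Let the smaller student model narrate, conditioned on the distilled knowledge."""
    student = pipeline("text-generation", model="meta-llama/Llama-3.1-8B-Instruct")
    prompt = (
        f"Domain knowledge:\n{knowledge}\n\n"
        f"Data summary:\n{table_summary}\n\n"
        "Write a short analytical narrative:"
    )
    return student(prompt, max_new_tokens=300)[0]["generated_text"]
```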
@wing_nus
wing.nus
15 days
Key finding: Hierarchy is critical. Our ablation study shows that narrative quality progressively increases as we add each level of analysisβ€”from entity-only to the full KAHAN system. More structure = better insights. 🧡[3/n]
1
0
0
@wing_nus
wing.nus
15 days
KAHAN's 3-stage process: Entity Analysis: asks domain-specific questions ("Nasdaq trend?") and generates code for metrics. Insight Synthesis: builds insights hierarchically from individual entities to system-wide. Narrative Generation: turns the insights into a coherent report. 🧡[2/n]
1
0
0
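A minimal sketch of the three-stage pipeline described in the post above; the function names, prompts, and metrics are illustrative placeholders rather than the paper's implementation.

```python
# Illustrative skeleton of a KAHAN-style pipeline: entity-level analysis,
# hierarchical insight synthesis, then narrative generation. `ask_llm` is a
# hypothetical wrapper around any chat-completion API; metrics are placeholders.
from typing import Callable
import pandas as pd

def entity_analysis(df: pd.DataFrame, entity_col: str, ask_llm: Callable[[str], str]) -> dict:
    """Stage 1: ask domain-specific questions per entity and compute simple metrics."""
    findings = {}
    for entity, group in df.groupby(entity_col):
        question = ask_llm(f"What is the most informative question to ask about {entity}?")
        metrics = {
            "mean": group.select_dtypes("number").mean().to_dict(),
            "trend": group.select_dtypes("number").diff().mean().to_dict(),
        }
        findings[entity] = {"question": question, "metrics": metrics}
    return findings

def insight_synthesis(findings: dict, ask_llm: Callable[[str], str]) -> str:
    """Stage 2: build insights bottom-up, from individual entities to the whole system."""
    per_entity = [
        ask_llm(f"Summarize the key insight for {e}: {f['metrics']}")
        for e, f in findings.items()
    ]
    return ask_llm("Combine these entity-level insights into system-wide insights:\n" + "\n".join(per_entity))

def narrative_generation(insights: str, ask_llm: Callable[[str], str]) -> str:
    """Stage 3: turn the hierarchical insights into a coherent report."""
    return ask_llm(f"Write a coherent analytical report based on these insights:\n{insights}")
```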
@wing_nus
wing.nus
15 days
Thrilled to share that our paper, "KAHAN: Knowledge-Augmented Hierarchical Analysis and Narration," is accepted at #EMNLP2025 Findings! In this work, we built a framework that uses LLMs as domain experts to hierarchically extract insights from tables. 🧡[1/n]
1
0
2
@wing_nus
wing.nus
22 days
@mki028 @AiBarid @knmnyn Welcome to our poster presentation at #EMNLP2025! We will present our poster at Hall C on Nov 7 at 12:30-13:30. See you there! 🧡[6/n] πŸ“„ arXiv: https://t.co/wviGuF8ZQA ⌨️ Repo:
arxiv.org
Cross-lingual consistency should be considered to assess cross-lingual transferability, maintain the factuality of the model knowledge across languages, and preserve the parity of language model...
0
0
1
@wing_nus
wing.nus
22 days
Lastly, we also tried several methods to alleviate the inconsistency bottleneck. Among them, we found that a training objective that promotes cross-lingual alignment gives the best improvement and most effectively alleviates the bottleneck. 🧡[5/n]
1
0
0
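One generic form such a cross-lingual alignment objective can take, shown as a sketch under our own assumptions; this is not necessarily the exact objective used in the paper.

```python
# Illustrative cross-lingual alignment term: penalize the distance between pooled
# representations of translation pairs so knowledge is stored more
# language-agnostically. Not the paper's exact objective.
import torch
import torch.nn.functional as F

def alignment_loss(hidden_src: torch.Tensor, hidden_tgt: torch.Tensor) -> torch.Tensor:
    """hidden_src / hidden_tgt: (batch, hidden) pooled states of parallel sentences."""
    src = F.normalize(hidden_src, dim=-1)
    tgt = F.normalize(hidden_tgt, dim=-1)
    return (1.0 - (src * tgt).sum(dim=-1)).mean()  # mean (1 - cosine similarity)

# Typically combined with the usual language-modeling loss:
# total_loss = lm_loss + lambda_align * alignment_loss(h_pivot, h_other)
```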
@wing_nus
wing.nus
22 days
We could see that larger models don't give substantial consistency improvements, and we explored why. Examining cross-lingual consistency across layers, we discovered that there is no monotonic improvement with depth, which could explain this. 🧡[4/n]
1
0
0
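A minimal sketch of the kind of layer-wise check described above, assuming a placeholder multilingual model and mean pooling; the paper's actual consistency metric may differ.

```python
# Sketch of a layer-wise consistency check: compare hidden states for the same
# factual query written in two languages, layer by layer, to see whether
# similarity improves monotonically with depth. Model choice and mean pooling
# are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

name = "xlm-roberta-base"  # placeholder multilingual model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True).eval()

def layerwise_similarity(query_a: str, query_b: str) -> list[float]:
    with torch.no_grad():
        ha = model(**tok(query_a, return_tensors="pt")).hidden_states
        hb = model(**tok(query_b, return_tensors="pt")).hidden_states
    sims = []
    for la, lb in zip(ha, hb):
        va, vb = la.mean(dim=1).squeeze(0), lb.mean(dim=1).squeeze(0)  # mean-pool tokens
        sims.append(torch.cosine_similarity(va, vb, dim=0).item())
    return sims  # one value per layer (embeddings + each block)

print(layerwise_similarity("Where was Marie Curie born?", "Di mana Marie Curie lahir?"))
```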
@wing_nus
wing.nus
22 days
We discovered that a query in a language distinct from the pivot language can elicit an answer referring to a different entity. This finding is substantially more pronounced when the writing script differs from that of the pivot language. 🧡[3/n]
1
0
0
@wing_nus
wing.nus
22 days
We evaluated on code-switched sentences, expecting that in this setting the model aligns its knowledge in a more language-agnostic fashion. We limited the scope to English as the pivot language. 🧡[2/n]
1
0
0
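A minimal sketch of the evaluation idea in the thread above, with a placeholder model, prompts, and a crude string-match entity check; the paper's actual benchmark and matcher are not reproduced here.

```python
# Sketch of the consistency evaluation: ask the same factual question with
# English as the pivot and in another (or code-switched) language, then check
# whether the two answers name the same entity. Model, prompts, and the crude
# string-match entity check are placeholders.
from transformers import pipeline

generate = pipeline("text-generation", model="meta-llama/Llama-3.1-8B-Instruct")

def answer(question: str) -> str:
    return generate(question, max_new_tokens=20, do_sample=False)[0]["generated_text"]

def consistent(question_en: str, question_xx: str, entity_aliases: set[str]) -> bool:
    """Do the pivot-language and other-language answers agree on the gold entity?"""
    a_en = answer(question_en).lower()
    a_xx = answer(question_xx).lower()
    hit_en = any(alias in a_en for alias in entity_aliases)
    hit_xx = any(alias in a_xx for alias in entity_aliases)
    return hit_en == hit_xx

# Aggregate consistency = fraction of questions where the two answers agree.
```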
@wing_nus
wing.nus
22 days
🚨 New paper at #EMNLP25 Findings! If we ask a multilingual language model a factual question written in different languages, do the answers always refer to the same entity? Well... not quite. We dive deep into this issue in multilingual language models in our work. 🧡[1/n]
1
0
2
@wing_nus
wing.nus
22 days
Key Takeaway ❌ Stop asking "Which is better: Transformer or SSM?" ✅ Start asking "How do they propagate information, and how can that optimize new architectures?" Check out our paper: 🔗 Paper: https://t.co/UpjfOcBpRg 🧑‍💻 @NhatHoang2002, @dxlong2000, @CongDuyNguyen3, Luu Anh Tuan, @knmnyn
0
0
3
@wing_nus
wing.nus
22 days
πŸ€” Any theoretical proof for justification? ✍️ We formalize representation stability with mathematical bounds. πŸ”Ž SSMs are provably more stable in propagation under practical conditions! This explains their resilience at deeper depths and longer contexts.
1
0
2
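As a rough illustration of what a representation-stability bound can look like (a generic Lipschitz-style statement under our own assumptions, not the paper's actual theorem):

```latex
% Generic Lipschitz-style stability notion (illustrative only): a layer map
% f_\ell is C_\ell-stable if it cannot amplify input perturbations by more
% than C_\ell, and the factors compose across depth.
\[
\lVert f_\ell(h + \delta) - f_\ell(h) \rVert \le C_\ell \,\lVert \delta \rVert,
\qquad
\lVert h_L - \tilde h_L \rVert \le \Big(\prod_{\ell=1}^{L} C_\ell\Big) \lVert h_0 - \tilde h_0 \rVert .
\]
```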
@wing_nus
wing.nus
22 days
πŸ€” Does the final layer still contain the most task-relevant representations? πŸ”Ž We find that intermediate layers consistently outperform final layers across tasks, model scales, and context lengths, with Mamba showing the smallest drop to the final layer.
1
0
1
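A minimal sketch of a layer-wise probing setup like the one described above, assuming a placeholder model, task, and linear probe; the paper's evaluation protocol may differ.

```python
# Sketch of a layer-wise probe: fit a simple classifier on frozen hidden states
# from each layer and compare accuracy across depth. Model, task, and mean
# pooling are placeholder choices; swap in a Mamba/SSM checkpoint to compare.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoModel, AutoTokenizer

name = "gpt2"  # placeholder Transformer baseline
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True).eval()

def features(texts: list[str], layer: int) -> torch.Tensor:
    with torch.no_grad():
        pooled = [
            model(**tok(t, return_tensors="pt")).hidden_states[layer].mean(dim=1)
            for t in texts
        ]
    return torch.cat(pooled)  # (n_texts, hidden)

def probe_accuracy(train_x, train_y, test_x, test_y, layer: int) -> float:
    clf = LogisticRegression(max_iter=1000)
    clf.fit(features(train_x, layer).numpy(), train_y)
    return clf.score(features(test_x, layer).numpy(), test_y)

# Sweeping `layer` from 0 to the last layer is what reveals a mid-stack peak.
```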
@wing_nus
wing.nus
22 days
πŸ€” What about layer-level global manifolds? πŸ”Ž The overall patterns mirror the above token-level trends. Noticeably, a 32-layer Transformer-based model keep early/mid layers highly similar (e.g. 5th and 15th), suggesting minimal change compared to gradual evolution of SSMs.
1
0
2
@wing_nus
wing.nus
22 days
πŸ€” Do these behaviors arise from architectural biases or training dynamics? πŸ”Ž Oversmoothing in Transformers is architectural bias; while in SSMs it is training-dependent!
1
0
2
@wing_nus
wing.nus
22 days
πŸ€” Do tokens keep their distinctiveness? πŸ”Ž Oversmoothing reverses: Transformers homogenize early then recover late, SSMs preserve diversity early but homogenize deeper.
1
0
2
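A minimal sketch of one way to quantify token distinctiveness per layer (mean pairwise cosine similarity of token representations), under our own assumptions about model and pooling:

```python
# Sketch of a token-distinctiveness measure: within each layer, the mean pairwise
# cosine similarity between token representations. Values near 1 indicate
# oversmoothing (tokens collapsing onto nearly identical vectors). Model choice
# is a placeholder.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "gpt2"  # placeholder; repeat with an SSM checkpoint to contrast
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True).eval()

def token_similarity_per_layer(text: str) -> list[float]:
    with torch.no_grad():
        hidden = model(**tok(text, return_tensors="pt")).hidden_states
    sims = []
    for h in hidden:                              # h: (1, seq_len, hidden)
        x = F.normalize(h.squeeze(0), dim=-1)
        sim = x @ x.T                             # pairwise cosine similarities
        n = sim.size(0)
        sims.append(((sim.sum() - n) / (n * (n - 1))).item())  # mean off-diagonal
    return sims
```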
@wing_nus
wing.nus
22 days
πŸ€” How smoothly representations evolve? πŸ”Ž Opposite trajectories: Transformers are stable early then shift late, while SSMs vary early then converge deep.
1
0
2
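A minimal sketch of one way to measure how smoothly representations evolve (cosine similarity between the same token's representations at consecutive layers), again with placeholder choices:

```python
# Sketch of a layer-to-layer drift measure: cosine similarity between the same
# token's representation at consecutive layers. High values = smooth evolution,
# low values = an abrupt shift at that depth. Model choice is a placeholder.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "gpt2"  # placeholder; repeat with an SSM checkpoint to contrast trajectories
tok = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name, output_hidden_states=True).eval()

def layer_drift(text: str) -> list[float]:
    with torch.no_grad():
        hidden = model(**tok(text, return_tensors="pt")).hidden_states
    drifts = []
    for prev, curr in zip(hidden[:-1], hidden[1:]):
        sim = F.cosine_similarity(prev.squeeze(0), curr.squeeze(0), dim=-1)  # per token
        drifts.append(sim.mean().item())
    return drifts  # one value per layer transition
```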
@wing_nus
wing.nus
22 days
πŸ€” Why do Transformers and Mamba (SSMs) fail differently on long context? πŸ”Ž How do they mix and reshape context across depth? πŸš€ No one had a unified, token + layer-level view β€” until now! πŸ”— Paper: https://t.co/UpjfOcBpRg 🧡 πŸ‘‡ More in thread #Transformers #Mamba #NLP
1
3
6