Kaiser Sun

@KaiserWhoLearns

1K Followers · 2K Following · 40 Media · 348 Statuses

Ph.D. student at @jhuclsp, human LM that hallucinates. Formerly @MetaAI, @uwnlp, and @AWS. they/them 🏳️‍🌈 #NLProc

My fantasea
Joined May 2021
@KaiserWhoLearns
Kaiser Sun
28 days
What happens when an LLM is asked to use information that contradicts its knowledge? We explore knowledge conflict in a new preprint 📑. TL;DR: Performance drops, and this could affect the overall performance of LLMs in model-based evaluation. 🧵⬇️ 1/8 #NLProc #LLM #AIResearch
@KaiserWhoLearns
Kaiser Sun
5 days
Tokenization is most likely the reason whenever I have a bug in my model 🫠.
@_albertgu
Albert Gu
6 days
I converted one of my favorite talks I've given over the past year into a blog post: "On the Tradeoffs of SSMs and Transformers" (or: tokens are bullshit). In a few days, we'll release what I believe is the next major advance for architectures.
@KaiserWhoLearns
Kaiser Sun
9 days
RT @BafnaNiyati: 📢When LLMs solve tasks with a mid-to-low resource input/target language, their output quality is poor. We know that. But c….
@KaiserWhoLearns
Kaiser Sun
14 days
RT @ChengleiSi: Are AI scientists already better than human researchers? We recruited 43 PhD students to spend 3 months executing research….
@KaiserWhoLearns
Kaiser Sun
19 days
RT @nouhadziri: 📢 Can LLMs really reason outside the box in math? Or are they just remixing familiar strategies? Remember DeepSeek R1, o1….
@KaiserWhoLearns
Kaiser Sun
20 days
RT @chrome1996: Have you noticed… 🔍 Aligned LLM generations feel less diverse? 🎯 Base models are decoding-sensitive? 🤔 Generations get more….
@KaiserWhoLearns
Kaiser Sun
28 days
RT @mdredze: Our new paper explores knowledge conflict in LLMs. It also issues a word of warning to those using LLMs as a Judge: the model….
@KaiserWhoLearns
Kaiser Sun
28 days
🛠️ Interested in how your LLM behaves in this circumstance? We released the code to generate the diagnostic data for your own LLM. @mdredze @loadingfan 8/8
@KaiserWhoLearns
Kaiser Sun
28 days
🔗 Takeaways for practitioners:
1. Check for knowledge conflict before prompting.
2. Add further explanation to guide the model in following the context.
3. Monitor hallucinations even when context is supplied.
7/8
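The first takeaway (checking for conflict before prompting) can be sketched as comparing the model's closed-book answer against the answer the context asserts. A minimal illustration, where `ask_model` is a hypothetical stand-in for a real LLM call, not the paper's actual pipeline:

```python
from typing import Optional

def ask_model(question: str, context: Optional[str] = None) -> str:
    """Toy stand-in for an LLM call (hypothetical; swap in your real API).

    Closed-book: answer from a tiny hard-coded "parametric memory".
    Open-book: naively trust whatever answer the context asserts.
    """
    memory = {"Who wrote Hamlet?": "Shakespeare"}
    if context is not None:
        return context.split("answer:")[-1].strip()
    return memory.get(question, "unknown")

def has_knowledge_conflict(question: str, context: str) -> bool:
    """Flag a conflict when the closed-book answer disagrees with the context."""
    parametric = ask_model(question)                   # what the model "remembers"
    contextual = ask_model(question, context=context)  # what the context asserts
    return parametric != contextual

print(has_knowledge_conflict("Who wrote Hamlet?", "Document says the answer: Marlowe"))      # True
print(has_knowledge_conflict("Who wrote Hamlet?", "Document says the answer: Shakespeare"))  # False
```

With a real model you would extract both answers via generation and compare them with a softer match (string normalization or an entailment check) rather than exact equality.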
@KaiserWhoLearns
Kaiser Sun
28 days
📏 Implications:
⚡ When using an LLM as a judge, its parametric knowledge could lead to incorrect judgment :(
⚡ Retrieval systems need mechanisms to detect and resolve contradictions, not just shove text into the prompt.
6/8
@KaiserWhoLearns
Kaiser Sun
28 days
🧠 Key finding #3: “Just give them more explanation?” Providing rationales helps (it pushes models to lean more on the context), but it still can’t fully silence the stubborn parametric knowledge. 5/8
@KaiserWhoLearns
Kaiser Sun
28 days
⚖️ Key finding #2: Unsurprisingly, LLMs prefer their own memories. Even when we explicitly instruct them to rely on the provided document, traces of the “wrong” internal belief keep leaking into answers. 4/8
@KaiserWhoLearns
Kaiser Sun
28 days
⚠️ Key finding #1: If the task doesn’t require external knowledge (e.g., pure copying), conflict barely matters. However, as soon as knowledge is needed, accuracy tanks when context and memory disagree. 3/8
@KaiserWhoLearns
Kaiser Sun
28 days
🛠️ We create diagnostic data that…
- agrees or contradicts the model’s knowledge,
- contradicts at different levels of plausibility, and
- requires different levels of knowledge across tasks.
2/8
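The first axis of this construction can be sketched as: for each fact the model is believed to know, emit one context that agrees with it and one that contradicts it. A minimal illustration under assumed names (`make_diagnostics`, the fact table, and the template are all illustrative, not the released code):

```python
def make_diagnostics(facts):
    """Build one agreeing and one contradicting context per known fact.

    `facts` maps a topic to (answer the model believes, plausible distractor).
    """
    rows = []
    for topic, (truth, distractor) in facts.items():
        for label, answer in (("agree", truth), ("contradict", distractor)):
            rows.append({
                "topic": topic,
                "label": label,
                "context": f"According to the document, the {topic} is {answer}.",
                "gold": answer,  # what a context-faithful model should answer
            })
    return rows

# Illustrative fact table with a plausible (not absurd) distractor.
facts = {"capital of France": ("Paris", "Lyon")}

for row in make_diagnostics(facts):
    print(row["label"], "->", row["context"])
```

Varying how plausible the distractor is (Lyon vs. a random string) and how much knowledge the task needs (copying vs. answering) gives the other two axes above.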
@KaiserWhoLearns
Kaiser Sun
28 days
👉 📑
@KaiserWhoLearns
Kaiser Sun
1 month
RT @BafnaNiyati: We know speech LID systems flunk on accented speech. But why? And what to do about it? 🤔 Our work (I….
@KaiserWhoLearns
Kaiser Sun
1 month
RT @tpimentelms: A string may get 17 times less probability if tokenised as two symbols (e.g., ⟨he, llo⟩) than as one (e.g., ⟨hello⟩)—by an….
@KaiserWhoLearns
Kaiser Sun
1 month
RT @alex_gill_nlp: What Has Been Lost With Synthetic Evaluation? I'm happy to announce that the preprint release of my first project is on….
@KaiserWhoLearns
Kaiser Sun
1 month
RT @fangcong_y10593: Solving complex problems with CoT requires combining different skills. We can do this by: 🧩 Modify the CoT data format….
@KaiserWhoLearns
Kaiser Sun
1 month
RT @krisgligoric: I'm excited to announce that I’ll be joining the Computer Science department at @JohnsHopkins as an Assistant Professor t….