Michael Kirchhof

@mkirchhof_

Followers: 2K · Following: 1K · Media: 54 · Statuses: 353

Research Scientist at @Apple working on uncertainty quantification.

Paris
Joined June 2022
@mkirchhof_
Michael Kirchhof
5 months
Can LLMs access and describe their own internal distributions? With my colleagues at Apple, I invite you to take a leap forward and make LLM uncertainty quantification what it can be. 📄 https://t.co/uhoCJfPdZK 💻 https://t.co/pQY1DfaKtS 🧵1/9
1 · 21 · 89
@mkirchhof_
Michael Kirchhof
5 days
Our research team is hiring PhD interns 🍏 Spend your next summer in Paris and explore the next frontiers of LLMs for uncertainty quantification, calibration, RL and post-training, and Bayesian experimental design. Details & Application ➡️
jobs.apple.com
Apply for an Internship - Machine Learning Research on Uncertainty job at Apple. Read about the role and find out if it’s right for you.
4 · 55 · 328
@mkirchhof_
Michael Kirchhof
7 days
We rethink how and why LLMs are calibrated: Not just on token-level, but on answer-level 👇
@PreetumNakkiran
Preetum Nakkiran
9 days
LLMs are notorious for "hallucinating": producing confident-sounding answers that are entirely wrong. But with the right definitions, we can extract a semantic notion of "confidence" from LLMs, and this confidence turns out to be calibrated out-of-the-box in many settings (!)
0 · 1 · 19
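One common recipe for operationalizing such an answer-level (semantic) notion of confidence, not necessarily the exact construction in the paper linked above, is to sample several answers, group the ones that mean the same thing, and read confidence off the cluster frequencies; calibration then checks that this confidence tracks accuracy. A toy Python sketch, where `sample_answers` and `same_meaning` are hypothetical stand-ins for an LLM sampling call and a semantic-equivalence judge:

```python
# Toy sketch of answer-level (semantic) confidence and a calibration check.
# `sample_answers` and `same_meaning` are mocked stand-ins for an LLM call
# and a semantic-equivalence judge (e.g., an NLI model or LLM judge).
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)

def sample_answers(question, n=20):
    # Mock LLM: answers "Paris" most of the time, two distractors otherwise.
    return [str(a) for a in rng.choice(["Paris", "Lyon", "paris."], p=[0.6, 0.3, 0.1], size=n)]

def same_meaning(a, b):
    # Mock equivalence check: case- and punctuation-insensitive string match.
    return a.strip(".").lower() == b.strip(".").lower()

def semantic_confidence(question):
    """Cluster sampled answers by meaning; return (top answer, its frequency)."""
    answers = sample_answers(question)
    clusters = []                               # lists of mutually equivalent answers
    for a in answers:
        for c in clusters:
            if same_meaning(a, c[0]):
                c.append(a)
                break
        else:
            clusters.append([a])
    top = max(clusters, key=len)
    return top[0], len(top) / len(answers)

answer, conf = semantic_confidence("What is the capital of France?")
print(answer, conf)                             # e.g. Paris 0.7

# Calibration check over many questions: bin predictions by confidence and
# compare mean confidence to empirical accuracy in each bin (ECE-style).
confs = np.array([0.9, 0.8, 0.55, 0.6, 0.95])   # placeholder per-question confidences
correct = np.array([1, 1, 0, 1, 1])             # placeholder correctness labels
bins = np.clip((confs * 10).astype(int), 0, 9)
for b in np.unique(bins):
    m = bins == b
    print(f"bin {b/10:.1f}-{(b+1)/10:.1f}: conf={confs[m].mean():.2f}, acc={correct[m].mean():.2f}")
```

"Calibrated out-of-the-box" then means that, without any extra tuning, the per-bin confidence and accuracy roughly agree.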
@sineadwilliamso
Sinead Williamson
12 days
📢 We’re looking for a researcher in cogsci, neuroscience, linguistics, or related disciplines to work with us at Apple Machine Learning Research! We're hiring for a one-year interdisciplinary AIML Resident to work on understanding reasoning and decision making in LLMs. 🧵
9 · 57 · 308
@vishaal_urao
Vishaal Udandarao
20 days
🚀New Paper https://t.co/KB2hZljDHu We conduct a systematic data-centric study for speech-language pretraining, to improve end-to-end spoken-QA! 🎙️🤖 Using our data-centric insights, we pretrain a 3.8B SpeechLM (called SpeLangy) outperforming 3x larger models! 🧵👇
3 · 40 · 126
@mkirchhof_
Michael Kirchhof
20 days
ICLR reviewers: Be the change you want to see in the world. Don't reproduce violence :)
8 · 12 · 206
@alexttoshev
Alexander Toshev
26 days
If you are excited about Multimodal and Agentic Reasoning with Foundation Models, Apple ML Research has openings for Researchers, Engineers, and Interns in this area. Consider applying through the links below or feel free to send a message for more information. - Machine
jobs.apple.com
Apply for an AIML - Machine Learning Researcher, MLR job at Apple. Read about the role and find out if it’s right for you.
12 · 54 · 460
@itsbautistam
Miguel Angel Bautista
27 days
🚀 Come work with me in the Machine Learning Research team at Apple! I’m looking for FT research scientists with a strong track record of impactful publications on generative modeling (NeurIPS, ICML, ICLR, CVPR, ICCV, etc.) to join my team and work on fundamental generative modeling
jobs.apple.com
Apply for an AIML - Machine Learning Researcher, MLR job at Apple. Read about the role and find out if it’s right for you.
7 · 41 · 345
@junting9
Junting Pan @ICCV 2025
1 month
The Foundation Model Team @🍎Apple AI/ML is looking for a Research Intern (flexible start date) to work on Multimodal LLMs and Vision-Language. Interested? DM me to learn more!
28 · 19 · 464
@HPouransari
Hadi Pouransari
1 month
📣We have PhD research internship positions available at Apple MLR. DM me your brief research background, resume, and availability (earliest start date and latest end date) if interested in the topics below.
@HPouransari
Hadi Pouransari
1 month
Introducing Pretraining with Hierarchical Memories: Separating Knowledge & Reasoning for On-Device LLM Deployment 💡We propose dividing LLM parameters into 1) anchor (always used, capturing commonsense) and 2) memory bank (selected per query, capturing world knowledge). [1/X]🧵
8 · 48 · 460
@YizheZhangNLP
Yizhe Zhang
1 month
We use latent diffusion to model CoT as continuous semantics without committing to actual tokens. The thinking process is implicit and can be iteratively refined thanks to diffusion, which allows for test-time scaling.
@haoqik322
Murray Kang
1 month
🧵1/ Latent diffusion shines in image generation for its abstraction, iterative-refinement, and parallel exploration. Yet, applying it to text reasoning is hard — language is discrete. 💡 Our work LaDiR (Latent Diffusion Reasoner) makes it possible — using VAE + block-wise
0 · 8 · 27
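A rough wiring sketch of the recipe described above (module names, shapes, and the refinement schedule are illustrative assumptions, not LaDiR's actual architecture): a VAE maps a chain of thought into a continuous latent, a denoiser iteratively refines a latent "thought" conditioned on the question, and the refined latent is decoded back toward text.

```python
# Illustrative sketch of latent-diffusion reasoning: iterative refinement of a
# continuous "thought" latent, then decoding. Shapes, modules, and the noise
# schedule are assumptions for illustration only.
import torch
import torch.nn as nn

D_LAT, D_COND, STEPS = 64, 64, 8

class ThoughtVAE(nn.Module):
    # Stand-in for a text VAE mapping token embeddings <-> a continuous latent.
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(D_COND, 2 * D_LAT)        # outputs mean and log-variance
        self.dec = nn.Linear(D_LAT, D_COND)
    def encode(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()
    def decode(self, z):
        return self.dec(z)                              # would feed a token decoder in practice

class LatentDenoiser(nn.Module):
    # Predicts the clean latent from a noisy latent, question embedding, and timestep.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(D_LAT + D_COND + 1, 256), nn.SiLU(),
                                 nn.Linear(256, D_LAT))
    def forward(self, z_t, cond, t):
        t_emb = t.expand(z_t.shape[0], 1)
        return self.net(torch.cat([z_t, cond, t_emb], dim=-1))

vae, denoiser = ThoughtVAE(), LatentDenoiser()
question_emb = torch.randn(4, D_COND)                   # placeholder question encoding

# Training would encode reference reasoning traces as denoising targets:
z_target = vae.encode(torch.randn(4, D_COND))           # placeholder CoT embedding

# Inference: start from noise and iteratively refine the latent thought.
z = torch.randn(4, D_LAT)
for step in reversed(range(STEPS)):
    t = torch.tensor([[step / STEPS]])
    z_pred = denoiser(z, question_emb, t)               # current guess of the clean latent
    alpha = step / STEPS
    z = alpha * torch.randn_like(z) + (1 - alpha) * z_pred   # re-noise less at every step

thought = vae.decode(z)                                  # continuous reasoning state to decode
print(thought.shape)                                     # torch.Size([4, 64])
```

Because the reasoning lives in a continuous latent rather than committed tokens, adding refinement steps at inference is one natural knob for test-time scaling.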
@mkirchhof_
Michael Kirchhof
1 month
LLMs are currently this one big parameter block that stores all sorts of facts. In our new preprint, we add context-specific memory parameters to the model, and pretrain the model along with a big bank of memories. 📑 https://t.co/xTNn2rNTK5 Thread 👇
arxiv.org
The impressive performance gains of modern language models currently rely on scaling parameters: larger models store more world knowledge and reason better. Yet compressing all world knowledge...
@HPouransari
Hadi Pouransari
1 month
Introducing Pretraining with Hierarchical Memories: Separating Knowledge & Reasoning for On-Device LLM Deployment 💡We propose dividing LLM parameters into 1) anchor (always used, capturing commonsense) and 2) memory bank (selected per query, capturing world knowledge). [1/X]🧵
0 · 21 · 174
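A minimal sketch of the anchor-plus-memory-bank idea (the sizes, the additive memory parameterization, and the top-k retrieval rule are illustrative assumptions, not the design in the preprint): an always-used anchor network carries general capabilities, while only a few entries from a large bank of memory parameters are fetched per query, which is what makes the scheme attractive on-device.

```python
# Illustrative sketch: an always-on "anchor" plus a large bank of per-query
# memory parameters. Sizes and the retrieval rule are assumptions.
import torch
import torch.nn as nn

D, N_MEM, K = 64, 1024, 4                          # hidden size, bank size, memories per query

class AnchorWithMemories(nn.Module):
    def __init__(self):
        super().__init__()
        # Anchor: always used, intended to capture commonsense/reasoning.
        self.anchor = nn.Sequential(nn.Linear(D, D), nn.SiLU(), nn.Linear(D, D))
        # Memory bank: each row is a small parameter block keyed for retrieval.
        self.mem_keys = nn.Parameter(torch.randn(N_MEM, D) / D ** 0.5)
        self.mem_vals = nn.Parameter(torch.randn(N_MEM, D) / D ** 0.5)

    def forward(self, query_emb):
        # Select K memories per query; only these need to be resident on device.
        scores = query_emb @ self.mem_keys.T               # (batch, N_MEM)
        topk = scores.topk(K, dim=-1)
        weights = topk.values.softmax(dim=-1)              # (batch, K)
        fetched = self.mem_vals[topk.indices]              # (batch, K, D)
        memory = (weights.unsqueeze(-1) * fetched).sum(1)  # (batch, D)
        # Anchor handles general computation; memories inject query-specific knowledge.
        return self.anchor(query_emb + memory)

model = AnchorWithMemories()
out = model(torch.randn(8, D))                             # 8 placeholder query embeddings
print(out.shape)                                           # torch.Size([8, 64])
```

Pretraining the anchor jointly with the whole bank, as the thread describes, is what lets the memories specialize to world knowledge while the anchor stays compact.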
@UnderGroundJeg
Huangjie Zheng
1 month
We’re excited to share our new paper: Continuously-Augmented Discrete Diffusion (CADD) — a simple yet effective way to bridge discrete and continuous diffusion models on discrete data, such as language modeling. [1/n] Paper: https://t.co/fQ8qxx4Pge
6 · 36 · 238
@alirezamh_
Alireza Mousavi
2 months
Super excited to share what @stephenz_y and I’ve been up to during our internship at 🍎: Using optimal transport makes flows straighter and generation faster in flow matching, but small batch OT is biased and large batch OT is slow. What to do? Use semidiscrete OT! 🧵
3 · 39 · 268
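For intuition, here is a small numpy sketch of semidiscrete OT used to pair noise samples with data points for flow matching. Squared Euclidean cost, uniform target weights, and plain stochastic ascent on the dual potentials are simplifying assumptions; this is not the paper's implementation.

```python
# Sketch: semidiscrete OT couples a continuous source (noise) with a discrete
# target (the dataset) via dual potentials, then flow matching regresses
# velocities along the resulting straight paths. Assumptions: squared Euclidean
# cost, uniform data weights, plain stochastic gradient ascent.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, size=(512, 2))      # discrete target samples y_j
psi = np.zeros(len(data))                      # one dual potential per data point

def assign(x, psi):
    """Semidiscrete OT map: send each x to argmin_j ||x - y_j||^2 - psi_j."""
    cost = ((x[:, None, :] - data[None, :, :]) ** 2).sum(-1)   # (batch, n)
    return np.argmin(cost - psi[None, :], axis=1)

# Fit the potentials by stochastic gradient ascent on the semidiscrete dual:
#   F(psi) = E_x[ min_j (c(x, y_j) - psi_j) ] + mean_j psi_j
for step in range(2000):
    x = rng.normal(size=(256, 2))              # continuous source (noise)
    j = assign(x, psi)
    grad = np.full(len(data), 1.0 / len(data)) # + uniform marginal term
    np.subtract.at(grad, j, 1.0 / len(x))      # - empirical assignment frequency
    psi += 0.5 * grad                          # ascent step

# Flow-matching pairs: each noise sample is coupled to "its" data point, giving
# straighter interpolation paths than independent (x0, x1) pairs.
x0 = rng.normal(size=(256, 2))
x1 = data[assign(x0, psi)]
t = rng.uniform(size=(256, 1))
x_t = (1 - t) * x0 + t * x1                    # point on the straight path
v_target = x1 - x0                             # velocity regression target for the flow model
```

Unlike minibatch OT, the assignment here is defined against the whole dataset through the learned potentials, which is the property the thread points to for avoiding small-batch bias without solving a large OT problem per batch.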
@kschweig_
Kajetan Schweighofer
2 months
The story repeats itself. We find that Transformers perform better than xLSTM on moderate context lengths of e.g. 8k (see picture). However, xLSTM handles longer contexts better and the benefit over Transformers grows with context length - both for training and inference.
0 · 2 · 6
@mkirchhof_
Michael Kirchhof
2 months
But it does not seem impossible. Releasing this benchmark (+ code) to let you take a shot at this new avenue for uncertainty communication. This is a missing building block for agentic reasoning in uncertain environments, user trust, and conformal calibration. Let’s solve it :)
0 · 1 · 1
@mkirchhof_
Michael Kirchhof
2 months
Second, we attempted hill-climbing along the benchmark. We already knew Reasoning and CoT can’t do it; now we’ve tried explicit SFT/DPO. Result: LLMs can get the format right, but what they output is not what they are actually uncertain about, information-theoretically.
1 · 0 · 1
@mkirchhof_
Michael Kirchhof
2 months
Since its initial release, we didn’t stop cooking: First, we continued validating whether the scores that the SelfReflect benchmark assigns are robust signals of quality. Across more LLMs and datasets, it works. I have more confidence in the benchmark than ever.
1 · 0 · 0
@mkirchhof_
Michael Kirchhof
2 months
Many treat uncertainty as just a number. At Apple, we're rethinking this: LLMs should output strings that reveal all information of their internal distributions. We find Reasoning, SFT, CoT can't do it, yet. To get there, we introduce the SelfReflect benchmark. https://t.co/uhoCJfPdZK
2 · 19 · 27
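The underlying question can be made concrete with a toy check (only an illustration of the idea, not the SelfReflect metric; both the sampled answers and the self-description string are mocked here): compare the distribution a model claims in its one-string self-summary against the distribution of answers it actually produces when sampled.

```python
# Toy check: does a single self-description string carry the same information
# as the model's empirical answer distribution? Everything below is mocked.
from collections import Counter
import re
import numpy as np

rng = np.random.default_rng(0)

# What the model actually does: its sampled answer distribution.
samples = [str(a) for a in rng.choice(["1912", "1911", "1913"], p=[0.7, 0.2, 0.1], size=200)]
empirical = {a: c / len(samples) for a, c in Counter(samples).items()}

# What the model says about itself: a hypothetical one-string summary of its uncertainty.
self_description = "Likely 1912 (70%), but possibly 1911 (20%) or 1913 (10%)."
claimed = {ans: int(pct) / 100
           for ans, pct in re.findall(r"(\d{4}) \((\d+)%\)", self_description)}

# Total variation distance between the claimed and empirical distributions:
# 0 would mean the string faithfully reveals the internal distribution.
answers = set(empirical) | set(claimed)
tv = 0.5 * sum(abs(empirical.get(a, 0.0) - claimed.get(a, 0.0)) for a in answers)
print(f"claimed={claimed}\nempirical={empirical}\nTV distance={tv:.3f}")
```

The real setting involves free-form strings rather than neatly parseable percentages, but the target is the same: the string should leave nothing about the internal distribution unsaid.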
@XianhangLi
Xianhang Li
2 months
🤔 Ever thought a small teacher could train a student 6× larger that sets new SOTA in training efficiency and frozen evaluation performance for video representation learning? 🤔 Do we really need complex EMA-based self-distillation to prevent collapse, bringing unstable loss
8 · 77 · 461
@AggieInCA
Vimal Thilak🦉🐒
2 months
🚨 Machine Learning Research Internship opportunity in Apple MLR! We are looking for a PhD research intern with a strong interest in world modeling, planning or learning video representations for planning and/or reasoning. If interested, apply by sending an email to me at
4 · 35 · 324