Sima Noorani ✈️ NeurIPS
@NooraniSimaa
122 Followers · 119 Following · 10 Media · 27 Statuses
PhD candidate @Penn
Philadelphia, PA
Joined March 2024
We’re presenting our work today at Poster Session 2 (4:30 PM, #3303). Come check it out and chat with us! :) This is joint work with @ShayanKiyani1 @HamedSHassani @pappasg69.
How can we quantify uncertainty in LLMs from only a few sampled outputs? The key lies in the classical problem of missing mass—the probability of unseen outputs. This perspective offers a principled foundation for conformal prediction in query-only settings like LLMs.
Paper: https://t.co/4z1zUyn0c2 Joint work with the amazing @ShayanKiyani1, @pappasg69, @HamedSHassani.
Empirically, the resulting collaborative sets improve over both human-only and AI-only sets in both marginal coverage and average set size.
We provide finite-sample algorithms in both offline (calibration-based) and online settings, with guarantees that hold under arbitrary distribution shifts (including the natural case of human adaptation to the AI over time).
We show that the optimal solution to this problem admits a simple two-threshold structure over a single nonconformity score: a pruning threshold that decides which labels in H(x) to remove, and an augmentation threshold that decides which new AI-suggested labels to add.
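A minimal sketch of how such a two-threshold rule could look in code, assuming a generic nonconformity score `score(x, y)` where lower means more plausible; the function and threshold names (`tau_prune`, `tau_aug`) are illustrative, not from the paper:

```python
# Hedged sketch of a two-threshold collaborative set map (names are illustrative).
def collaborative_set(x, human_set, label_space, score, tau_prune, tau_aug):
    # Pruning: keep only the human-proposed labels whose nonconformity score
    # falls below the pruning threshold.
    kept = {y for y in human_set if score(x, y) <= tau_prune}
    # Augmentation: add labels outside H(x) whose score falls below the
    # (typically stricter) augmentation threshold.
    added = {y for y in label_space if y not in human_set and score(x, y) <= tau_aug}
    return kept | added
```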
These two principles lead to an explicit optimization problem over collaborative sets C: a trade-off between avoiding counterfactual harm, promoting complementarity, and keeping prediction sets informative (not too large).
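One plausible way to read this trade-off as a constrained program (a sketch under assumed notation, not necessarily the paper's exact formulation): minimize the expected set size subject to a harm tolerance α and a complementarity target β, where the two conditional probabilities are the quantities formalized in the next tweet.

```latex
\min_{C}\ \mathbb{E}\,|C(X)|
\quad \text{s.t.} \quad
\mathbb{P}\big(Y \notin C(X) \mid Y \in H(X)\big) \le \alpha,
\qquad
\mathbb{P}\big(Y \in C(X) \mid Y \notin H(X)\big) \ge \beta.
```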
We formalize the two guiding principles as follows. Counterfactual harm: collaboration should not make the human worse, i.e., keep P(Y ∉ C(X) | Y ∈ H(X)) small. Complementarity: the AI should recover labels the human misses, i.e., make P(Y ∈ C(X) | Y ∉ H(X)) large.
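As a sketch, both quantities can be estimated on held-out calibration data; the helper below is hypothetical and simply counts the two conditional frequencies:

```python
def harm_and_complementarity(calib_pairs, human_set, collab_set):
    """Empirical estimates on calibration pairs (x, y):
    harm            ~ P(Y not in C(X) | Y in H(X))
    complementarity ~ P(Y in C(X)     | Y not in H(X))
    `human_set` and `collab_set` map x to the sets H(x) and C(x)."""
    in_h  = [(x, y) for x, y in calib_pairs if y in human_set(x)]
    out_h = [(x, y) for x, y in calib_pairs if y not in human_set(x)]
    harm = sum(y not in collab_set(x) for x, y in in_h) / max(len(in_h), 1)
    comp = sum(y in collab_set(x) for x, y in out_h) / max(len(out_h), 1)
    return harm, comp
```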
We formalize a simple collaborative prediction setting. Let (X, Y) ∼ P(X, Y), with features X and label Y. A human expert first proposes a set H(x) ⊆ 𝒴 of plausible outcomes, where 𝒴 is the label space. The AI system then refines this proposal by outputting a collaborative set C(x) ⊆ 𝒴.
When humans and AI collaborate, what should uncertainty quantification look like? Our new paper proposes two principles---no counterfactual harm and complementarity---and gives distribution-free guarantees without assumptions on the task, AI model, or human behavior.
How should you use forecasts f:X->R^d to make decisions? It depends on what properties they have. If they are fully calibrated (E[y | f(x) = p] = p), then you should be maximally aggressive and act as if they are correct --- i.e. play argmax_a E_{o ~ f(x)}[u(a,o)]. On the other hand
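A toy numeric sketch of the "act as if the forecast is correct" rule, with an illustrative utility matrix u[a, o] and a made-up forecast p over outcomes:

```python
import numpy as np

u = np.array([[1.0, -0.5],   # utility of action 0 under outcomes 0 and 1
              [0.2,  0.4]])  # utility of action 1 under outcomes 0 and 1
p = np.array([0.3, 0.7])     # forecast f(x): probabilities of the two outcomes

expected_utility = u @ p                    # E_{o ~ f(x)}[u(a, o)] for each action a
best_action = int(np.argmax(expected_utility))
print(best_action, expected_utility)        # -> 1 [-0.05  0.34]
```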
We push conformal prediction and its trade-offs beyond regression & classification — into query-based generative models. Surprisingly (or not?), missing mass & Good-Turing estimators emerge as key tools once again. Very excited about this one!
How can we quantify uncertainty in LLMs from only a few sampled outputs? The key lies in the classical problem of missing mass—the probability of unseen outputs. This perspective offers a principled foundation for conformal prediction in query-only settings like LLMs.
Paper: https://t.co/cv0lD18MNK GitHub: https://t.co/63bQfcBKPw Many, many thanks to my amazing collaborators @ShayanKiyani1 @pappasg69 @HamedSHassani.
github.com · Official repository for Conformal Prediction Beyond the Seen: A Missing Mass Perspective for Uncertainty Quantification in Generative Models (nooranisima/CPQ-missing-mass)
We show that meaningful conformal prediction in query-only settings—like LLMs—arises from deep connections to the classical missing mass problem. Our principled framework balances key trade-offs: coverage, query cost, and informativeness in building prediction sets.
Across all datasets, CPQ achieves: ✔️ tighter coverage ✔️ far lower EE usage
How does CPQ compare to state-of-the-art? We compare CPQ to CLM and SCOPE-Gen, two leading conformal methods for LLMs. But unlike CPQ, they: 📍 Don’t account for missing mass (i.e., unseen outputs) 📍 Don’t control the query budget explicitly
We evaluate CPQ on 3 LLM tasks with fixed query budgets and varying coverage. We show that: ✅ Adding each component improves performance ✅ Full CPQ has valid coverage with minimal EE inclusion ✅ CPQ adapts the prediction set size in a principled way to achieve compact, informative sets
Set map: After querying, CPQ builds optimal prediction sets relying on an estimate of the missing mass itself—using the classical Good-Turing estimator.
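For reference, a minimal sketch of the classical Good-Turing missing-mass estimate that the set map relies on (the estimator itself is standard; exactly how CPQ uses it is detailed in the paper):

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Good-Turing estimate of the missing mass:
    (# of distinct outputs seen exactly once) / (total # of samples)."""
    counts = Counter(samples)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(samples) if samples else 1.0

print(good_turing_missing_mass(["a", "b", "a", "c"]))  # 2 singletons / 4 samples = 0.5
```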
Query policy: Querying reduces uncertainty—but when is it enough? 🧐 The optimal strategy is to stop when the missing mass stops decreasing meaningfully! In finite samples, this relies on our novel estimator for the missing mass derivative!
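A hedged sketch of the stopping idea only: the paper uses a dedicated estimator of the missing-mass derivative, while here the change in the plain Good-Turing estimate stands in; `sample`, `tol`, and `budget` are illustrative:

```python
from collections import Counter

def query_until_flat(sample, tol=0.01, budget=50):
    """Illustrative stopping rule (not the paper's estimator): keep querying
    while the Good-Turing missing-mass estimate still drops by more than `tol`."""
    def missing_mass(outputs):
        counts = Counter(outputs)
        return sum(1 for c in counts.values() if c == 1) / len(outputs)

    outputs = [sample()]
    prev = missing_mass(outputs)
    while len(outputs) < budget:
        outputs.append(sample())
        curr = missing_mass(outputs)
        if prev - curr < tol:  # missing mass no longer decreasing meaningfully
            break
        prev = curr
    return outputs
```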
CPQ is built on two core components: 📍 A querying policy (how long to sample) 📍 A set map (how to turn samples into a valid, informative set) And the optimal solution for each module is rooted in the classical problem of missing mass in statistics.
There’s a key trade-off in the query-only setting: More queries improve coverage and reduce reliance on EE—but cost more compute. Fewer queries save resources but increase the need for EE. Our framework CPQ balances coverage, query cost, and informativeness under a fixed budget