Sima Noorani ✈️ NeurIPS Profile
Sima Noorani ✈️ NeurIPS

@NooraniSimaa

Followers: 122 · Following: 119 · Media: 10 · Statuses: 27

PhD candidate @Penn

Philadelphia, PA
Joined March 2024
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
7 days
We’re presenting our work today at Poster Session 2 (4:30 PM, #3303). Come check it out and chat with us! :) This is joint work with @ShayanKiyani1 @HamedSHassani @pappasg69.
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
How can we quantify uncertainty in LLMs from only a few sampled outputs? The key lies in the classical problem of missing mass—the probability of unseen outputs. This perspective offers a principled foundation for conformal prediction in query-only settings like LLMs.
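(For reference: the missing mass of a sample is the total probability of the outputs that have not yet been observed. A minimal formalization, in our notation rather than necessarily the paper's:)

```latex
% Missing mass of a sample Y_1, ..., Y_n drawn from a distribution p over outputs:
% the total probability of every output that never appeared among the samples.
M_n \;=\; \sum_{y \,:\, y \notin \{Y_1, \dots, Y_n\}} p(y)
```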
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
Paper: https://t.co/4z1zUyn0c2 Joint work with the amazing @ShayanKiyani1, @pappasg69, @HamedSHassani
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
Empirically, the resulting collaborative sets improve over both human-only and AI-only sets in both marginal coverage and average set size.
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
We provide finite-sample algorithms in both offline (calibration-based) and online settings, with guarantees that hold under arbitrary distribution shifts (including the natural case of human adaptation to the AI over time).
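The tweet does not spell out the online algorithm; purely as an illustration of how a nonconformity-score threshold can be tracked online so that long-run coverage holds under drift, here is a generic quantile-tracking sketch on synthetic drifting scores (this is not the paper's method):

```python
import random

def online_threshold_demo(alpha=0.1, eta=0.05, T=2000, seed=0):
    """Toy online update, not the paper's algorithm: nudge the score threshold q
    after every round so that long-run miscoverage stays near alpha, even though
    the score distribution drifts over time."""
    rng = random.Random(seed)
    q, misses = 1.0, 0
    for t in range(T):
        # Synthetic nonconformity score of the true label; its mean drifts upward.
        s_true = rng.gauss(0.5 + 0.5 * t / T, 0.2)
        err = 1.0 if s_true > q else 0.0   # 1 = true label fell outside the set
        misses += err
        q += eta * (err - alpha)           # miss -> raise threshold, cover -> lower it
    return q, misses / T

print(online_threshold_demo())  # final threshold and empirical miscoverage (roughly alpha)
```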
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
We show that the optimal solution to this problem admits a simple two-threshold structure over a single nonconformity score: a pruning threshold that decides which labels in H(x) to remove, and an augmentation threshold that decides which new AI-suggested labels to add.
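As a reading aid, a minimal sketch of what such a two-threshold rule could look like; the function name, comparison directions, and score table are our own illustrative assumptions, not the paper's exact construction:

```python
def collaborative_set(x, human_set, candidate_labels, score, tau_prune, tau_aug):
    """Illustrative two-threshold rule: score(x, y) is a nonconformity score
    (lower = more plausible). Prune human labels above tau_prune; add outside
    labels below the (typically stricter) tau_aug."""
    kept = {y for y in human_set if score(x, y) <= tau_prune}          # pruning step
    added = {y for y in candidate_labels
             if y not in human_set and score(x, y) <= tau_aug}         # augmentation step
    return kept | added

# Tiny usage example with a made-up score table.
scores = {("x0", "a"): 0.2, ("x0", "b"): 0.9, ("x0", "c"): 0.1, ("x0", "d"): 0.7}
C = collaborative_set("x0", {"a", "b"}, {"a", "b", "c", "d"},
                      lambda x, y: scores[(x, y)], tau_prune=0.5, tau_aug=0.3)
print(C)  # {'a', 'c'}: 'b' pruned (0.9 > 0.5), 'c' added (0.1 <= 0.3)
```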
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
These two principles lead to an explicit optimization problem over collaborative sets C: a trade-off between avoiding counterfactual harm, promoting complementarity, and keeping prediction sets informative (not too large).
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
We formalize the two guiding principles as follows. Counterfactual harm: collaboration should not make the human worse off, i.e., keep P(Y ∉ C(X) ∣ Y ∈ H(X)) small. Complementarity: the AI should recover labels the human misses, i.e., make P(Y ∈ C(X) ∣ Y ∉ H(X)) large.
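In display form, together with one natural way to read them as constraints (the constraint/objective form below is our illustration of the stated trade-off, not necessarily the paper's exact program):

```latex
% Counterfactual harm: collaboration should rarely drop a label the human had right.
\mathbb{P}\bigl(Y \notin C(X) \,\bigm|\, Y \in H(X)\bigr) \;\le\; \alpha_h
% Complementarity: the AI should usually recover labels the human missed.
\mathbb{P}\bigl(Y \in C(X) \,\bigm|\, Y \notin H(X)\bigr) \;\ge\; 1 - \alpha_c
% Illustrative combined program: keep sets small subject to both requirements.
\min_{C} \; \mathbb{E}\bigl[\,|C(X)|\,\bigr] \quad \text{s.t. the two constraints above hold.}
```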
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
We formalize a simple collaborative prediction setting. Let (X, Y) ∼ P(X, Y), with features X and label Y. A human expert first proposes a set H(x) ⊆ Y of plausible outcomes. The AI system then refines this proposal by outputting a collaborative set C(x) ⊆ Y.
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
15 days
When humans and AI collaborate, what should uncertainty quantification look like? Our new paper proposes two principles---no counterfactual harm and complementarity---and gives distribution-free guarantees without assumptions on the task, AI model, or human behavior.
@Aaroth
Aaron Roth
1 month
How should you use forecasts f:X->R^d to make decisions? It depends what properties they have. If they are fully calibrated (E[y | f(x) = p] = p), then you should be maximally aggressive and act as if they are correct --- i.e. play argmax_a E_{o ~ f(x)}[u(a,o)]. On the other hand
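A minimal sketch of the "act as if the forecast is correct" rule in this tweet, with a made-up two-action, two-outcome utility matrix:

```python
import numpy as np

def best_response(forecast, utility):
    """Pick argmax_a E_{o ~ forecast}[u(a, o)]: forecast is a probability vector
    over d outcomes, utility is an (actions x outcomes) payoff matrix."""
    expected = utility @ forecast          # expected utility of each action
    return int(np.argmax(expected))

# Example: actions = (carry umbrella, don't), outcomes = (rain, sun).
u = np.array([[1.0, 0.3],    # umbrella: fine in rain, slightly annoying in sun
              [0.0, 1.0]])   # no umbrella: bad in rain, great in sun
print(best_response(np.array([0.6, 0.4]), u))  # 0: carrying the umbrella is optimal
```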
@ShayanKiyani1
Shayan Kiyani ✈️ NeurIPS
6 months
We push conformal prediction and its trade-offs beyond regression & classification — into query-based generative models. Surprisingly (or not?), missing mass & Good-Turing estimators emerge as key tools once again. Very excited about this one!
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
How can we quantify uncertainty in LLMs from only a few sampled outputs? The key lies in the classical problem of missing mass—the probability of unseen outputs. This perspective offers a principled foundation for conformal prediction in query-only settings like LLMs.
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
We show meaningful conformal prediction in query-only settings—like LLMs—arises from deep connections to the classical missing mass problem. Our principled framework balances key trade-offs: coverage, query cost, and informativeness in building prediction sets.
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
Across all datasets, CPQ achieves: ✔️ Tighter coverage ✔️ Far lower EE usage
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
How does CPQ compare to the state of the art? We compare CPQ to CLM and SCOPE-Gen, two leading conformal methods for LLMs. But unlike CPQ, they: 📍 Don't account for missing mass (i.e., unseen outputs) 📍 Don't control the query budget explicitly
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
We evaluate CPQ on 3 LLM tasks with fixed query budgets and varying coverage levels. We show that: ✅ Adding each component improves performance ✅ Full CPQ achieves valid coverage with minimal EE inclusion ✅ CPQ adapts prediction set size in a principled way, yielding compact, informative sets
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
Set map: After querying, CPQ builds optimal prediction sets relying on an estimate of the missing mass itself—using the classical Good-Turing estimator.
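The Good-Turing missing-mass estimate itself is one line: the fraction of samples whose value appeared exactly once. A minimal sketch (how CPQ folds this into its set map is in the paper):

```python
from collections import Counter

def good_turing_missing_mass(samples):
    """Classical Good-Turing estimate of the missing mass: N1 / n, where N1 is
    the number of distinct values observed exactly once among the n samples."""
    counts = Counter(samples)
    n1 = sum(1 for c in counts.values() if c == 1)
    return n1 / len(samples)

# Six sampled outputs, two of which ('b' and 'd') appeared exactly once:
print(good_turing_missing_mass(["a", "a", "b", "c", "c", "d"]))  # 2/6 ≈ 0.33
```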
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
Query policy: Querying reduces uncertainty—but when is it enough? 🧐 The optimal strategy is to stop when the missing mass stops decreasing meaningfully! In finite samples, this relies on our novel estimator for the missing mass derivative!
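The derivative estimator is the paper's contribution and isn't reproduced here; purely to illustrate the "stop once the missing mass stops dropping" idea, here is a toy heuristic that just watches the raw Good-Turing estimate between batches (hypothetical sample_fn, eps, and batch size; not the paper's rule):

```python
import random
from collections import Counter

def query_until_plateau(sample_fn, eps=0.02, batch=5, max_queries=200):
    """Toy stopping heuristic, NOT the paper's rule: keep querying while the
    Good-Turing missing-mass estimate still drops by more than eps per batch."""
    samples, prev = [], float("inf")
    while len(samples) < max_queries:
        samples.extend(sample_fn() for _ in range(batch))
        counts = Counter(samples)
        mm = sum(1 for c in counts.values() if c == 1) / len(samples)  # Good-Turing
        if prev - mm < eps:          # missing mass no longer decreasing meaningfully
            break
        prev = mm
    return samples

# Example with a hypothetical 5-symbol "LLM": stops once most outputs repeat.
print(len(query_until_plateau(lambda: random.choice("abcde"))))
```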
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
CPQ is built on two core components: 📍 A querying policy (how long to sample) 📍 A set map (how to turn samples into a valid, informative set) And the optimal solution for each module is rooted in the classical problem of missing mass in statistics.
@NooraniSimaa
Sima Noorani ✈️ NeurIPS
6 months
There’s a key trade-off in the query-only setting: more queries improve coverage and reduce reliance on EE, but cost more compute; fewer queries save resources but increase the need for EE. Our framework, CPQ, balances coverage, query cost, and informativeness under a fixed budget.