
Brando Miranda @ ICML 2025
@BrandoHablando
1K Followers · 2K Following · 87 Media · 1K Statuses
CS Ph.D. @Stanford, researching data quality, foundation models, and ML for Theorem Proving. Prev: @MIT, @MIT_CBMM, @IllinoisCS, @IBM. Opinions are mine. 🇲🇽
Stanford, CA
Joined March 2013
RT @allenainie: Come to EXAIT today for our best paper talk at 9:05 and poster session 11:45-2:15! The workshop also has a great set of tal…
@arimorcos this is the workshop paper where we show your takeaway from ICML that the target distribution is "all you need" (sorry, can't remember how you actually said it! Didn't mean to be cheesy). #ICML2025
Come to my second poster session on Data-centric Machine Learning (DMLR)! In room 208-209! #ICML2025
Come to the 208-209 ICML data workshop and chat with me about how to use data optimally! Scale isn't everything! Ask me how to use it beyond post-training ;)
- Scale isn't enough: LLM performance rises with training-task alignment more than with data volume.
- Robust Alignment
RT @_akhaliq: Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data…
@_akhaliq @_alycialee Joint work with @ObbadElyas, Mario, Krrish, Aryan, @sanmikoyejo, me, and Sudarsan at @stai_research! Thank you! 🧵 3/3
We first demonstrated scale isn't enough in our Beyond Scale paper, using the diversity coefficient! Thanks for featuring us @_akhaliq! Work led by @_alycialee et al.! 🧵 2/3
Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data. paper page: Current trends to pre-train capable Large Language Models (LLMs) mostly focus on scaling of model and dataset size.
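For context, the diversity coefficient in this paper is roughly the expected distance between Task2Vec embeddings of random training batches: the more spread out the batch embeddings, the more formally diverse the data. Below is a minimal sketch of just the aggregation step, assuming you already have one embedding vector per batch; the function name and the random stand-in embeddings are illustrative assumptions, not the paper's released code.

```python
import numpy as np

def diversity_coefficient(batch_embeddings: np.ndarray) -> float:
    """Mean pairwise cosine distance between batch embeddings.

    In the paper each row would be a Task2Vec embedding of a random
    batch; here rows are arbitrary vectors, so this sketch covers only
    the final aggregation, not the embedding step itself.
    """
    # Normalize rows so dot products become cosine similarities.
    X = batch_embeddings / np.linalg.norm(batch_embeddings, axis=1, keepdims=True)
    cos_sim = X @ X.T                    # pairwise cosine similarities
    iu = np.triu_indices(len(X), k=1)    # unordered pairs, no self-pairs
    return float(np.mean(1.0 - cos_sim[iu]))

# Stand-in embeddings (random, illustration only): 8 "batches", 16 dims.
rng = np.random.default_rng(0)
print(diversity_coefficient(rng.normal(size=(8, 16))))
```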
Come to Convention Center West, room 208-209, 2nd floor, to learn about optimal data selection using compression like gzip! tl;dr: you can learn much faster if you use gzip compression distances to select data for a given task! DM me if you are interested or want to use the code!
🚨 What's the best way to select data for fine-tuning LLMs effectively? 📢 Introducing ZIP-FIT, a compression-based data selection framework that outperforms leading baselines, converging up to 85% faster in cross-entropy loss and selecting data up to 65% faster. 🧵 1/8
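To make "gzip compression distances" concrete, here is a minimal sketch of compression-based selection, using Normalized Compression Distance (NCD) as the gzip distance. This is not the released ZIP-FIT code; `gzip_size`, `ncd`, `select_top_k`, and the toy task/pool below are all illustrative assumptions.

```python
import gzip

def gzip_size(text: str) -> int:
    """Size in bytes of the gzip-compressed UTF-8 text."""
    return len(gzip.compress(text.encode("utf-8")))

def ncd(x: str, y: str) -> float:
    """Normalized Compression Distance: lower means more shared structure."""
    cx, cy = gzip_size(x), gzip_size(y)
    return (gzip_size(x + y) - min(cx, cy)) / max(cx, cy)

def select_top_k(pool: list[str], target: str, k: int) -> list[str]:
    """Keep the k candidates whose NCD to the target-task text is smallest."""
    return sorted(pool, key=lambda c: ncd(c, target))[:k]

# Illustrative usage: pick the 2 examples best aligned with a proving task.
task = "Prove that the sum of two even numbers is even."
pool = [
    "Theorem: for all n m, even n -> even m -> even (n + m).",
    "Recipe for chocolate chip cookies with brown butter.",
    "Lemma: the even numbers are closed under addition.",
]
print(select_top_k(pool, task, k=2))
```

The intuition: if a candidate compresses well when concatenated with target-task text, gzip has found shared structure, so the candidate is likely aligned with the task.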
RT @allenainie: If you missed @wanqiao_xu's presentation, here are some of our slides! (The workshop will post full slides later on their w…
RT @heeney_luke: Academia must be the only industry where extremely high-skilled PhD students spend much of their time doing low-value work…
RT @ai4mathworkshop: It's happening today! 📍 Location: West Ballroom C, Vancouver Convention Center ⌚️ Time: 8:30 am - 6:00 pm 🎥 Livestream: …
@HenryJamesBosch Picture with Leni! Coauthor of Veribench and long-time collaborator! Main mind behind Pantograph! #TACAS2025
🚶♀️ Come talk to me live! Thanks to @HenryJamesBosch for helping set up and for the advertisement! :)