Weijie Su
@weijie444
Followers
6K
Following
4K
Media
55
Statuses
865
Associate Professor @Wharton & CS Penn. coDir @Penn Research #MachineLearning. PhD @Stanford. #MachineLearning #DeepLearning #Statistics #Privacy #Optimization.
Philadelphia, PA
Joined September 2011
Back from NeurIPS with one fun observation: Two different communities--optimization and deep learning theory--were both talking about Muon (aka spectral GD). https://t.co/YEG65gGLqG Lots of emerging (and sometimes contradictory) takes. I’ve got two perspectives sketched in the
Why and how does gradient/matrix orthogonalization work in Muon for training #LLMs? We introduce an isotropic curvature model to explain it. Take-aways: 1. Orthogonalization is a good idea, "on the right track". 2. But it might not be optimal. [1/n]
8
23
221
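The gradient/matrix orthogonalization discussed above can be sketched in a few lines. This is a minimal NumPy illustration, not Muon itself: it computes the exact orthogonal factor UVᵀ from an SVD (setting all singular values to 1), whereas Muon in practice approximates this step with a Newton–Schulz iteration on the momentum matrix.

```python
import numpy as np

def orthogonalize(g):
    """Map a matrix G with SVD U S V^T to U V^T, i.e. replace all
    singular values by 1. This is the "spectral sign" of G that
    Muon-style updates move along (here computed exactly via SVD)."""
    u, _, vt = np.linalg.svd(g, full_matrices=False)
    return u @ vt

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3))   # a stand-in "gradient" matrix
O = orthogonalize(G)
print(np.allclose(O.T @ O, np.eye(3)))  # columns are orthonormal
```

The exact-SVD version is only a sketch for clarity; at LLM scale an iterative polynomial approximation is used because full SVDs of weight-sized matrices are too expensive per step.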
Our ICML 2026 Policy for LLM use in Reviewing
Announcing the ICML 2026 policy for LLMs in reviewing! Reviewers and authors each pick either conservative or permissive LLM use, and will be matched accordingly. Importantly: authors of papers that choose the conservative option must follow the conservative policy when they review.
3
3
18
Doctoral student @HeWeiqing86254 presented his @NeurIPSConf 2025 research on using statistical tests to help detect AI-generated text. This paper was co-authored by Weiqing He, Xiang Li & Tianqi Shang, along with Profs. @lishenlc, @weijie444 & @DrQiLong. https://t.co/pVUqk8URpJ
0
3
7
Of course, citation analysis is tricky. There are many confounders: • Early arXiv visibility 🗓️ • Author fame 🌟 • "Hot" topics 🔥 We tried our best to control for these factors. However, given that top-ranked papers consistently receive ~2x the citations of lower-ranked
0
1
1
Main results of *How to Find Fantastic AI Papers: Self-Rankings as a Powerful Predictor of Scientific Impact Beyond Peer Review* are shown in Figure 2: We grouped papers by how authors privately ranked them. Analysis based on our ICML 2023 ranking experiment
0
1
2
Just landed in SD for #NeurIPS2025. With 5K accepted papers, how to find *Fantastic* AI papers? Solution: ask the authors to rank their own papers Results: Papers ranked #1 by authors received 2x more citations than those they ranked last Paper: https://t.co/oboZXfjdEC
5
8
64
A bit of tech details: We want to maximize the weighted sum (\sum_t \text{freq}(t)\times |t|). This leads to viewing tokenization as a **graph-partitioning problem**, where characters form a weighted graph and merges correspond to partitions that maximize this objective.
0
1
12
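The weighted-sum objective in the tweet above is simple to state in code. A minimal sketch, assuming `corpus_tokens` is the token sequence a candidate tokenization produces over a corpus (the function name and segmentation input are illustrative, not from the paper):

```python
from collections import Counter

def weighted_length_objective(corpus_tokens):
    """Compute sum over distinct tokens t of freq(t) * |t|,
    where freq(t) is how often t appears in the tokenized corpus.
    Note this sum equals the total number of characters covered,
    so for a fixed token budget it rewards longer tokens."""
    freq = Counter(corpus_tokens)
    return sum(f * len(t) for t, f in freq.items())

print(weighted_length_objective(["the", "cat", "the"]))  # 2*3 + 1*3 = 9
```

Viewed this way, choosing which character spans to merge into tokens is a partition of the character graph, and the objective scores each partition by the characters its tokens absorb.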
Here are the comparisons between the Length-MAX tokenizer and BPE:
2
1
9
A new tokenizer is introduced for LLMs: https://t.co/Zuerv1jsZ4 Idea: Instead of merging tokens by frequency (BPE), optimize the tokenizer directly for maximizing average token length, yielding longer, more efficient tokens. Results: 14–18% fewer tokens, faster training &
15
68
453
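To see how a length-oriented vocabulary is used at encode time, here is a hedged sketch of greedy longest-match segmentation: at each position, take the longest vocabulary token that matches, falling back to a single character. This is a generic illustration of longest-token encoding, not necessarily the exact algorithm of the tokenizer announced above.

```python
def longest_match_tokenize(text, vocab):
    """Greedy left-to-right segmentation: always take the longest
    vocab token matching at the current position; fall back to one
    character if nothing in the vocab matches."""
    out, i = [], 0
    max_len = max(map(len, vocab))
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in vocab or length == 1:
                out.append(piece)
                i += length
                break
    return out

print(longest_match_tokenize("thecat", {"the", "cat", "th", "t"}))
# ['the', 'cat']
```

With longer tokens in the vocabulary, the same text compresses into fewer pieces, which is exactly the "14–18% fewer tokens" effect the thread reports.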
Will present our recent work at NeurIPS with wonderful students and faculty @lihua_lei_stat on inference under data feedback loops https://t.co/f8ElEKSslE, characterizing the exact limit distribution of repeated training without non-asymptotic error compounding.
0
3
11
Heading to SD for #NeurIPS2025 from Dec 4 to 7. Happy to meet and chat.
4
4
54
Honored to follow in the footsteps of so many other great researchers at Penn that I admire.
Congratulations to Aaron Roth (@Aaroth), the Henry Salvatori Professor of Computer & Cognitive Science (@cis_penn), for receiving the 2025-26 George H. Heilmeier Faculty Award for Excellence in Research. Roth has been recognized for his fundamental contributions "to formalizing,
12
5
131
So a week ago, I was complaining gpt 5 doesn't write latex. Gemini 3 is much worse. Basically nothing renders
129
29
879
Two super talented student collaborators @yuhuang42 and @Zixin_Wen developed new theory for length-generalizable CoT reasoning!
Excited to share our recent work! We provide a mechanistic understanding of long CoT reasoning in state-tracking: when do transformers length-generalize strongly, when they stall, and how recursive self-training pushes the boundary. 🧵(1/8)
0
2
33
We're excited to announce the call for papers for #ICML 2026: https://t.co/RDT3zVZDYX See you in Seoul next summer!
🎉ICML 2026 Call for Papers (& Position Papers) has arrived!🎉 A few key changes this year: - Attendance for authors of accepted papers is optional - Originally submitted version of accepted papers will be made public - Cap on # of papers one can be reciprocal reviewer for ...
0
3
21
This suggests that one really should treat the gradient as a matrix for deep learning optimization, and Muon is effective. However, the 'ultimate' optimal method should not involve exact orthogonalization. [5/n]
1
0
5
Theorem 2: when the curvature has a kink ('takes off' suddenly), then matrix orthogonalization is OPTIMAL! This suggests Muon is optimal by assuming an extreme case of the curvature. Vice versa, a kink is necessary: if orthogonalization is optimal, then the curvature must take off.
0
0
8