Saining Xie @sainingxie X Profile

Saining Xie

@sainingxie

Followers

21K

Following

5K

Media

57

Statuses

583

researcher in #deeplearning #computervision | assistant professor at @NYU_Courant @nyuniversity | previous: research scientist @metaai (FAIR) @UCSanDiego #YNWA

Joined July 2020

Don't wanna be here? Send us removal request.

Saining Xie

@sainingxie

6 days

yes.

vik

@vikhyatk

7 days

i mostly use my visual intelligence when trying to solve this. sota approaches to arc agi are mostly symbolic, vision doesn't really work well with today's models. ergo this is really because we haven't really solved visual reasoning AI.

1

2

29

Saining Xie

@sainingxie

7 days

The three biggest hps for stable training in everything are lr, bs, and beta2. We’ve built up good intuitions on how to tune them over time, but this lays it all out analytically and convincingly. this is definitely my new handbook for training big models on small gpus.

Micah Goldblum

@micahgoldblum

7 days

🚨 Did you know that small-batch vanilla SGD without momentum (i.e. the first optimizer you learn about in intro ML) is virtually as fast as AdamW for LLM pretraining on a per-FLOP basis? 📜 1/n

3

21

204

Saining Xie

@sainingxie

8 days

RT @_albertgu: I converted one of my favorite talks I've given over the past year into a blog post. "On the Tradeoffs of SSMs and Transfor….

0

113

0

Saining Xie

@sainingxie

10 days

RT @jonLorraine9: @sainingxie @joserf28323 @CVPR @ICCVConference @nyuniversity As the original poster of this strategy (sorry!), I agree th….

0

1

0

Saining Xie

@sainingxie

10 days

internet at its peak--just look at how people roasted him in quotes/comments 6 months ago. Again, I think this is very wrong, but can you really blame the students if the community was encouraging this idea, and then suddenly next day they’re being treated like the big villain?.

Ashwinee Panda

@PandaAshwinee

8 months

DO NOT DO THIS. I have previously raised this for Ethics Review when I saw it in a paper. You are not sneaky.

4

2

56

Saining Xie

@sainingxie

10 days

RT @AlexiGlad: How can we unlock generalized reasoning?. ⚡️Introducing Energy-Based Transformers (EBTs), an approach that out-scales (feed-….

0

244

0

Saining Xie

@sainingxie

10 days

Saining Xie

@sainingxie

10 days

@joserf28323 @CVPR @ICCVConference @nyuniversity Thanks for bringing this to my attention. I honestly wasn’t aware of the situation until the recent posts started going viral. I would never encourage my students to do anything like this—if I were serving as an Area Chair, any paper with this kind of prompt would be.

3

1

48

Saining Xie

@sainingxie

12 days

RT @sedielem: This looks like a great deep dive on neural network architectures for diffusion models. tl;dr use a Transformer, but there's….

0

10

0

Saining Xie

@sainingxie

17 days

RT @ManlingLi_: Can VLMs build Spatial Mental Models like humans?. Reasoning from limited views?.Reasoning from partial observations?.Reaso….

0

56

0

Saining Xie

@sainingxie

17 days

awesome work by @jiacheng_chen_ and @sanghyunwoo1219 on 3D-grounded visual compositing (and nice demos!).

Sanghyun Woo

@sanghyunwoo1219

17 days

Introducing BlenderFusion: Reassemble your visual elements—objects, camera, and background—to compose a new visual narrative. Play the interactive demo:

4

9

56

Saining Xie

@sainingxie

20 days

metaquery is now open-source — with both the data and code available.

Xichen Pan

@xichen_pan

20 days

The code and instruction-tuning data for MetaQuery are now open-sourced!.Code: Data: Two months ago, we released MetaQuery, a minimal training recipe for SOTA unified understanding and generation models. We showed that tuning few.

2

7

56

Saining Xie

@sainingxie

20 days

RT @karpathy: Do people *feel* how much work there is still to do. Like wow.

0

70

0

Saining Xie

@sainingxie

26 days

RT @tallinzen: I'm hiring at least one post-doc! We're interested in creating language models that process language more like humans than m….

0

52

0

Saining Xie

@sainingxie

27 days

guys, real geospatial data is a total goldmine for digital agents. step away from the web browser and get real. (we explored a bit in but building a simulation-ready pipeline like this could take things way further).

Chuang Gan

@gan_chuang

27 days

Virtual Community provides an online pipeline that automatically generates 3D scenes from real geospatial data, performing comprehensive cleaning and enhancement of both geometry and texture — including mesh simplification, texture refinement, object placement, and automatic

4

18

103

Saining Xie

@sainingxie

29 days

wait, speaking of false dichotomies---during your phd, you *can* write code, dive into data and systems, collaborate with a team, and build useful things---all while enjoying complete openness and the freedom to pursue what *genuinely* excites you.

Suvansh Sanjeev

@SuvanshSanjeev

30 days

i left my phd before joining openai. working in industry demands more rigor – you don’t just need to convince reviewer 2 with a nice graph and an ego-cite, it better actually work if it’s underwriting billions in research investment. not saying it always pans out that way in.

11

10

298

Saining Xie

@sainingxie

29 days

RT @mathusmassias: New paper on the generalization of Flow Matching 🤯 Why does flow matching generalize? Did you k….

0

232

0

Saining Xie

@sainingxie

29 days

RT @FeuerBenjamin: So excited to announce the DCVLR (Data Curation for Vision-Language Reasoning) competition at NeurIPS 2025, led by @Oumi….

0

11

0

Saining Xie

@sainingxie

30 days

RT @rohanpaul_ai: This is really BAD news of LLM's coding skill. ☹️. The best Frontier LLM models achieve 0% on hard real-life Programming….

0

317

0

Saining Xie

@sainingxie

30 days

RT @deedydas: LLMs are far worse at competitive programming than we thought. Every one scored 0% on Hard problems. LiveCodeBench-Pro is a….

0

218

0