Rohan Choudhury Profile
Rohan Choudhury

@rchoudhury997

Followers: 493
Following: 854
Media: 6
Statuses: 106

phd student at cmu https://t.co/pjU847PL2f

Pittsburgh, PA
Joined February 2022
@rchoudhury997
Rohan Choudhury
9 months
Excited to finally release our NeurIPS 2024 (spotlight) paper! We introduce Run-Length Tokenization (RLT), a simple way to significantly speed up your vision transformer on video with no loss in performance!
22
169
1K
@rchoudhury997
Rohan Choudhury
3 days
this is a really fun idea for improving language models without having to curate more data, from @lchen915!
@lchen915
Lili
4 days
Self-Questioning Language Models: LLMs that learn to generate their own questions and answers via asymmetric self-play RL. There is no external training data – the only input is a single prompt specifying the topic.
0
1
4
@rchoudhury997
Rohan Choudhury
24 days
tough timing.
@ziv_ravid
Ravid Shwartz Ziv
24 days
So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especially bad on it, even with best-of-n selection? Unbelievable!
0
0
3
@rchoudhury997
Rohan Choudhury
2 months
really interesting work at the intersection of vision and neuroscience!
@JacobYeung
Jacob Yeung
2 months
1/6 🚀 Excited to share that BrainNRDS has been accepted as an oral at #CVPR2025! We decode motion from fMRI activity and use it to generate realistic reconstructions of videos people watched, outperforming strong existing baselines like MindVideo and Stable Video Diffusion. 🧠🎥
0
0
1
@rchoudhury997
Rohan Choudhury
3 months
RT @lchen915: One fundamental issue with RL – whether it’s for robots or LLMs – is how hard it is to get rewards. For LLM reasoning, we nee….
0
27
0
@rchoudhury997
Rohan Choudhury
6 months
really excited to be helping with this! Multimodality is still early - we need better benchmarks and evaluation methods to understand failure modes as well as possible.
@LaszloJeni
Laszlo A Jeni (PhD)
6 months
Paper submission is open! If you're eager to push the boundaries of multimodal AI, then BEAM 2025 (co-located with #CVPR2025 in Nashville, TN) is the event for you! Co-organized by @amazon & @SCSatCMU. #AI #MultimodalAI #MachineLearning #CallForPapers
0
0
3
@rchoudhury997
Rohan Choudhury
6 months
fly eagles fly.
0
0
3
@rchoudhury997
Rohan Choudhury
8 months
RT @jinkuncao: Our NeurIPS work Omnigrasp's poster session will be this Friday! 11 a.m.–2 p.m. PST, East Exhibit Hall A-C #4108. Fe….
0
4
0
@rchoudhury997
Rohan Choudhury
9 months
😮‍💨😮‍💨😮‍💨😮‍💨.
@mendonca_rl
Russell Mendonca
9 months
These hands balance high controllability, reactivity and power without blowing up design size. Very exciting! In personal news, I recently completed my PhD at CMU and have joined Optimus AI! Would like to thank @pathak2206 for his guidance and mentorship.
0
0
0
@rchoudhury997
Rohan Choudhury
9 months
We’re excited to present this work at #neurips2024! This was joint work with @IanZhu123, @SihanL_, Koichiro Niinuma and my advisors @kkitani and @LaszloJeni. Paper: Project Page: Code:
github.com
Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers". - rccchoudhury/rlt
0
6
73
@rchoudhury997
Rohan Choudhury
9 months
RLT leads to a 30%+ speedup in training and in inference (yes, it works on pre-trained transformers!) We observe virtually no degradation in performance, suggesting that transformers can effectively operate on this compressed version of the input.
1
1
35
@rchoudhury997
Rohan Choudhury
9 months
Because RLT doesn’t require running a model to identify repeated patches, we can figure out the new, variable length of the video before running the transformer, allowing us to seamlessly handle large batches with block-diagonal attention masks.
1
1
24
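The packing trick the tweet above describes can be sketched with a boolean block-diagonal attention mask. This is a minimal illustration, not the paper's code: the function name and the packing convention (videos concatenated along one long token axis) are my assumptions.

```python
import numpy as np

def block_diagonal_mask(lengths):
    """Attention mask for variable-length token sequences packed into one
    long sequence: each token may attend only to tokens from its own video.

    lengths: per-video token counts after tokenization.
    Returns a (total, total) boolean mask, True where attention is allowed.
    """
    total = sum(lengths)
    mask = np.zeros((total, total), dtype=bool)
    start = 0
    for n in lengths:
        # allow attention within this video's contiguous block only
        mask[start:start + n, start:start + n] = True
        start += n
    return mask
```

Because the per-video lengths are known before the transformer runs, this mask can be built once per batch and passed to any attention implementation that accepts an additive or boolean mask.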
@rchoudhury997
Rohan Choudhury
9 months
This removes patches/tokens that are duplicated, which semantically corresponds to regions with little or no motion over time.
3
2
44
@rchoudhury997
Rohan Choudhury
9 months
Our key idea: we identify temporally repeated patches from the input and remove these duplicates before running the model. Like run-length encoding, we add a new positional encoding to tell the transformer how long each “run” is.
3
1
54
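The dedup step described in the tweet above can be sketched in a few lines. This is a rough illustration of the idea, not the released implementation: the function name, the exact-match tolerance, and the per-location loop are my assumptions (the paper's repo is linked elsewhere in the thread).

```python
import numpy as np

def run_length_tokenize(patches, tol=1e-6):
    """Drop temporally repeated patches, keeping one token per 'run'.

    patches: (T, N, D) array -- T frames, N patch locations, D-dim patches.
    Returns the kept patch vectors and each one's run length, which would
    feed an extra positional encoding telling the model how long it lasts.
    """
    T, N, _ = patches.shape
    kept, run_lengths = [], []
    for n in range(N):  # each spatial location is deduplicated over time
        t = 0
        while t < T:
            run = 1
            # extend the run while the next frame's patch is (nearly) identical
            while t + run < T and np.allclose(patches[t + run, n],
                                              patches[t, n], atol=tol):
                run += 1
            kept.append(patches[t, n])
            run_lengths.append(run)
            t += run
    return np.stack(kept), np.array(run_lengths)
```

On a fully static clip this collapses T×N tokens down to N, which is where the content-dependent speedup comes from.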
@rchoudhury997
Rohan Choudhury
9 months
Vision transformers split input videos into equal-sized patches - the number of tokens depends only on the number of frames and their resolution. But some videos are more complex than others - does lofi girl need the same number of tokens as Obito vs. Kakashi?
1
4
44
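The fixed token count the tweet above refers to is simple arithmetic. A small sketch, assuming a standard video ViT with 16×16 spatial patches and a temporal tubelet of 2 frames (typical defaults, not values from the thread):

```python
def num_tokens(frames, height, width, patch=16, tubelet=2):
    """Token count for a standard video ViT: it depends only on frame
    count and resolution, never on how much the video actually changes."""
    return (frames // tubelet) * (height // patch) * (width // patch)

# e.g. a 16-frame 224x224 clip always yields 8 * 14 * 14 = 1568 tokens,
# whether it is a static lofi loop or a fast-moving fight scene.
```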
@rchoudhury997
Rohan Choudhury
9 months
I've been using Windsurf for the past two weeks, and it's probably at least 2xed my programming speed. I wrote my entire personal website with Cascade and maybe manually edited like 2 lines. Can't recommend this tool enough!
@WindsurfCurrent
Windsurf Current
9 months
Today we’re excited to launch the Windsurf Editor - the first agentic IDE, and then some 🏄. In Windsurf, we have given the AI a previously unseen combination of deep codebase understanding, a powerful set of tools, and real-time awareness of your in-editor actions. The result? A
2
0
15
@rchoudhury997
Rohan Choudhury
10 months
more amazing work from @UksangYoo! As always, a fantastic demo and fun paper name :)
@UksangYoo
Uksang Yoo
10 months
Can robots make pottery🍵? Throwing a pot is a complex manipulation task of continuously deforming clay. We will present RoPotter, a robot system that uses structural priors to learn from demonstrations and make pottery @HumanoidsConf @CMU_Robotics.👇1/8🧵
0
1
3
@rchoudhury997
Rohan Choudhury
10 months
really amazing work from @mihdalal. I think zero-shot methods can potentially help scale robotics significantly better in the near future :)
@mihdalal
Murtaza Dalal
10 months
Can my robot cook my food, rearrange my dresser, tidy my messy table and do so much more without ANY demos or real-world training data? Introducing ManipGen: a generalist agent for manipulation that can solve long-horizon robotics tasks entirely zero-shot, from text input! 1/N
0
1
4
@rchoudhury997
Rohan Choudhury
11 months
RT @LaszloJeni: 🚀 At #ECCV2024? Explore zero-shot video QA on long videos at our ProViQ poster! We leverage procedural reasoning to master….
0
2
0