Rohan Choudhury Profile
Rohan Choudhury

@rchoudhury997

Followers: 493
Following: 854
Media: 6
Statuses: 106

phd student at cmu https://t.co/pjU847PL2f

Pittsburgh, PA
Joined February 2022
@rchoudhury997
Rohan Choudhury
9 months
Excited to finally release our NeurIPS 2024 (spotlight) paper! We introduce Run-Length Tokenization (RLT), a simple way to significantly speed up your vision transformer on video with no loss in performance!
22
169
1K
@rchoudhury997
Rohan Choudhury
3 days
this is a really fun idea for improving language models without having to curate more data, from @lchen915!
@lchen915
Lili
4 days
Self-Questioning Language Models: LLMs that learn to generate their own questions and answers via asymmetric self-play RL. There is no external training data – the only input is a single prompt specifying the topic.
0
1
4
@rchoudhury997
Rohan Choudhury
24 days
tough timing.
@ziv_ravid
Ravid Shwartz Ziv
24 days
So, all the models underperform humans on the new International Mathematical Olympiad questions, and Grok-4 is especially bad on it, even with best-of-n selection? Unbelievable!
0
0
3
@rchoudhury997
Rohan Choudhury
2 months
really interesting work at the intersection of vision and neuroscience!
@JacobYeung
Jacob Yeung
2 months
1/6 🚀 Excited to share that BrainNRDS has been accepted as an oral at #CVPR2025! We decode motion from fMRI activity and use it to generate realistic reconstructions of videos people watched, outperforming strong existing baselines like MindVideo and Stable Video Diffusion. 🧠🎥
0
0
1
@rchoudhury997
Rohan Choudhury
3 months
RT @lchen915: One fundamental issue with RL – whether it’s for robots or LLMs – is how hard it is to get rewards. For LLM reasoning, we nee….
0
27
0
@rchoudhury997
Rohan Choudhury
6 months
really excited to be helping with this! Multimodality is still early - we need better benchmarks and evaluation methods to understand failure modes as well as possible.
@LaszloJeni
Laszlo A Jeni (PhD)
6 months
Paper submission is open! If you're eager to push the boundaries of multimodal AI, then BEAM 2025 (co-located with #CVPR2025 in Nashville, TN) is the event for you! Co-organized by @amazon & @SCSatCMU. #AI #MultimodalAI #MachineLearning #CallForPapers
0
0
3
@rchoudhury997
Rohan Choudhury
6 months
fly eagles fly.
0
0
3
@rchoudhury997
Rohan Choudhury
8 months
RT @jinkuncao: Our NeurIPS work Omnigrasp's poster session will be this Friday! 11 a.m.–2 p.m. PST, East Exhibit Hall A-C #4108. Fe….
0
4
0
@rchoudhury997
Rohan Choudhury
9 months
😮‍💨😮‍💨😮‍💨😮‍💨.
@mendonca_rl
Russell Mendonca
9 months
These hands balance high controllability, reactivity and power without blowing up design size. Very exciting! In personal news, I recently completed my PhD at CMU and have joined Optimus AI! Would like to thank @pathak2206 for his guidance and mentorship.
0
0
0
@rchoudhury997
Rohan Choudhury
9 months
We’re excited to present this work at #neurips2024! This was joint work with @IanZhu123, @SihanL_, Koichiro Niinuma and my advisors @kkitani and @LaszloJeni. Paper: Project Page: Code:
github.com
Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers". - rccchoudhury/rlt
0
6
73
@rchoudhury997
Rohan Choudhury
9 months
RLT leads to a 30%+ speedup in training and in inference (yes, it works on pre-trained transformers!) We observe virtually no degradation in performance, suggesting that transformers can effectively operate on this compressed version of the input.
1
1
35
@rchoudhury997
Rohan Choudhury
9 months
Because RLT doesn’t require running a model to identify repeated patches, we can figure out the new, variable length of the video before running the transformer, allowing us to seamlessly handle large batches with block-diagonal attention masks.
1
1
24
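The packing trick the tweet above describes can be sketched with a boolean block-diagonal attention mask. This is a minimal illustration, not the paper's code: the function name and the packing convention (videos concatenated along one long token axis) are my assumptions.

```python
import numpy as np

def block_diagonal_mask(lengths):
    """Attention mask for variable-length token sequences packed into one
    long sequence: each token may attend only to tokens from its own video.

    lengths: per-video token counts after tokenization.
    Returns a (total, total) boolean mask, True where attention is allowed.
    """
    total = sum(lengths)
    mask = np.zeros((total, total), dtype=bool)
    start = 0
    for n in lengths:
        # allow attention within this video's contiguous block only
        mask[start:start + n, start:start + n] = True
        start += n
    return mask
```

Because the per-video lengths are known before the transformer runs, this mask can be built once per batch and passed to any attention implementation that accepts an additive or boolean mask.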
@rchoudhury997
Rohan Choudhury
9 months
This removes patches/tokens that are duplicated, which semantically corresponds to regions with little or no motion over time.
3
2
44
@rchoudhury997
Rohan Choudhury
9 months
Our key idea: we identify temporally repeated patches from the input and remove these duplicates before running the model. Like run-length encoding, we add a new positional encoding to tell the transformer how long each “run” is.
3
1
54
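The dedup step described in the tweet above can be sketched in a few lines. This is a rough illustration of the idea, not the released implementation: the function name, the exact-match tolerance, and the per-location loop are my assumptions (the paper's repo is linked elsewhere in the thread).

```python
import numpy as np

def run_length_tokenize(patches, tol=1e-6):
    """Drop temporally repeated patches, keeping one token per 'run'.

    patches: (T, N, D) array -- T frames, N patch locations, D-dim patches.
    Returns the kept patch vectors and each one's run length, which would
    feed an extra positional encoding telling the model how long it lasts.
    """
    T, N, _ = patches.shape
    kept, run_lengths = [], []
    for n in range(N):  # each spatial location is deduplicated over time
        t = 0
        while t < T:
            run = 1
            # extend the run while the next frame's patch is (nearly) identical
            while t + run < T and np.allclose(patches[t + run, n],
                                              patches[t, n], atol=tol):
                run += 1
            kept.append(patches[t, n])
            run_lengths.append(run)
            t += run
    return np.stack(kept), np.array(run_lengths)
```

On a fully static clip this collapses T×N tokens down to N, which is where the content-dependent speedup comes from.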
@rchoudhury997
Rohan Choudhury
9 months
Vision transformers split input videos into equal-sized patches - the number of tokens depends only on the number of frames and their resolution. But some videos are more complex than others - does lofi girl need the same number of tokens as Obito vs. Kakashi?
1
4
44
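The fixed token count the tweet above refers to is simple arithmetic. A small sketch, assuming a standard video ViT with 16×16 spatial patches and a temporal tubelet of 2 frames (typical defaults, not values from the thread):

```python
def num_tokens(frames, height, width, patch=16, tubelet=2):
    """Token count for a standard video ViT: it depends only on frame
    count and resolution, never on how much the video actually changes."""
    return (frames // tubelet) * (height // patch) * (width // patch)

# e.g. a 16-frame 224x224 clip always yields 8 * 14 * 14 = 1568 tokens,
# whether it is a static lofi loop or a fast-moving fight scene.
```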
@rchoudhury997
Rohan Choudhury
9 months
I've been using Windsurf for the past two weeks, and it's probably at least 2xed my programming speed. I wrote my entire personal website with Cascade and maybe manually edited like 2 lines. Can't recommend this tool enough!
@WindsurfCurrent
Windsurf Current
9 months
Today we’re excited to launch the Windsurf Editor - the first agentic IDE, and then some 🏄. In Windsurf, we have given the AI a previously unseen combination of deep codebase understanding, a powerful set of tools, and real-time awareness of your in-editor actions. The result? A
2
0
15
@rchoudhury997
Rohan Choudhury
10 months
more amazing work from @UksangYoo! As always, a fantastic demo and fun paper name :)
@UksangYoo
Uksang Yoo
10 months
Can robots make pottery🍵? Throwing a pot is a complex manipulation task of continuously deforming clay. We will present RoPotter, a robot system that uses structural priors to learn from demonstrations and make pottery @HumanoidsConf @CMU_Robotics.👇1/8🧵
0
1
3
@rchoudhury997
Rohan Choudhury
10 months
really amazing work from @mihdalal. I think zero-shot methods can potentially help scale robotics significantly better in the near future :)
@mihdalal
Murtaza Dalal
10 months
Can my robot cook my food, rearrange my dresser, tidy my messy table and do so much more without ANY demos or real-world training data? Introducing ManipGen: a generalist agent for manipulation that can solve long-horizon robotics tasks entirely zero-shot, from text input! 1/N
0
1
4
@rchoudhury997
Rohan Choudhury
11 months
RT @LaszloJeni: 🚀 At #ECCV2024? Explore zero-shot video QA on long videos at our ProViQ poster! We leverage procedural reasoning to master….
0
2
0