
Kumara Kahatapitiya
@kkahatapitiy
Followers: 572 · Following: 13K · Media: 20 · Statuses: 985
Research Scientist @AIatMeta | PhD @SBUcompsc
Stony Brook, NY
Joined November 2012
Will be presented at #ICCV2025 💫.
Introducing AdaCache, a training-free inference acceleration method for video DiTs. It allocates compute tailored to each video generation, maximizing the quality-latency trade-off. project-page: code: arxiv:
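The idea of allocating compute per generation can be sketched as caching a transformer block's output and recomputing it only when a cheap probe shows the output is still changing. This is a minimal toy sketch, not the actual AdaCache implementation: the block dynamics, probe size, and change metric are all illustrative assumptions.

```python
import numpy as np

def dit_block(x, step):
    # Stand-in for an expensive DiT transformer block; its output changes
    # less and less as denoising progresses (purely illustrative dynamics).
    return np.tanh(x) * (1.0 + 0.5 ** step)

def run_with_adaptive_cache(x, num_steps=50, tol=1e-3):
    """Recompute the block only when a cheap probe shows its output changed."""
    cache = None
    recomputed = 0
    for step in range(num_steps):
        probe = dit_block(x[:4], step)  # cheap probe on a few tokens
        if cache is None or np.abs(probe - cache[:4]).max() > tol:
            cache = dit_block(x, step)  # content still changing: full compute
            recomputed += 1
        # else: this step reuses `cache` as-is, skipping the full block
    return cache, recomputed

out, n = run_with_adaptive_cache(np.linspace(-1, 1, 64))
print(f"full evaluations: {n} of 50 steps")
```

Because the change metric is measured on the actual generation being run, easy (slowly changing) videos trigger fewer full evaluations than hard ones, which is the sense in which compute is tailored per generation.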
RT @ycombinator: Andrej Karpathy's (@karpathy) keynote yesterday at AI Startup School in San Francisco.
RT @XiangLi54505720: (1/5) Excited to present our #ICLR2025 paper, LLaRA, at NYC CV Day! LLaRA efficiently transforms a pretrained Vision-L…
RT @kahnchana: Two papers accepted @iclr_conf. (1) MVU for Long Video QnA: (2) LLaRA: Large Language and Robotic…
RT @ryoo_michael: I am extremely pleased to announce that CoRL 2025 will be in Seoul, Korea! The organizing team includes myself and @gupta…
RT @_akhaliq: Meta presents Adaptive Caching for Faster Video Generation with Diffusion Transformers
This is a collaboration at Meta GenAI. I thank my co-authors @HaoZhe65347, Sen He, Ding Liu, @menglin_jia, Chenyang Zhang, @ryoo_michael, and Tian Xie.
RT @HaoZhe65347: Introducing MarDini, a model built from scratch to combine the strengths of diffusion and masked auto-regressive approache…
RT @liu_shikun: Introducing MarDini 🍸 -- our latest exploration in video diffusion models from @AIatMeta! MarDini brings an asymmetric des…
RT @_akhaliq: Meta presents MarDini. Masked Autoregressive Diffusion for Video Generation at Scale
RT @twelve_labs: The webinar recording of this session with @jongwoopark7978, @kkahatapitiy, and @kahnchana is up! Watch here: https://t.c…
RT @twelve_labs: In the 55th session of #MultimodalWeekly, we have three Ph.D candidates from @stonybrooku working on long-form video under…
RT @jongwoopark7978: 🚀 Check out our new arXiv release! We've demonstrated the effectiveness of the Hierarchical Keyframe Selector for ver…
RT @arankomatsuzaki: AI2 presents Theia: Distilling Diverse Vision Foundation Models for Robot Learning. Outperforms its teacher models and…
OCD will appear at #ECCV2024 🥳. More info here:
qualcomm-ai-research.github.io
Qualcomm AI Research
Check out our recent work Object-Centric Diffusion, which makes video editing pipelines much faster, in zero-shot! TL;DR: Without sacrificing quality, we 1) reduce sampling steps for the background, and 2) process fewer tokens, with the majority coming from the foreground. More details soon.
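The second point (fewer tokens, mostly foreground) can be illustrated with a toy token-reduction step: keep every foreground patch token and subsample the background. The function name, mask layout, and keep-ratio below are hypothetical, not the paper's actual pipeline.

```python
import numpy as np

def reduce_tokens(tokens, fg_mask, bg_keep=0.25, seed=0):
    """Keep all foreground tokens; keep only a fraction of background tokens.

    tokens:  (N, D) array of patch tokens (hypothetical layout)
    fg_mask: (N,) boolean, True where the patch overlaps the edited object
    """
    rng = np.random.default_rng(seed)
    fg_idx = np.flatnonzero(fg_mask)           # always kept
    bg_idx = np.flatnonzero(~fg_mask)          # subsampled
    n_bg = max(1, int(bg_keep * bg_idx.size))
    kept_bg = rng.choice(bg_idx, size=n_bg, replace=False)
    keep = np.sort(np.concatenate([fg_idx, kept_bg]))
    return tokens[keep], keep

tokens = np.random.default_rng(1).normal(size=(256, 8))
fg = np.zeros(256, dtype=bool)
fg[:64] = True                 # 64 of 256 patches overlap the edited object
reduced, keep = reduce_tokens(tokens, fg)
print(reduced.shape)           # far fewer tokens, majority from the foreground
```

With 64 foreground patches and a quarter of the 192 background patches kept, the block processes 112 tokens instead of 256, and most of them come from the foreground, mirroring the trade-off the tweet describes.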
Introducing LLaRA ✨. A complete recipe for converting a VLM into a robot policy: from data curation and finetuning to real-robot execution, all open-sourced NOW! Our experiments show the benefits of auxiliary data (e.g., spatial/temporal reasoning) on learning policy. Have fun!
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy. SotA performance on robot manipulation tasks, outperforming RT-2-like approaches. repo: abs:
RT @XiangLi54505720: 🚀 Excited to share our latest project: LLaRA - Supercharging Robot Learning Data for Vision-Language Policy! 🤖✨ We cr…