Saksham Suri

@_sakshams_

Followers
771
Following
2K
Media
13
Statuses
130

Research Scientist @AiatMeta. Previously PhD @UMDCS, @MetaAI, @AmazonScience, @USCViterbi, @IIITDelhi, @IBMResearch. #computervision #deeplearning

California, USA
Joined January 2015
@_sakshams_
Saksham Suri
22 days
Efficient Track Anything is accepted at #ICCV2025!!
@fiandola
Forrest Iandola
8 months
[1/n] ๐—˜๐—ณ๐—ณ๐—ถ๐—ฐ๐—ถ๐—ฒ๐—ป๐˜ ๐—ง๐—ฟ๐—ฎ๐—ฐ๐—ธ ๐—”๐—ป๐˜†๐˜๐—ต๐—ถ๐—ป๐—ด from @Meta: interactive video segmentation and tracking on an iPhone!
0
0
8
@_sakshams_
Saksham Suri
3 months
Drop by our oral presentation and poster session to chat and learn about our video tokenizer with learned autoregressive prior. #ICLR2025
@hywang66
Hanyu Wang
3 months
I will be presenting LARP at ICLR today. 🎤 Oral: 11:18 AM – 11:30 AM (UTC+8), Oral Session 3C. 🖼️ Poster: 3:00 PM – 5:30 PM (UTC+8), Hall 3 + Hall 2B, Poster #162. You're very welcome to drop by for discussion and feedback!
0
1
4
@_sakshams_
Saksham Suri
3 months
RT @AIatMeta: Today is the start of a new era of natively multimodal AI innovation. Today, we're introducing the first Llama 4 models: Lla…
0
2K
0
@_sakshams_
Saksham Suri
6 months
📢 Excited to announce LARP has been accepted to #ICLR2025! 🇸🇬 Code and models are publicly available. Project page:
@hywang66
Hanyu Wang
9 months
🚀 Introducing LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior! 🌟 📄 Paper: 📜 Project page: 🔗 Code: Collaborators: @_sakshams_, Yixuan Ren, @HaoChen_UMD, @abhi2610. #GenAI
Tweet media one
1
2
34
@_sakshams_
Saksham Suri
8 months
RT @fiandola: [1/n] Efficient Track Anything from @Meta: interactive video segmentation and tracking on an iPhone!
0
110
0
@_sakshams_
Saksham Suri
8 months
Check out Efficient Track Anything from our team: >2x faster than SAM2 on A100 and >10 FPS on iPhone 15 Pro Max. Paper: demo:
@YoungXiong1
Yunyang Xiong
8 months
🚀Excited to share our Efficient Track Anything. It is small but mighty, >2x faster than SAM2 on A100 and runs >10 FPS on iPhone 15 Pro Max. How'd we do it? EfficientSAM + Efficient Memory Attention! Paper: Project (demo): with:
Tweet media one
0
0
10
@_sakshams_
Saksham Suri
9 months
Check out LARP, our work on creating a video tokenizer that is trained with an autoregressive generative prior. Code and models are open-sourced!
@hywang66
Hanyu Wang
9 months
🚀 Introducing LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior! 🌟 📄 Paper: 📜 Project page: 🔗 Code: Collaborators: @_sakshams_, Yixuan Ren, @HaoChen_UMD, @abhi2610. #GenAI
Tweet media one
0
1
9
@_sakshams_
Saksham Suri
9 months
We are happy to release our LiFT code and pretrained models! 📢 Code: Project Page: Here are some super spooky super-resolved feature visualizations to make the season scarier 🎃. Coauthors: @MatthewWalmer @kamalgupta09 @abhi2610
Tweet media one
@_sakshams_
Saksham Suri
10 months
We introduce LiFT, an easy-to-train, lightweight, and efficient feature upsampler to get dense ViT features without the need to retrain the ViT. Visit our poster @eccvconf #eccv2024 in Milan on Oct 1st (Tuesday), 16:30 (local), Poster: 79. Project Page:
Tweet media one
2
46
243
@_sakshams_
Saksham Suri
9 months
RT @YoungXiong1: 🚨VideoLLM from Meta!🚨 LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding. 📝Paper: https://t…
0
73
0
@_sakshams_
Saksham Suri
10 months
Work done with @MatthewWalmer, @kamalgupta09 and @abhi2610.
0
0
3
@_sakshams_
Saksham Suri
10 months
We see consistent gains across multiple tasks that require dense features. Applied 4x, it can generate crisp image-resolution features.
1
0
3
@_sakshams_
Saksham Suri
10 months
LiFT is trained to generate 2x higher-resolution features. It also uses the low-resolution image to guide the upsampling. Further, due to its modular design, it can be reapplied to its own outputs.
2
0
6
@_sakshams_
Saksham Suri
10 months
LiFT is a small module consisting of convolution and deconvolution layers and is trained with a self-supervised reconstruction loss over the features.
1
0
9
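The tweets above describe LiFT as a small conv/deconv module that upsamples ViT features 2x, guided by the low-resolution image. Here is a minimal PyTorch sketch of that idea; the class name, layer sizes, and fusion scheme are all hypothetical illustrations, not the paper's actual architecture (see the project page for that):

```python
# Hypothetical sketch of a LiFT-style feature upsampler.
# Layer counts/widths are assumptions, not the published architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LiFTSketch(nn.Module):
    """Small conv + deconv module: takes low-res ViT features and the
    input image, returns features at 2x spatial resolution."""
    def __init__(self, feat_dim=384, img_ch=3, hidden=64):
        super().__init__()
        self.img_enc = nn.Conv2d(img_ch, hidden, 3, stride=2, padding=1)
        self.fuse = nn.Conv2d(feat_dim + hidden, feat_dim, 3, padding=1)
        # kernel 4, stride 2, padding 1 exactly doubles spatial size
        self.up = nn.ConvTranspose2d(feat_dim, feat_dim, 4, stride=2, padding=1)

    def forward(self, feats, img):
        # Encode the image, resize its encoding to the feature grid,
        # fuse with the ViT features, then deconv to 2x resolution.
        g = F.interpolate(self.img_enc(img), size=feats.shape[-2:],
                          mode="bilinear", align_corners=False)
        x = self.fuse(torch.cat([feats, g], dim=1))
        return self.up(x)

model = LiFTSketch()
feats = torch.randn(1, 384, 14, 14)   # e.g. ViT-S/16 features for a 224px image
img = torch.randn(1, 3, 224, 224)     # the image used as upsampling guidance
up = model(feats, img)
print(tuple(up.shape))  # (1, 384, 28, 28): 2x spatial resolution
```

Because output and input have the same channel dimension, the module can be reapplied to its own outputs (as the thread notes) to reach 4x resolution; training would pair it with a self-supervised reconstruction loss over the features rather than new labels.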
@_sakshams_
Saksham Suri
10 months
We introduce LiFT, an easy-to-train, lightweight, and efficient feature upsampler to get dense ViT features without the need to retrain the ViT. Visit our poster @eccvconf #eccv2024 in Milan on Oct 1st (Tuesday), 16:30 (local), Poster: 79. Project Page:
Tweet media one
6
149
949
@_sakshams_
Saksham Suri
10 months
Excited to announce that I have joined @AIatMeta as a Research Scientist, where I will be working on model optimization. I will also be at ECCV to present my work and am excited to meet and learn from everyone. Reach out if you are attending and would like to chat. Ciao 🇮🇹
17
6
211
@_sakshams_
Saksham Suri
1 year
That's a wrap! Happy to share that I have defended my thesis. Thankful for the insightful questions and feedback from my committee members @abhi2610, @zhoutianyi, @davwiljac, Prof. Espy-Wilson, and Prof. Andrew Zisserman.
Tweet media one
10
0
82
@_sakshams_
Saksham Suri
1 year
RT @deedydas: Thank you King Kohli 👑 for all the memories. End of an era.
0
50
0
@_sakshams_
Saksham Suri
1 year
RT @abhi2610: Call for Papers: #INRV2024 Workshop on Implicit Neural Representation for Vision @ #CVPR2024! Topics: Compression, Representa…
0
19
0
@_sakshams_
Saksham Suri
1 year
RT @AnthropicAI: Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Cla…
0
2K
0
@_sakshams_
Saksham Suri
1 year
RT @StabilityAI: Announcing Stable Diffusion 3, our most capable text-to-image model, utilizing a diffusion transformer architecture for gr…
0
1K
0