Kumar Ashutosh Profile
Kumar Ashutosh

@chargedneutron_

Followers
133
Following
107
Media
4
Statuses
57

CS PhD student at @UTAustin | Visiting Researcher @AIatMeta | @iitbombay alum (2016-21).

Austin, TX
Joined January 2020
@chargedneutron_
Kumar Ashutosh
1 year
Introducing our #CVPR2024 (Highlight ✨) paper, "Detours for Navigating Instructional Videos." Imagine watching an instructional video and having questions like, "Can I add carrots here?" Our novel VidDetours method enables video detours that use user queries and prior viewing context.
1
1
5
@chargedneutron_
Kumar Ashutosh
20 days
RT @geopavlakos: Ashutosh @chargedneutron_ is presenting ExpertAF during this poster session! We are at poster #280. Come by to chat about….
0
1
0
@chargedneutron_
Kumar Ashutosh
5 months
Check out our new work on enabling LLMs to see and hear _without_ any training. Paper: Code:
@_rohitgirdhar_
Rohit Girdhar
5 months
Super excited to share some recent work that shows that pure, text-only LLMs, can see and hear without any training! Our approach, called "MILS", uses LLMs with off-the-shelf multimodal models, to caption images/videos/audio, improve image generation, style transfer, and more!
0
0
5
@chargedneutron_
Kumar Ashutosh
1 year
Check out VidOSC, to be presented at CVPR 2024 🎉.
0
0
2
@chargedneutron_
Kumar Ashutosh
1 year
Work done at UT Austin and @AIatMeta. Collaborators: @sherryx90099597, @TusharNagarajan and Kristen Grauman.
0
0
1
@chargedneutron_
Kumar Ashutosh
1 year
We have created a weakly-supervised dataset and a manually annotated test dataset for this new task. Paper: Project page: Code:
1
0
1
@chargedneutron_
Kumar Ashutosh
1 year
This task is challenging because it requires jointly reasoning about prior user viewing context and user queries. Additionally, the model needs to reason across minutes-long videos. Our novel method addresses these technical challenges.
1
0
1
@chargedneutron_
Kumar Ashutosh
1 year
This task is crucial for skill-learning, as learners often face the dilemma of guessing and continuing—risking breaking the recipe—or searching for a new video, which may not differ significantly from the current one.
1
0
1
@chargedneutron_
Kumar Ashutosh
1 year
🚀🚀
@CVPR
#CVPR2025
1 year
HUGE shoutout to our #CVPR2024 Outstanding Reviewers 🫡
0
0
4
@chargedneutron_
Kumar Ashutosh
2 years
We evaluate our method on several video datasets on step forecasting, step recognition, and task recognition, and see consistent gains over prior work. Project page: Please come by the poster session today (12/12 morning session) at board #129.
0
0
2
@chargedneutron_
Kumar Ashutosh
2 years
Introducing our #NeurIPS23 paper on using video-mined task graphs for keystep recognition in instructional videos 🎉. We first use videos on the internet to learn a task graph and then use the learned graph to regularize keystep recognition in instructional videos.
1
0
7
@chargedneutron_
Kumar Ashutosh
2 years
Webpage: Paper: Blog:
0
0
0
@chargedneutron_
Kumar Ashutosh
2 years
Exciting collaboration between 14 research institutions resulting in a first-of-its-kind dataset pushing the frontiers in skill understanding. Covers cooking, sports, dance, and health, thus enabling research in multiple domains and tasks. Glad to be a part of this effort! 🎉
@AIatMeta
AI at Meta
2 years
1️⃣ Ego-Exo4D. A new foundational dataset + benchmark suite to support research on video learning & multimodal perception, co-developed with 14 university partners. Details ➡️ Core to the work is videos of skilled human activities, simultaneously capturing
1
0
4
@chargedneutron_
Kumar Ashutosh
2 years
Exciting to have both squash and cricket in the LA Olympics 🥳.
@Olympics
The Olympic Games
2 years
IOC Session approves @LA28's proposal for 5⃣ additional sports: ⚾Baseball/🥎softball, 🏏cricket, 🏈flag football, 🥍lacrosse and ⚫squash have been officially included as additional sports on the programme for the Olympic Games Los Angeles 2028. #LA28
0
0
2
@chargedneutron_
Kumar Ashutosh
2 years
RT @lexfridman: Here's my conversation with Mark Zuckerberg, his 3rd time on the podcast, but this time we talked in the Metaverse as photo….
0
8K
0
@chargedneutron_
Kumar Ashutosh
2 years
RT @lexfridman: Beautiful 60-day timelapse of a bird building a nest and raising her kids until they fly away from home. This is the magic….
0
3K
0
@chargedneutron_
Kumar Ashutosh
2 years
Check out our highlight paper during the poster session on Thursday afternoon (Board: 235). Work done in collaboration with UT Austin and @MetaAI. Paper: Code:
@AIatMeta
AI at Meta
2 years
📺 HierVL is a novel hierarchical video-language embedding that simultaneously accounts for both long-term and short-term associations. Paper ➡️ 7/7
0
0
6
@chargedneutron_
Kumar Ashutosh
2 years
Our highlight paper (top 2.5%) will be presented on Thursday afternoon (6/22) poster session (THU-PM-235) and also in the @LSHVU workshop on Sunday (6/18) morning. Reach out to me if you are interested in this work!
0
0
1
@chargedneutron_
Kumar Ashutosh
2 years
Paper: Project Page: Code: Work with @_rohitgirdhar_, Lorenzo Torresani, Kristen Grauman at UT Austin and @MetaAI.
1
0
2
@chargedneutron_
Kumar Ashutosh
2 years
We utilize the learned representations for various downstream tasks, including Ego4D Long Term Anticipation, Charades-Ego Action Classification, EPIC-KITCHENS-100 Multi-Instance Retrieval, and HowTo100M Long Video Classification, achieving state-of-the-art results.
1
0
0