
Kumar Ashutosh
@chargedneutron_
Followers: 133 · Following: 107 · Media: 4 · Statuses: 57
CS PhD student at @UTAustin | Visiting Researcher @AIatMeta | @iitbombay alum (2016-21).
Austin, TX
Joined January 2020
Introducing our #CVPR2024 (Highlight ✨) paper, "Detours for Navigating Instructional Videos." Imagine watching an instructional video and having a question like, "Can I add carrots here?" Our novel VidDetours model generates video detours from the user's query and their prior viewing context.
RT @geopavlakos: Ashutosh @chargedneutron_ is presenting ExpertAF during this poster session! We are at poster #280. Come by to chat about….
Check out our new work on enabling LLMs to see and hear _without_ any training. Paper: Code:
Super excited to share some recent work showing that pure text-only LLMs can see and hear without any training! Our approach, called "MILS", pairs LLMs with off-the-shelf multimodal models to caption images/videos/audio, improve image generation, do style transfer, and more!
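For context, here is a minimal sketch of the kind of training-free generate-and-score loop the tweet describes: the text-only LLM proposes candidate captions, an off-the-shelf multimodal model scores them against the image, and the top candidates are fed back into the next prompt. The helpers `generate_candidates` (an LLM call) and `score_against_image` (e.g., a CLIP-style image-text similarity) are hypothetical stand-ins, not the paper's actual API.

```python
# Illustrative sketch only: LLM-as-generator / multimodal-model-as-scorer loop,
# in the spirit of the MILS announcement above. Both helper callables are
# hypothetical stand-ins for an LLM and a CLIP-style scorer.

def mils_style_caption(image, generate_candidates, score_against_image,
                       rounds=5, keep_top=3, per_round=10):
    best = []  # (score, caption) pairs carried across rounds
    for _ in range(rounds):
        # Ask the text-only LLM for new captions, conditioned on the best ones so far.
        prompt = "Propose diverse one-sentence image captions."
        if best:
            prompt += "\nHigh-scoring captions so far:\n" + "\n".join(c for _, c in best)
        candidates = generate_candidates(prompt, n=per_round)

        # Score every candidate with the off-the-shelf multimodal model (no training).
        scored = [(score_against_image(image, c), c) for c in candidates] + best
        best = sorted(scored, key=lambda x: x[0], reverse=True)[:keep_top]

    return best[0][1]  # highest-scoring caption after the final round
```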
Work done at UT Austin and @AIatMeta. Collaborators: @sherryx90099597, @TusharNagarajan and Kristen Grauman.
We evaluate our method on several video datasets for step forecasting, step recognition, and task recognition, and see consistent gains over prior work. Project page: Please come by the poster session today (12/12, morning session) at poster #129.
Introducing our #NeurIPS23 paper on using video-mined task graphs for keystep recognition in instructional videos. 🎉 We first mine videos from the internet to learn a task graph, then use the learned graph to regularize keystep recognition.
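As a rough, hypothetical illustration of what "using the learned graph to regularize recognition" can look like (not necessarily the paper's exact formulation), one can combine per-segment keystep classifier scores with transition probabilities mined from the task graph via a Viterbi-style decode:

```python
import numpy as np

# Hypothetical sketch: Viterbi-style decoding that regularizes per-segment keystep
# probabilities with task-graph transition probabilities. The paper's actual
# regularization may differ; this only shows the general idea.

def graph_regularized_keysteps(frame_probs, transition, alpha=0.5):
    """frame_probs: (T, K) per-segment keystep probabilities from a classifier.
    transition: (K, K) keystep transition probabilities mined from the task graph.
    alpha: weight on the classifier vs. the graph prior.
    Returns the most likely keystep index sequence of length T."""
    T, K = frame_probs.shape
    log_emit = alpha * np.log(frame_probs + 1e-9)
    log_trans = (1 - alpha) * np.log(transition + 1e-9)

    score = np.zeros((T, K))
    back = np.zeros((T, K), dtype=int)
    score[0] = log_emit[0]
    for t in range(1, T):
        # For each current keystep, keep the best-scoring predecessor under the graph prior.
        cand = score[t - 1][:, None] + log_trans  # (K_prev, K_cur)
        back[t] = cand.argmax(axis=0)
        score[t] = cand.max(axis=0) + log_emit[t]

    # Backtrack the best path.
    path = [int(score[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```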
Exciting collaboration between 14 research institutions resulting in a first-of-its-kind dataset that pushes the frontiers of skill understanding. It covers cooking, sports, dance, and health, enabling research across multiple domains and tasks. Glad to be a part of this effort! 🎉
1️⃣ Ego-Exo4D. A new foundational dataset + benchmark suite to support research on video learning & multimodal perception, co-developed with 14 university partners. Details ➡️ Core to the work are videos of skilled human activities, simultaneously capturing…
Exciting to have both squash and cricket in the LA Olympics 🥳.
IOC Session approves @LA28’s proposal for 5⃣ additional sports: ⚾Baseball/🥎softball, 🏏cricket, 🏈flag football, 🥍lacrosse and ⚫squash have been officially included as additional sports on the programme for the Olympic Games Los Angeles 2028. #LA28
RT @lexfridman: Here's my conversation with Mark Zuckerberg, his 3rd time on the podcast, but this time we talked in the Metaverse as photo….
RT @lexfridman: Beautiful 60-day timelapse of a bird building a nest and raising her kids until they fly away from home. This is the magic….
Check out our highlight paper during the poster session on Thursday afternoon (Board: 235). Work done in collaboration with UT Austin and @MetaAI. Paper: Code:
📺 HierVL is a novel hierarchical video-language embedding that simultaneously accounts for both long-term and short-term associations. Paper ➡️ 7/7
Our highlight paper (top 2.5%) will be presented in the Thursday afternoon (6/22) poster session (THU-PM-235) and also at the @LSHVU workshop on Sunday (6/18) morning. Reach out to me if you are interested in this work!
Paper: Project Page: Code: Work with @_rohitgirdhar_, Lorenzo Torresani, and Kristen Grauman at UT Austin and @MetaAI.