
mok
@mokshith_v
Followers
1K
Following
2K
Media
135
Statuses
855
ai x video @sievedata
San Francisco, CA
Joined April 2018
the multi-angle generation is particularly cool to see. more spatial understanding, more world model 🙏
Introducing Runway Aleph, a new way to edit, transform and generate video. Aleph is a state-of-the-art in-context video model, setting a new frontier for multi-task visual generation, with the ability to perform a wide range of edits on an input video such as adding, removing
0
0
5
it has recently come to my attention that @signalapp is the superior messaging platform, and it's not even close.
1
0
2
Video as a foundation for AGI. I really like one of the ideas Demis explores in this conversation: Veo 3 as a world simulator directly challenging the need for physical embodiment on our path to true world understanding. More specifically it's interesting that scaling video.
Here's my conversation with @demishassabis, CEO of Google DeepMind, all about the future of AI & AGI, simulating biology & physics, video games, programming, video generation, world models, Gemini 3, scaling laws, compute, P vs NP, complexity, energy (solar & fusion), and much
0
1
16
so tl;dr - internet video is still useful but too many researchers are treating it as a complete replacement for the real thing and will soon learn a very bitter lesson because of it 🫡
I wrote a fun little article about all the ways to dodge the need for real-world robot data. I think it has a cute title.
0
0
6
RT @thpicy: If you're a cracked infra engineer who: - loves Go, ETL, OLAP DBs, k8s - wants to change the future of video AI - loves seeing…
0
6
0
The last two months have been insane. I randomly came into the office this morning and decided to record this video on why there has literally never been a more exciting time to join @sievedata. We're working with leading research teams pushing the frontier of creative, robotics,
3
7
79
RT @avinashj_: Just shipped Orbit. - Overlay running route & stats on photos - Track your PRs across any timeframe - Browse your run histo…
0
5
0
more evals for hard-to-evaluate things 💜
We benchmarked 14 of the top AI dubbing tools, and the results shocked us. Even some of the popular names failed at preserving speaker identity or handling multi-speaker videos. For context, the evals were conducted by third-party native speakers across 8 languages.
0
0
4
i'm excited to finally share this with the world. foundation model improvements continue to allow our research team to make improvements up the stack and see material differences in output dubs. later this week we'll be sharing some updated, de-anonymized evals as well 🌎💜
Introducing Dubbing 3.0 - the highest quality AI video translator. - Handles multi-speaker video better than any provider - Expresses emotions better (e.g. calm vs frustrated) - More natural, context-aware translations - Supports 30+ languages and accents
0
0
12
RT @sievedata: Multi-modal models like GPT-4o are starting to outperform traditional models on core video/audio understanding tasks. One ex…
0
1
0
i am so excited for the role datasets like these will play in the future of robotics and gaming. unfortunate that this dataset in particular is only 720p, and not strictly curated towards more "interesting" scenarios, but still great work by the authors!
Sekai: A Video Dataset towards World Exploration. A high-quality 5k hrs of egocentric worldwide video + audio dataset for world exploration, created from Youtube with high-quality annotations
0
0
6