Benno Krojer @benno_krojer X Profile

Benno Krojer

@benno_krojer

Followers

2K

Following

51K

Media

331

Statuses

6K

AI phding @Mila_Quebec @mcgillu (past: @AIatMeta). Interests: interpretability, language grounding (V+L), evals, reasoning. Vanier Scholar. 🥏⚽🥨

Montréal, Québec

Joined June 2014

Benno Krojer

@benno_krojer

22 days

Excited to share the results of my internship research with @AIatMeta, as part of a larger world modeling release! What subtle shortcuts are VideoLLMs taking on spatio-temporal questions? And how can we instead curate shortcut-robust examples at a large scale? Details 👇🔬

AI at Meta

@AIatMeta

24 days

Our vision is for AI that uses world models to adapt in new and dynamic environments and efficiently learn new skills. We’re sharing V-JEPA 2, a new world model with state-of-the-art performance in visual understanding and prediction. V-JEPA 2 is a 1.2 billion-parameter model,

3

22

59

Benno Krojer

@benno_krojer

19 hours

RT @cohere: Cohere is excited to announce our new office in Montreal, QC! We look forward to contributing to the local AI landscape, colla…

0

23

0

Benno Krojer

@benno_krojer

2 days

RT @lucasmaes_: I genuinely think @benno_krojer's work offers a much fairer and insightful way to assess the physics understanding of Vide…

0

1

0

Benno Krojer

@benno_krojer

4 days

Welcome to the lab, doctor!

Verna Dankers

@vernadankers

4 days

I miss Edinburgh and its wonderful people already!! Thanks to @tallinzen and @PontiEdoardo for inspiring discussions during the viva! I'm now exchanging Arthur's Seat for Mont Royal to join @sivareddyg's wonderful lab @Mila_Quebec 🤩.

0

2

Benno Krojer

@benno_krojer

9 days

RT @cesare_spinoso: A blizzard is raging in Montreal when your friend says “Wow, the weather is amazing!” Humans easily interpret irony, wh…

0

11

0

Benno Krojer

@benno_krojer

10 days

Also check out our previous two episodes! They didn't have a single guest, instead: 1) we introduce the podcast and how Tom and I got into research in Ep 00. 2) we interview several people at Mila just before the NeurIPS deadline about their submissions in Ep 01.

0

1

Benno Krojer

@benno_krojer

10 days

Started a new podcast with @tvergarabrowne! Behind the Research of AI: We look behind the scenes, beyond the polished papers 🧐🧪 If this sounds fun, check out our first "official" episode with the awesome @gauthier_gidel from @Mila_Quebec:

1

13

41

Benno Krojer

@benno_krojer

15 days

pretty plots sometimes

0

3

Benno Krojer

@benno_krojer

15 days

The video is online now! 3min speed science talk on "From a soup of raw pixels to abstract meaning".

Benno Krojer

@benno_krojer

30 days

Turns out condensing your research into 3min is very hard but also teaches you a lot.

0

6

39

Benno Krojer

@benno_krojer

22 days

Cool use of our AURORA work from last year to improve physical world models framed as image editing!

Yifu Qiu

@yifuqiu98

25 days

🔁 What if you could bootstrap a world model (state1 × action → state2) using a much easier-to-train dynamics model (state1 × state2 → action) in a generalist VLM? 💡 We show how a dynamics model can generate synthetic trajectories & serve for inference-time verification. 🧵👇

1

3

6

Benno Krojer

@benno_krojer

22 days

RT @xhluca: "Build the web for agents, not agents for the web". This position paper argues that rather than forcing web agents to adapt to…

0

54

0

Benno Krojer

@benno_krojer

22 days

A: I think there is no formal way to define it, it's up to us humans to say what a task is really about! In most cases, like this paper on video reasoning, the distinction is easy. But not always. Here is a cool older paper on this:

0

Benno Krojer

@benno_krojer

22 days

As a side note, during the project I thought a lot about what makes a solution that a model finds a *shortcut* vs just a "clever solution" that humans haven't thought of?

Benno Krojer

@benno_krojer

22 days

This is part of a larger effort at Meta to significantly improve physical world modeling, so check out the other works in this blog post!

0

2

Benno Krojer

@benno_krojer

22 days

Some reflections at the end: There's a lot of talk about math reasoning these days, but this project made me appreciate what simple reasoning we humans take for granted, arising in our first months and years of living. As usual I also included "Behind The Scenes" in the Appendix:

2

0

7

Benno Krojer

@benno_krojer

22 days

I am super grateful to my smart+kind collaborators at Meta who made this a very enjoyable project :) @mido_assran Nicolas Ballas @koustuvsinha @candacerossio @garridoq_ Mojtaba Komeili. The Montreal office in general is a very fun place 👇

1

0

3

Benno Krojer

@benno_krojer

22 days

The hardest tasks for current models are still intuitive physics tasks, where performance is often below random (in line with the previous literature). We encourage the community to use MVPBench to check if the latest VideoLLMs possess a *real* understanding of the physical world!

1

0

3

Benno Krojer

@benno_krojer

22 days

On the other hand, even the strongest SOTA models perform around random chance, with only 2-3 models significantly above random.

1

0

4

Benno Krojer

@benno_krojer

22 days

The questions in MVPBench are conceptually simple: relatively short videos with little linguistic or cultural knowledge needed. As a result, humans have no problem with these questions, e.g. it is known that even babies do well on various intuitive physics tasks.

1

0

3

Benno Krojer

@benno_krojer

22 days

By automating the pairing of highly similar video pairs and unifying different datasets, as well as filtering out examples that models can solve with a single frame, we end up with (probably) the largest and most diverse dataset of its kind:

1

0

3
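The single-frame filtering step mentioned in the post above can be illustrated with a minimal sketch. This is not the MVPBench implementation: the `Example` and `MinimalPair` types, the `filter_shortcut_pairs` function, and the rule "drop a pair if any single-frame baseline answers both of its examples correctly" are all assumptions made here for illustration, and the baselines are stand-in callables rather than real models.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Example:
    """One video question with its gold answer (hypothetical schema)."""
    video_id: str
    question: str
    answer: str


@dataclass(frozen=True)
class MinimalPair:
    """Two highly similar videos sharing a question but with different answers."""
    first: Example
    second: Example


# Hypothetical interface: a baseline that only sees one frame of the video
# and returns its predicted answer as a string.
SingleFramePredictor = Callable[[Example], str]


def solved_by_single_frame(
    pair: MinimalPair, predictors: Sequence[SingleFramePredictor]
) -> bool:
    """True if any baseline gets BOTH examples of the pair right (assumed rule)."""
    return any(
        all(predict(ex) == ex.answer for ex in (pair.first, pair.second))
        for predict in predictors
    )


def filter_shortcut_pairs(
    pairs: Sequence[MinimalPair], predictors: Sequence[SingleFramePredictor]
) -> List[MinimalPair]:
    """Keep only the pairs that no single-frame baseline can solve."""
    return [p for p in pairs if not solved_by_single_frame(p, predictors)]
```

Requiring both examples of a pair to be answered correctly means a baseline that always gives the same answer cannot get a pair removed, since the two answers in a pair differ.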

Benno Krojer

@benno_krojer

22 days

So as a solution we propose a 3-step curation framework that results in the Minimal Video Pairs benchmark (MVPBench)

1

0

3