Gabriele Berton
@gabriberton
Followers: 7K · Following: 6K · Media: 380 · Statuses: 2K
Postdoc @Amazon working on VLM - ex @CarnegieMellon @PoliTOnews @IITalk
Joined December 2021
This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of adding losses and then computing backward, it's better to compute the backward on each loss (which frees the computational graph). Results will be exactly identical.
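A minimal sketch of the two patterns, assuming each loss comes from its own forward pass (if several losses share one graph, a second `backward()` would need `retain_graph=True` and the saving would not apply). The toy model, the batch shapes, and the names `model_a`, `model_b`, and `batches` are illustrative only. It runs on CPU and only checks that both variants accumulate matching gradients; it does not measure memory.

```python
import copy

import torch
from torch import nn

torch.manual_seed(0)

# Two identical copies of a toy model, one per variant.
model_a = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model_b = copy.deepcopy(model_a)
loss_fn = nn.CrossEntropyLoss()

# Two independent batches, each producing its own loss (illustrative data).
batches = [(torch.randn(8, 16), torch.randint(0, 4, (8,))) for _ in range(2)]

# Variant A: add the losses, then call backward once.
# Both computational graphs stay alive until this single backward call.
total_loss = sum(loss_fn(model_a(x), y) for x, y in batches)
total_loss.backward()

# Variant B: call backward on each loss as soon as it is computed.
# Each graph is freed right after its backward; gradients accumulate in .grad.
for x, y in batches:
    loss = loss_fn(model_b(x), y)
    loss.backward()

# Compare accumulated gradients (allclose tolerates floating-point reordering).
grads_match = all(
    torch.allclose(pa.grad, pb.grad, atol=1e-6)
    for pa, pb in zip(model_a.parameters(), model_b.parameters())
)
print("gradients match:", grads_match)
```

In a real training loop, the optimizer step would follow after the last `backward()` in either variant, since `.grad` holds the summed gradient in both cases.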
🔥 Our paper SANSA is a #NeurIPS2025 Spotlight! We turn #SAM2 into a semantic few-shot segmenter for objects and parts, fully promptable (mask · point · box · scribble); only 10M trainable parameters and 5× faster than competitors. Code, models & demo https://t.co/bdfUd1YnlG 👇
@AnthropicAI is so efficient! In just a few hours they fixed the bug ;) They released Opus 4.5 (just a few hours after my post), which answers correctly, while Sonnet 4.5 does not.
Claude doesn't know much about computational graphs; in fact, it suggests doing the wrong thing entirely. @AnthropicAI, please add the tweet below to Claude's training data ;)
GPT5.1 and Gemini3 give the right answer; Claude doesn't. Screenshots from GPT, Gemini, and Claude, in this order.
Claude doesn't know much about computational graphs; in fact, it suggests doing the wrong thing entirely. @AnthropicAI, please add the tweet below to Claude's training data ;)
This simple PyTorch trick will cut your GPU memory use in half / double your batch size (for real). Instead of adding losses and then computing backward, it's better to compute the backward on each loss (which frees the computational graph). Results will be exactly identical.
Can we guess that Soumith and Yann are leaving Meta because they were only getting millions while "new joiners" are getting orders of magnitude more?
Happy to see image matching people working on astronaut photography! Great work from RoMa v2 @Parskatt
Went to buy a pair of shoes in Menlo Park and they first did a 3D reconstruction of my feet. It felt surreal. Of course I had to ask the shop assistant if he thought E2E methods would replace COLMAP one day.
Anyone working on deep learning should know this by heart. Especially 5/6.
(1/n) How to start a deep learning project? We use a remarkably streamlined step-by-step process to set up deep learning projects. At the same time, people who are new to deep learning tend to always make the same (avoidable) mistakes. Check out the thread below! 🧵
Super interesting, and it hints at a possible direction for training more robust VLMs.
VLMs (GPT-4o, Gemini, Qwen-VL, LLaVA…) look impressive — until you shift an image by 1 pixel. A tiny, meaning-preserving change → a completely different answer. This isn’t adversarial — it’s natural variation. Watch 👇
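One way to probe this kind of sensitivity is sketched below. It assumes the shift is a one-pixel horizontal translation that repeats the left edge column; the quoted thread does not say how its shifts are produced. `predict` is a hypothetical callable standing in for any model that maps an image tensor to an answer.

```python
import torch


def shift_right_one_pixel(image: torch.Tensor) -> torch.Tensor:
    """Shift a (C, H, W) image one pixel to the right.

    The leftmost column is repeated so the output keeps the input shape.
    """
    return torch.cat([image[..., :1], image[..., :-1]], dim=-1)


def is_shift_consistent(predict, image: torch.Tensor) -> bool:
    """Return True if `predict` gives the same answer before and after the shift.

    `predict` is a hypothetical stand-in for a VLM query function.
    """
    return predict(image) == predict(shift_right_one_pixel(image))


# Random stand-in image; a real check would load actual photos.
image = torch.rand(3, 224, 224)
shifted = shift_right_one_pixel(image)
```

Running `is_shift_consistent` over a set of images and prompts would give a rough rate of answer changes under this one perturbation.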
What's the take-home message? It is very likely that what you need is already out there. You don't always need to come up with novelty and a paper. Thoroughly benchmark existing baselines first; you'll find many answers there. [4/4]
We found that SIFT was the best for our use case through a thorough benchmark of all image matching methods [1], where we also tried rarely used methods. We found that (1) a now-uncommon method was best and (2) most importantly, we didn't need to train a new model. [3/4]
SIFT is rotation invariant by design, which is perfect for our use case. We post-process SIFT features with LightGlue, which gives great results and no false positives. Precision is 100%. This was one of the main requirements for the project. [2/4]
Always fun to see people's reactions when I tell them we're using SIFT features for AstroLoc. That's right, software deployed in 2025 at NASA uses SIFT, a method from 1999. Why SIFT? [1/4]
Excited to release the first worldwide aerial image localization method (and demo!) Take an aerial or satellite image from anywhere in the world, and AstroLoc can (probably) find its location, and provide a precise footprint! Links to paper, demo and full-length (5 min) video ⬇️
I should specify that this is a weird edge case, and that usually autocast helps (faster and lower memory). This probably happens because some ops in the cross entropy are computed in float32 for stability.