Douglas Gray
@de3ug
Followers: 464
Following: 11K
Media: 244
Statuses: 2K
Visual Search @amazon. Here for computer vision, ML, NLP research discussion, but may drift into hobbies like astrophotography.
Portola Valley, CA
Joined June 2009
Business class was full. Also, it was a short flight, and this is also fun. I think it's more fun to mix very cheap and very luxury to feel the contrast. I started to realize luxury business class, and luxury hotels, and all that, it just gets very boring very fast if you don't
@levelsio @CSAIRGlobal Bro how are you not wealthy enough to at least fly business class?
127
52
3K
The value of fast iteration in AI is overrated. The best results are obtained by knowing the right things to do and doing each thing with neurotic precision and attention to detail.
26
37
435
This bulge on a Waymo map seems funny. Which Google exec lives there?
107
98
11K
This is why I like fabric posters. Easy to fold up and bring home to decorate the lab.
0
0
1
Good reminder of why it’s important to attend the PAMI-TC meeting. If you don’t, well-meaning people will vote for more bureaucratic requirements.
It looks like @CVPR has implemented a new mandatory "Compute Reporting Form" that must be submitted alongside any paper submission. Though I am sympathetic to the motivations for this change, I am opposed to it for a variety of reasons:
1
0
2
Huge thanks to everyone who made MRR @ ICCV possible — our authors, reviewers, attendees, and speakers: Roi Herzig, Cordelia Schmid, Jianwei Yang, Kristen Grauman, and the ICCV organizers. 🙏 See you next year! #ICCV2025 #AI #ComputerVision #Multimodal #Retrieval
0
0
0
It doesn’t just speed up OCR—it shows how vision and language models can share a single representational layer. The visual and text transformers operate on the same logic, blurring the line between seeing and reading.
1
0
0
Lots of interest in this work, but many people seem to be missing the point or arguing over who did it first, which says a lot about its importance. DeepSeek-OCR is a big step because it’s a true multimodal model, not a case of attaching a Vision Transformer to an LLM.
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter. The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language
2
0
2
Lots of folks eating lunch on the floor. @ICCVConference should really get more chairs near the food. #ICCV2025
1
2
33
🌺Super excited to present our work on Structured Physical Intelligence at the MRR 2025 workshop at ICCV 2025! 🗓️ Monday, October 20 | 8:35 📷Room 308B 🎉🤖 I'll share how robots can perceive, reason, and adapt in the physical world. 🌍 @ICCVConference
1
2
7
Why attend MRR? In the age of GenAI, retrieval is a key component in grounding AI responses in reality. Now multiple fields are converging on this one subject. https://t.co/yMbYi7k4Rb
More than 1100 #ICLR2026 submissions mention "retrieval" in their title or abstract 🤯
0
0
1
Why attend MRR @ #ICCV2025? Multimodal retrieval, RAG, and agentic AI—plus coffee + posters (9:50–10:35). Mon 8:30–12:30 (Room 308B). #Multimodal #RAG #Retrieval #ICCV
1
1
1
At #ICCV2025? Come join us for the #Multimodal #Retrieval & #Representation (MRR) Workshop -> Mon Oct 20, 8:30–12:30 (Room 308B, Honolulu). Keynote talks from @CordeliaSchmid, @jw2yang4ai, Kristen Grauman, plus an invited talk from Roi Herzig. #AI #ComputerVision
2
3
5