Wufei Ma

@wufeima

Followers
121
Following
319
Media
22
Statuses
85

PhD student at @CCVLatJHU @JHU | Student researcher at Google | Prev: Meta, Amazon, MSRA, Megvii

Baltimore, MD
Joined August 2019
@lmthang
Thang Luong
2 months
Yes, there is an official marking guideline from the IMO organizers which is not available externally. Without the evaluation based on that guideline, no medal claim can be made. With one point deducted, it is a Silver, not Gold.
@Mihonarium
Mikhail Samin
2 months
🚨 According to a friend, the IMO asked AI companies not to steal the spotlight from kids and to wait a week after the closing ceremony to announce results. OpenAI announced the results BEFORE the closing ceremony. According to a Coordinator on Problem 6, the one problem OpenAI
14
57
588
@DrDominicNg
Dr. Dominic Ng
2 months
Microsoft claims their new AI framework diagnoses 4x better than doctors. I'm a medical doctor and I actually read the paper. Here's my perspective on why this is both impressive AND misleading ... 🧵
277
1K
9K
@CCVLatJHU
CCVL at JHU
3 months
We are at the Johns Hopkins booth at @CVPR. Come join us 😁 @JHUCompSci @HopkinsEngineer @HopkinsDSAI
1
6
23
@HopkinsDSAI
Johns Hopkins Data Science and AI Institute
3 months
Hopkins researchers including @JHUECE Tinoosh Mohsenin and @JHU_BDPs Rama Chellappa are speaking at booth 1317 of the IEEE / CVF Computer Vision and Pattern Recognition Conference today! Come meet #HopkinsDSAI #CVPR2025
2
15
31
@Voxel51
Voxel51
3 months
One of the biggest bottlenecks in deploying visual AI and computer vision is annotation, which can be both costly and time-consuming. Today, we’re introducing Verified Auto Labeling, a new approach to AI-assisted annotation that achieves up to 95% of human-level performance while
2
212
109
@CCVLatJHU
CCVL at JHU
3 months
Excited to announce that our group will be presenting eight papers at @CVPR in Nashville! 🎉 We're excited to share the ideas we've been working on with the community. If you'll be there, we'd love to meet and chat -- always happy to exchange ideas and catch up in person. See
0
6
6
@ashtom
Thomas Dohmke
4 months
GitHub Copilot now has a coding agent embedded right where you already collaborate with developers: on GitHub. And yes, you can access it from VS Code too. 🤖
88
476
4K
@CCVLatJHU
CCVL at JHU
5 months
🤩Thrilled to share three papers from our group to be presented at @iclr_conf -- see you in Singapore!
0
2
6
@JHUCompSci
JHU Computer Science
5 months
@XingruiWang & @YuilleAlan’s “Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering” w/ @wufeima, Angtian Wang, @an_epsilon0, & @AdamKortylewski constructs a dataset & model 4 temporal world representations: https://t.co/mF7oNy3tYK (5/12)
arxiv.org
For vision-language models (VLMs), understanding the dynamic properties of objects and their interactions in 3D scenes from videos is crucial for effective reasoning about high-level temporal and...
1
2
2
@wufeima
Wufei Ma
8 months
🤮
Tweet media one
0
0
3
@CCVLatJHU
CCVL at JHU
8 months
🚀Our group is seeking summer interns to join our group in 2025 and work on exciting research projects in computer vision and AI. 🔗Job posting: https://t.co/qJ7WjmC4XJ 🤩If interested please send Prof @YuilleAlan an email! @HopkinsEngineer @JHUCompSci @HopkinsDSAI
0
10
16
@wufeima
Wufei Ma
9 months
Two more interesting examples from running Gemini 2.0 Flash Thinking on our 3DSRBench.
0
0
0
@wufeima
Wufei Ma
9 months
Estimating orientation from the perspective of a man? This is a challenging problem as models are often confused by 2D and 3D spatial reasoning. Gemini 2.0 gives a textbook reasoning of the problem. The answer is unfortunately wrong but very impressive thinking!
1
0
0
@wufeima
Wufei Ma
9 months
Relative position from the perspective of a horse? Gemini 2.0 divides the problem into smaller steps, estimating orientation and relative position. It also comes up with ways to verify its answer, i.e., confirming with visual cues, and to eliminate other possibilities.
1
0
0
@wufeima
Wufei Ma
9 months
Can Gemini 2.0 Flash Thinking handle complex 3D spatial reasoning? While not perfect, its spatial reasoning capabilities are truly remarkable!🤯 I test the model on several 3DSRBench questions (https://t.co/NUrc5lGDGI). It's very impressive how Gemini 2.0 effectively breaks down
@JeffDean
Jeff Dean
9 months
Introducing Gemini 2.0 Flash Thinking, an experimental model that explicitly shows its thoughts. Built on 2.0 Flash’s speed and performance, this model is trained to use thoughts to strengthen its reasoning. And we see promising results when we increase inference time
1
3
9
@JHUCompSci
JHU Computer Science
9 months
🧠 “I want algorithms that will work in the real world and that will perform at the level of humans, probably ultimately better. And to do that, I think we need to get inspired by the brain,” says @YuilleAlan. Learn more about his research here:
cs.jhu.edu
Bestowing machines with the ability to perceive the physical world as humans do has been a career-long mission of Alan Yuille, a pioneer in the field of computer vision.
0
6
10
@jieneng_chen
Jieneng Chen
9 months
👏 Gemini 2.0 impresses with its visual and physical reasoning, but 3D spatial reasoning remains a challenge. 🔥We present 3DSRBench to benchmark the 3D spatial reasoning capabilities. 🫨 Surprisingly, Gemini 2.0 achieves only 50% accuracy, falling significantly short of
@GoogleDeepMind
Google DeepMind
9 months
The Gemini 2.0 era is here. And we’re excited for you to start building with it. A quick rewind of what we just released ⏪ Gemini 2.0 Flash ⚡ comes with low latency and better performance. 🔵 You can now access an experimental version in @GeminiApp on the web, while Gemini
0
6
17
@wufeima
Wufei Ma
9 months
🤯Gemini 2.0 is great, especially how it sees in 3D and reasons about the physical world. However, 3D spatial reasoning still has a long way to go. 🔥We present 3DSRBench, a comprehensive 3D spatial reasoning benchmark with 2772 manually annotated VQAs across 12 question types.
2
5
24
@wufeima
Wufei Ma
9 months
🚀Excited to share our ImageNet3D dataset for general-purpose object-level 3D understanding, which augments 200 categories from ImageNet21k with 3D annotations. 🔗Project page: https://t.co/3iTfhESycX 🥰Big thanks to my advisors: @AdamKortylewski @yaoyaoliu1 @YuilleAlan
@CCVLatJHU
CCVL at JHU
9 months
🎇This week our group will be presenting four papers at @NeurIPSConf. Feel free to stop by our posters and check out the latest works from our group! ✉️DMs open for any questions and chats. 🥳Kudos to the authors and collaborators: @pedrorasb @Zongwei_Zhou @jieneng_chen
0
4
13
@jieneng_chen
Jieneng Chen
10 months
Introducing Genex: Generative World Explorer. 🧠 Humans mentally explore unseen parts of the world, revising their beliefs with imagined observations. ✨ Genex replicates this human-like ability, advancing embodied AI in planning with partial observations. (1/6)
6
49
164