Neel Joshi
@neelsj
Followers
385
Following
46
Media
19
Statuses
447
Researcher at Microsoft working in computer vision and machine learning
Seattle, WA, USA
Joined April 2009
My team at Microsoft Research, working in multimodal AI, is hiring! Please apply if you are interested in working at the cutting edge of multimodal generative AI.
0
8
25
📌 You can now find all the evaluation logs (and reasoning traces for common benchmarks!) from our inference-time scaling report and the Phi-4 reasoning report at  https://t.co/skNOjClLxQ. The evaluation code can be found at Eureka ML Insights: https://t.co/FjviLeU889.
github.com
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. - microsoft/eureka-ml-insights
1
13
56
One of the secret weapons we had in doing the Phi-4-reasoning report is Eureka: https://t.co/Ncp7ZhbQr9 Eureka and our eval team have been doing an amazing job, from adding new benchmarks to doing deeper analysis of results beyond single-score statistics.
github.com
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. - microsoft/eureka-ml-insights
BTW when you test AIME or HMMT, or anything that has ~30 questions, PLEASE DO 50-100 runs, and report error bars!! The pass at 1 for 5 runs is super noisy... Look at the variance even for 50 runs
1
4
15
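The advice above about small benchmarks (roughly 30 questions, 50-100 runs, error bars) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the 0.6 per-question success probability, and the simulated runs are hypothetical and are not taken from the Phi-4 reports or from Eureka. The simulation also treats every question as equally hard and independent, which real benchmarks are not; it only shows how the error bar on mean pass@1 narrows as the number of runs grows.

```python
import math
import random
import statistics


def pass_at_1(run_results):
    """Fraction of questions answered correctly in a single run."""
    return sum(run_results) / len(run_results)


def summarize(per_run_scores):
    """Return (mean pass@1 across runs, standard error of that mean)."""
    mean = statistics.mean(per_run_scores)
    # Sample standard deviation across runs; requires at least two runs.
    sd = statistics.stdev(per_run_scores)
    sem = sd / math.sqrt(len(per_run_scores))
    return mean, sem


def simulate_runs(n_runs, n_questions=30, p_correct=0.6, seed=0):
    """Synthetic stand-in for real evaluation logs (hypothetical data).

    Each question is an independent coin flip with probability p_correct,
    an assumption chosen only to keep the sketch self-contained.
    """
    rng = random.Random(seed)
    return [
        pass_at_1([rng.random() < p_correct for _ in range(n_questions)])
        for _ in range(n_runs)
    ]


if __name__ == "__main__":
    for n_runs in (5, 50, 100):
        mean, sem = summarize(simulate_runs(n_runs))
        # 1.96 * SEM is a normal-approximation 95% interval half-width.
        print(f"{n_runs:>3} runs: pass@1 = {mean:.3f} +/- {1.96 * sem:.3f}")
```

With real logs, `simulate_runs` would be replaced by the per-run correctness lists from the evaluation; the summary step stays the same.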
We’ve been cooking... a new open weights 14B Phi-4 reasoning model, SFT’d on ~1.4M carefully curated reasoning demonstrations from o3-mini and RL’d for a tiny bit. This model is a little beast.
36
236
1K
I am thrilled to share our newest Phi models. This time we went all in on post-training to produce Phi-4-reasoning (SFT only) and Phi-4-reasoning-plus (SFT + a touch of RL) — both 14B models that pack a punch in a small size across reasoning and general purpose benchmarks🧵
3
21
77
Tech Report: https://t.co/DWZjnO3SsU Azure Foundry: https://t.co/fAf6HaiI1g HF: https://t.co/7kSAmy4kZ0
https://t.co/jITBpUXWUq
https://t.co/zxkgBKknse
0
2
4
A new post: Headroom for AI Development https://t.co/smJu33g02M . It's quite interesting to compare biological and silicon capabilities.
0
6
16
Announcing AutoGen 0.4, a fully reimagined library for building advanced agentic AI systems, developed to improve code quality and robustness. Its asynchronous, event-driven architecture is designed to support dynamic, scalable workflows. Learn more: https://t.co/N7iSeR7ZJk
17
173
700
Excited to announce the release of Eureka, an open-source framework for evaluating and understanding large language and multimodal models! I’m really proud of the team. This is important work and crucial to have it be public and open.
How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. Learn more about the extended findings. https://t.co/3MM6aZn09O
0
0
6
I am delighted to announce our HoloAssist challenges at the EgoVis (https://t.co/YnOMwbXH6v) #CVPR2024 workshop. HoloAssist is a large-scale egocentric human interaction dataset, where two people collaboratively complete physical manipulation tasks using HoloLens2.
7
2
6
Join us in shaping the future of AI! AI Frontiers is a lab inside Microsoft Research with a mission to unlock new AI capabilities and solve real-world problems. We are hiring researchers and engineers in our teams based in Redmond and NYC. Apply here:
microsoft.com
Mission of AI Frontiers: Expand the Pareto frontier of AI capabilities, efficiency, and safety through innovations in foundation models and learning agent platforms.
6
51
207
📣Spring hiring: AI Frontiers at Microsoft Research is looking for researchers passionate about in-depth understanding and rigorous evaluation of foundation models and multi-agent systems: https://t.co/P6Q3EcQcBy. Learn more about the group here: https://t.co/z0dIvqsq3l. This
0
23
73
HoloAssist is a new multimodal dataset consisting of 166 hours of interactive task executions with 222 participants. Discover how it offers invaluable data to advance the capabilities of next-gen AI copilots for real-world tasks: https://t.co/LQ6lFQxLrh
1
8
38
It's been an unbelievable journey. We didn't just hear your support, we felt it. None of this would have been possible without you. Thank you for everything.
1K
3K
23K
Grr, @AmericanAir cancelled our held reservation before the time the email said it was held for. Had to book a new ticket and it cost us $400 more. What’s the purpose of providing an option to hold tickets if you don’t actually hold them?!?
1
0
0
The Computer Vision Group @ MSR Redmond is hiring interns. Please spread the word and apply if you are interested! @MSFTResearch
0
0
2
How can we fully leverage online videos for self-supervised learning? Today, we are releasing ACAV100M, an automatic dataset curation pipeline and the largest video dataset for self-supervised video representation learning. https://t.co/QA8wMy8WVn
3
20
67
At Microsoft Research, we aim to empower the next generation of exceptional computing research talent. Today, we're thrilled to announce and congratulate this year's Microsoft Research PhD Fellowship recipients from around the world: https://t.co/pg6IpkiDEf
3
31
102