Andreas Mueller
@amuellerml
Followers
48K
Following
11K
Media
218
Statuses
9K
Machine learner, Python geek and scikit-learn developer. Principal Research SDE @AzureData @Microsoft. Posting on LinkedIn now.
Santa Cruz Mountains
Joined January 2012
📢 We are excited to announce "#FMSD: 1st Workshop on Foundation Models for Structured Data" has been accepted to #ICML 2025! Call for Papers: https://t.co/GwnimFbpPb
1
14
21
Open protocols like A2A and MCP are key to enabling the agentic web. With A2A support coming to Copilot Studio and Foundry, customers can build agentic systems that interoperate by design.
We are entering a new era in business where agents will not only act independently but also work together as a team. To bring this vision to life, open protocols like Model Context Protocol (MCP) and Agent2Agent (A2A) are essential for agent interoperability. I am excited to
86
224
2K
New preprint https://t.co/BUpLi4a39k Open Challenges in Time Series Anomaly Detection: An Industry Perspective This is a vision paper about what I think is missing from current research in time series anomaly detection, and how it could align better with practical applications.
arxiv.org
Current research in time-series anomaly detection is using definitions that miss critical aspects of how anomaly detection is commonly used in practice. We list several areas that are of practical...
3
2
9
The data science revolution is getting closer. TabPFN v2 is published in Nature: https://t.co/Ybb15pnZ5P On tabular classification with up to 10k data points & 500 features, in 2.8s TabPFN on average outperforms all other methods, even when tuning them for up to 4 hours🧵1/19
36
251
1K
Today OpenAI announced o3, its next-gen reasoning model. We've worked with OpenAI to test it on ARC-AGI, and we believe it represents a significant breakthrough in getting AI to adapt to novel tasks. It scores 75.7% on the semi-private eval in low-compute mode (for $20 per task
202
2K
9K
Congrats to my friends at @Microsoft on getting Python in Excel to GA! https://t.co/Lb54oobiR1
techcommunity.microsoft.com
Python in Excel is now generally available for Windows users of Microsoft 365 Business and Enterprise. Last August, in partnership with Anaconda, we...
1
6
53
There's a great overview of challenges and proposals here: NeurIPS 2023 Tutorial (https://t.co/MisRx5yGdq) (check slides and doc). If you agree that something needs to change, I'd suggest talking to a PC member or conference organizer.
1
1
7
I'm pretty frustrated with the current review process in ML (from an author, reviewer, and meta-reviewer perspective). There are possible solutions, or at least experiments and changes, but I feel like business as usual is no longer feasible.
3
1
22
Wondering how humans should be involved in designing #AutoML solutions 🤔? Check out our #ICML2024 paper: "Position: A Call to Action for a Human-Centered AutoML Paradigm"! 📄✨ https://t.co/TmB1p7HIhw Drop by our poster on Thu, Jul 25 at 11:30 AM in Hall C 4-9 #2003 📅 1/3
proceedings.mlr.press
Automated machine learning (AutoML) was formed around the fundamental objectives of automatically and efficiently configuring machine learning (ML) workflows...
7
4
12
We are proud to release the first major version of DuckDB, v1.0.0, codenamed "Snow Duck". This version is a culmination of almost six years of research and development. Today we are shipping an innovative database system with a backwards-compatible storage format. Check out our
24
273
995
Columnar file formats like Parquet/ORC are ubiquitous. Our VLDB paper with @XinyuZeng218 + @huanchenzhang + @wesmckinn studies their internals. TLDR: They're not optimized for modern hardware. Something new is needed. Paper: https://t.co/zU0jUm78ot Code: https://t.co/nvbPUsRrpI
11
151
754
Not only is this the best model in the world, but it's available for free in ChatGPT, which has never before been the case for a frontier model.
29
62
897
GPT-4o is our new state-of-the-art frontier model. We’ve been testing a version on the LMSys arena as im-also-a-good-gpt2-chatbot 🙂. Here’s how it’s been doing.
178
845
5K
Congrats to @AIatMeta on Llama 3 release!! 🎉 https://t.co/fSw615zE8S Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we'll see when the rankings come in @ @lmsysorg :)) 400B is still training, but already encroaching
138
1K
8K
🥁 Llama3 is out 🥁 8B and 70B models available today. 8k context length. Trained with 15 trillion tokens on a custom-built 24k GPU cluster. Great performance on various benchmarks, with Llama3-8B doing better than Llama2-70B in some cases. More versions are coming over the next
213
1K
7K
We often get questions around why @VoltronData supports the Ibis project -- we've answered them here! TL;DR: open standards are critical for the composable data ecosystem and tightly coupling Python dataframes to execution engines is bad for everyone https://t.co/VGK1BpBMSj
ibis-project.org
the portable Python dataframe library
2
9
24
The rumors are true! I started a(nother) blog. https://t.co/cSy5rhfxwI The first post is an adaptation of my talk, recalling the past 10+ years of building open source standards and the lessons learned along the way.
sympathetic.ink
Over the last decade, I have been lucky enough to contribute to a few successful open source projects in the data ecosystem.
2
8
63
Kaggle's (@Kaggle) latest competition's top 11 highest scoring notebooks all use 🚀@AutoGluon AutoML🚀 to achieve their strong performance! When I said that AutoGluon 1.0 was the largest jump in the state-of-the-art in 4 years, I meant it. Competition: https://t.co/SKWIen7X1e
0
9
52