Maksym Andriushchenko @ NeurIPS
@maksym_andr
Followers
5K
Following
15K
Media
385
Statuses
2K
Faculty at @ELLISInst_Tue & @MPI_IS. PhD from @EPFL supported by Google & OpenPhil PhD fellowships.
Tübingen, Deutschland
Joined April 2018
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨 Hiring. I'm looking for multiple PhD students: both those able to start
76
90
831
We're delighted to welcome @maksym_andr as an invited speaker. It'll be happening today, starting from 12 pm at the San Diego Convention Center, Room Upper 31ABC. If you're attending @NeurIPSConf, you're more than welcome to join us to connect with awesome people #NeurIPS2025
I'm giving a talk at the New in ML NeurIPS Workshop on Tuesday (2:40–3:25 pm PST)! I've compiled some opinionated but hopefully useful advice based on my research experience :) Really looking forward to the workshop!
0
3
7
👏 Give a big round of applause to our 2025 PhD Award Winners! The two main winners are: @ZhijingJin & @maksym_andr. Two runners-up were selected additionally: @SiweiZhang13 & @elias_frantar Learn even more about each outstanding scientist: https://t.co/3Pry7NnAYn
2
5
43
I'm giving a talk at the New in ML NeurIPS Workshop on Tuesday (2:40–3:25 pm PST)! I've compiled some opinionated but hopefully useful advice based on my research experience :) Really looking forward to the workshop!
4
8
79
Apply to work with Maksym! I spent a few days at @ELLISInst_Tue with the group, and the atmosphere was fantastic. You’d be surrounded by great PIs and really strong students, working on exciting projects in AI safety and security.
📣 We are expanding our AI Safety and Alignment group at @ELLISInst_Tue and @MPI_IS! We have: - a great cluster at MPI with 50+ GB200s, 250+ H100s, and many-many A100 80GBs, - outstanding colleagues (@jonasgeiping, @sahar_abdelnabi, etc), - competitive salaries (as for
0
1
14
3/n OS-Harm: A safety benchmark for computer-use agents covering 150 tasks across harassment, copyright infringement, disinformation, and data exfiltration. Testing reveals that even frontier models like Claude 3.7 Sonnet and Gemini 2.5 Pro often comply with misuse requests,
1
1
3
If you are interested, please fill out this Google form. I will review every application and reach out if there is a good fit. https://t.co/MPbeWRrNIR
docs.google.com
Hi 👋! My name is Maksym Andriushchenko. I'm starting my research group at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025 that focuses on AI safety...
1
0
6
📣 We are expanding our AI Safety and Alignment group at @ELLISInst_Tue and @MPI_IS! We have: - a great cluster at MPI with 50+ GB200s, 250+ H100s, and many-many A100 80GBs, - outstanding colleagues (@jonasgeiping, @sahar_abdelnabi, etc), - competitive salaries (as for
4
13
146
I'll be at NeurIPS in San Diego from Monday to Saturday! I'd be glad to chat about AI safety and alignment—feel free to reach out via DMs or email! Also, four of my students are coming to EurIPS in Copenhagen: - Ben Rank @full__rank - David Schmotz @DavidSchmotz - Jeremy Qin
1
6
83
Very pleased to have participated in writing the Second Key Update to the International AI Safety Report! This update focuses on technical safeguards and risk management, communicating the current state of AI safety in an accessible way for a broad audience.
I’m pleased to share the Second Key Update to the International AI Safety Report, which outlines how AI developers, researchers, and policymakers are approaching technical risk management for general-purpose AI systems. (1/5)
0
2
32
Keynote speakers: (1) Irene Chen (@irenetrampoline) from UC Berkeley (2) Pavel Izmailov (@pavelizmailovai) from NYU (3) Maksym Andriushchenko (@maksym_andr) from MPI & ELLIS Institute (4) Pratyusha Sharma (@pratyusha_PS) from Microsoft (5) Haifeng Xu (@haifengxu0) from UChicago
1
2
20
Gemini 3 Pro is superhuman on GeoGuessr (as in better than professional players) according to https://t.co/MPz9xSEPCP What does it mean to the future of privacy? 😬
5
0
31
‘Maybe the “reviewer” is an LLM that I can prompt-inject” (from Gemini 3 System Card regarding LLM-based oversight) We happen to have a paper exactly about that…
arxiv.org
AI control protocols serve as a defense mechanism to stop untrusted LLM agents from causing harm in autonomous settings. Prior work treats this as a security problem, stress testing with exploits...
2
3
27
i like how standard instructions in Agent Skills look a lot like prompt injections :D (“CRITICAL”, “IMPORTANT”, “Remember:”,“NEVER use …”)
this is amazing. Anthropic not only understands how to build the best models but also how to use them best. just look at this frontend-design skill. it's just one file with 42 lines of instructions that read like the type of memo a frontend lead would write for their team.
3
2
6
this Claude Code misuse case serves as strong motivation for our recent work on monitoring decomposition attacks:
arxiv.org
Current LLM safety defenses fail under decomposition attacks, where a malicious goal is decomposed into benign subtasks that circumvent refusals. The challenge lies in the existing shallow safety...
Frontier AI labs should immediately apply the lightweight sequential monitors described in our paper, "Monitoring decomposition attacks in LLMs with lightweight sequential monitors." These attacks are already being used in active cyber-espionage campaigns.
2
2
14