Maksym Andriushchenko @ NeurIPS

@maksym_andr

Followers: 5K
Following: 15K
Media: 385
Statuses: 2K

Faculty at @ELLISInst_Tue & @MPI_IS. PhD from @EPFL supported by Google & OpenPhil PhD fellowships.

Tübingen, Deutschland
Joined April 2018
@maksym_andr
Maksym Andriushchenko @ NeurIPS
4 months
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨 Hiring. I'm looking for multiple PhD students: both those able to start
76
90
831
@NewInML
NewInML @ NeurIPS 2025
4 days
We're delighted to welcome @maksym_andr as an invited speaker. It'll be happening today, starting from 12 pm at the San Diego Convention Center, Room Upper 31ABC. If you're attending @NeurIPSConf, you're more than welcome to join us to connect with awesome people #NeurIPS2025
@maksym_andr
Maksym Andriushchenko @ NeurIPS
5 days
I'm giving a talk at the New in ML NeurIPS Workshop on Tuesday (2:40–3:25 pm PST)! I've compiled some opinionated but hopefully useful advice based on my research experience :) Really looking forward to the workshop!
0
3
7
@ELLISforEurope
ELLIS
4 days
👏 Give a big round of applause to our 2025 PhD Award Winners! The two main winners are: @ZhijingJin & @maksym_andr. Two runners-up were selected additionally: @SiweiZhang13 & @elias_frantar Learn even more about each outstanding scientist: https://t.co/3Pry7NnAYn
2
5
43
@maksym_andr
Maksym Andriushchenko @ NeurIPS
5 days
Full workshop schedule:
0
1
1
@maksym_andr
Maksym Andriushchenko @ NeurIPS
5 days
I'm giving a talk at the New in ML NeurIPS Workshop on Tuesday (2:40–3:25 pm PST)! I've compiled some opinionated but hopefully useful advice based on my research experience :) Really looking forward to the workshop!
4
8
79
@NKristina01_
Kristina Nikolić @ NeurIPS ✈️
7 days
Apply to work with Maksym! I spent a few days at @ELLISInst_Tue with the group, and the atmosphere was fantastic. You’d be surrounded by great PIs and really strong students, working on exciting projects in AI safety and security.
@maksym_andr
Maksym Andriushchenko @ NeurIPS
8 days
📣 We are expanding our AI Safety and Alignment group at @ELLISInst_Tue and @MPI_IS! We have: - a great cluster at MPI with 50+ GB200s, 250+ H100s, and many-many A100 80GBs, - outstanding colleagues (@jonasgeiping, @sahar_abdelnabi, etc), - competitive salaries (as for
0
1
14
@trycua
Cua @ NeurIPS 2025
8 days
3/n OS-Harm: A safety benchmark for computer-use agents covering 150 tasks across harassment, copyright infringement, disinformation, and data exfiltration. Testing reveals that even frontier models like Claude 3.7 Sonnet and Gemini 2.5 Pro often comply with misuse requests,
1
1
3
@maksym_andr
Maksym Andriushchenko @ NeurIPS
8 days
📣 We are expanding our AI Safety and Alignment group at @ELLISInst_Tue and @MPI_IS! We have: - a great cluster at MPI with 50+ GB200s, 250+ H100s, and many-many A100 80GBs, - outstanding colleagues (@jonasgeiping, @sahar_abdelnabi, etc), - competitive salaries (as for
4
13
146
@maksym_andr
Maksym Andriushchenko @ NeurIPS
8 days
I'll be at NeurIPS in San Diego from Monday to Saturday! I'd be glad to chat about AI safety and alignment—feel free to reach out via DMs or email! Also, four of my students are coming to EurIPS in Copenhagen: - Ben Rank @full__rank - David Schmotz @DavidSchmotz - Jeremy Qin
1
6
83
@maksym_andr
Maksym Andriushchenko @ NeurIPS
10 days
Very pleased to have participated in writing the Second Key Update to the International AI Safety Report! This update focuses on technical safeguards and risk management, communicating the current state of AI safety in an accessible way for a broad audience.
@Yoshua_Bengio
Yoshua Bengio
11 days
I’m pleased to share the Second Key Update to the International AI Safety Report, which outlines how AI developers, researchers, and policymakers are approaching technical risk management for general-purpose AI systems. (1/5)
0
2
32
@Arian_Khorasani
Arian Khorasani 🦅
13 days
Keynote speakers: (1) Irene Chen (@irenetrampoline) from UC Berkeley (2) Pavel Izmailov (@pavelizmailovai) from NYU (3) Maksym Andriushchenko (@maksym_andr) from MPI & ELLIS Institute (4) Pratyusha Sharma (@pratyusha_PS) from Microsoft (5) Haifeng Xu (@haifengxu0) from UChicago
1
2
20
@maksym_andr
Maksym Andriushchenko @ NeurIPS
16 days
Gemini 3 Pro is superhuman on GeoGuessr (as in better than professional players) according to https://t.co/MPz9xSEPCP What does it mean for the future of privacy? 😬
5
0
31
@maksym_andr
Maksym Andriushchenko @ NeurIPS
17 days
system card:
0
0
2
@maksym_andr
Maksym Andriushchenko @ NeurIPS
17 days
‘Maybe the “reviewer” is an LLM that I can prompt-inject’ (from Gemini 3 System Card regarding LLM-based oversight) We happen to have a paper exactly about that…
arxiv.org
AI control protocols serve as a defense mechanism to stop untrusted LLM agents from causing harm in autonomous settings. Prior work treats this as a security problem, stress testing with exploits...
2
3
27
@hrdkbhatnagar
Hardik Bhatnagar
18 days
🚨 Breaking @WeiboLLM's VibeThinker 1.5B leads the Sober Reasoning leaderboard for its size Punching way above its weight -- outperforming even 32B models 🔥 Outstanding work, @WeiboLLM team!
1
6
16
@maksym_andr
Maksym Andriushchenko @ NeurIPS
19 days
love this benchmark!
@andonlabs
Andon Labs
20 days
We re-ran Kimi K2 Thinking on Vending-Bench using Moonshot’s own API as it was suggested this would improve performance on tool calling. We found this to be true, as Kimi K2 is now the best open source model on Vending-Bench based on average net worth achieved.
0
0
5
@maksym_andr
Maksym Andriushchenko @ NeurIPS
20 days
i like how standard instructions in Agent Skills look a lot like prompt injections :D (“CRITICAL”, “IMPORTANT”, “Remember:”, “NEVER use …”)
@nityeshaga
Nityesh
22 days
this is amazing. Anthropic not only understands how to build the best models but also how to use them best. just look at this frontend-design skill. it's just one file with 42 lines of instructions that read like the type of memo a frontend lead would write for their team.
3
2
6
@maksym_andr
Maksym Andriushchenko @ NeurIPS
20 days
paper abstract:
0
0
2
@maksym_andr
Maksym Andriushchenko @ NeurIPS
20 days
this Claude Code misuse case serves as strong motivation for our recent work on monitoring decomposition attacks:
arxiv.org
Current LLM safety defenses fail under decomposition attacks, where a malicious goal is decomposed into benign subtasks that circumvent refusals. The challenge lies in the existing shallow safety...
@jcyhc_ai
John (Yueh-Han) Chen
21 days
Frontier AI labs should immediately apply the lightweight sequential monitors described in our paper, "Monitoring decomposition attacks in LLMs with lightweight sequential monitors." These attacks are already being used in active cyber-espionage campaigns.
2
2
14