
Anton Xue (@AntonXue)
Followers: 224 · Following: 220 · Media: 2 · Statuses: 119
Computer Science PhD Student @ UPenn. Machine Learning + Formal Methods.
Joined January 2015
RT @ThomasTCKZhang: I’ll be presenting our paper “On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning” a…
0 · 9 · 0
RT @aaditya_naik: Swing by our poster session today at 11 if you're at ICML to learn more about speeding up neurosymbolic learning! We will…
0 · 3 · 0
RT @LarsLindemann2: Our book “Formal Methods for Multi-Agent Feedback Control Systems” - which is uniquely situated at the intersection of…
0 · 5 · 0
This is today at #ICLR2025.
Excited to present our paper on a logic-based perspective of LLM jailbreaks with @Avishreekh at @ICLR_conf this Saturday, April 26! Poster #268 in Hall 3+2B at 15:00 Singapore time. 📄 arXiv: 🔗 Blog: \begin{thread}
0 · 0 · 9
Turns out that such "if-then" rules can be effectively modeled in Horn logic. Modeling rule-following as logical inference gives a precise characterization: correct rule-following is "maximal, monotone, and sound". More:
en.wikipedia.org
1 · 0 · 0
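The propositional Horn inference mentioned in the tweet above can be made concrete with a short forward-chaining sketch. This is an illustrative example only, not code from the paper; the function name `horn_closure` and the toy rules are hypothetical. A rule (body, head) fires only when its entire body has been derived (sound), derived facts are never retracted (monotone), and the loop stops only when no rule can add anything new (maximal).

# Minimal sketch of propositional Horn inference by forward chaining.
# Illustrative only; not from the paper's codebase.

def horn_closure(facts, rules):
    """Return the set of propositions derivable from `facts` via `rules`.

    facts: set of atomic propositions assumed true, e.g. {"a", "c"}
    rules: list of Horn clauses (body, head), read as
           "if every proposition in `body` holds, then `head` holds"
    """
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Sound: a rule fires only when its whole body is derived.
            if head not in derived and body <= derived:
                derived.add(head)  # Monotone: facts are only ever added.
                changed = True
    # Maximal: at this point no rule can derive anything further.
    return derived

if __name__ == "__main__":
    rules = [({"a"}, "b"), ({"b", "c"}, "d"), ({"e"}, "f")]
    print(horn_closure({"a", "c"}, rules))  # -> {'a', 'b', 'c', 'd'}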
Excited to present our paper on a logic-based perspective of LLM jailbreaks with @Avishreekh at @ICLR_conf this Saturday, April 26! Poster #268 in Hall 3+2B at 15:00 Singapore time. 📄 arXiv: 🔗 Blog: \begin{thread}
debugml.github.io
We study jailbreak attacks through propositional Horn inference.
1 · 5 · 20
RT @AlexRobey23: A few days ago, we dropped 𝗮𝗻𝘁𝗶𝗱𝗶𝘀𝘁𝗶𝗹𝗹𝗮𝘁𝗶𝗼𝗻 𝘀𝗮𝗺𝗽𝗹𝗶𝗻𝗴 🚀 and we've gotten a little bit of pushback. But whether you'…
0 · 8 · 0
I am proud to announce that I have concluded #NeurIPS2024 ranked 15th on the Whova points leaderboard. I could not have done this without my brilliant collaborators who gave me the courage and strength to grind through 170+ community polls.
1 · 2 · 20
RT @AlexRobey23: In around an hour (at 3:45pm PST), I'll be giving a talk about jailbreaking LLM-controlled robots at the AdvML workshop at…
0 · 4 · 0
📝 Blog: 🧐 arXiv: 🤖 Code:
github.com/AntonXue/tf_logic
1 · 0 · 2
I'll present some logic-based perspectives on LLM jailbreaks at these #NeurIPS2024 workshops:
* New Frontiers in Adversarial Machine Learning (East Ballroom C)
* Towards Safe & Trustworthy Agents (West Ballroom C)
* Scientific Methods for Understanding Neural Networks (West 205)
2 · 2 · 16