Arkil Patel
@arkil_patel
Followers
1K
Following
2K
Media
20
Statuses
226
CS PhD Student at Mila and McGill | Worked at AllenNLP and Microsoft Research
Montréal, Québec
Joined October 2016
Thoughtology paper is out! We study the reasoning chains of DeepSeek-R1 across a variety of tasks and settings and find several surprising and interesting phenomena! Incredible effort by the entire team! Paper: https://t.co/CDlFHD28xQ
Models like DeepSeek-R1 mark a fundamental shift in how LLMs approach complex problems. In our preprint on R1 Thoughtology, we study R1's reasoning chains across a variety of tasks, investigating its capabilities, limitations, and behaviour. Paper: https://t.co/Cyy18kYQ45
1
5
26
How do LLMs acquire human values? We often point to preference optimization. However, in our new work, we trace how and when model values shift during post-training and uncover surprising dynamics. We ask: How do data, algorithms, and their interaction shape model values?
2
49
124
[1/9] While pretraining data might be hitting a wall, novel methods for modeling it are just getting started! We introduce future summary prediction (FSP), where the model predicts future sequence embeddings to reduce teacher forcing & shortcut learning. Predict a learned …
10
47
221
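The tweet only gestures at the FSP mechanism, so here is a rough, hypothetical sketch (my own illustration, not the paper's actual formulation): one way to read "future summary prediction" is that each position gets a pooled embedding of the next k tokens as a target, supervised with an auxiliary regression loss alongside next-token prediction. The pooling choice (mean) and the stand-in predictions are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy token embeddings: T positions, d dimensions.
T, d, k = 32, 16, 4
embeddings = rng.normal(size=(T, d))

def future_summary_targets(emb, k):
    """One hypothetical reading of a 'future summary': for each
    position t, pool (mean) the embeddings of the next k tokens."""
    return np.stack([emb[t + 1 : t + 1 + k].mean(axis=0)
                     for t in range(len(emb) - k)])

targets = future_summary_targets(embeddings, k)

# Auxiliary regression loss against the future summaries, added
# alongside the usual next-token objective. The predictions here are
# random stand-ins for the output of a learned model head.
predicted = rng.normal(size=targets.shape)
aux_loss = float(((predicted - targets) ** 2).mean())
```

Because the target summarizes several future tokens at once, the model cannot rely purely on one-step teacher forcing, which is plausibly how such an objective would discourage shortcut learning.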
Excited to share our new work on the expressivity of Transformer-based multi-agent systems and understanding the trade-offs in communication, no. of agents, and achievable speedups. Work led by @frisbeemortel; check out his thread for details!
Is there such a thing as too many agents in multi-agent systems? It depends! Our work reveals 3 distinct regimes where communication patterns differ dramatically. More on our findings below. (1/7)
0
4
13
It's clear next-gen reasoning LLMs will run for millions of tokens. RL at 1M tokens needs ~100× the compute of 128K. Our Markovian Thinking keeps compute scaling linear instead. Check out Milad's thread; some of my perspectives below:
Introducing linear scaling of reasoning: The Markovian Thinker. Reformulate RL so thinking scales O(n) compute, not O(n^2), with O(1) memory, architecture-agnostic. Train R1-1.5B into a Markovian Thinker with a 96K thought budget, ~2X accuracy.
19
93
896
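A back-of-the-envelope cost model (my own illustration, not the paper's implementation) shows why bounding the attended context makes total reasoning compute linear in thought length:

```python
def quadratic_cost(n):
    # Standard long-chain reasoning: each new token attends to the
    # whole history, so total cost is 1 + 2 + ... + n = O(n^2).
    return n * (n + 1) // 2

def markovian_cost(n, chunk=8_192):
    # Bounded-context ("Markovian") reasoning: each token attends to
    # at most `chunk` tokens of carried-over state, so total cost is
    # O(n * chunk) = O(n). The chunk size is an arbitrary choice for
    # illustration, not the paper's setting.
    return n * chunk
```

Under this toy model, going from 128K to 1M tokens multiplies the quadratic cost by roughly (1M/128K)^2 ≈ 60×, the same order of magnitude as the ~100× the tweet cites, while the bounded-context cost grows only ~8×.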
I'm at CoLM this week! Come check out our work on evaluating RMs for agent trajectories! These days, I'm thinking about forecasting generalization, scaling laws, and safety/adversarial attacks. Ping me if you wanna chat about research!
I will be presenting AgentRewardBench at #COLM2025 next week! Session #3, Wednesday, 11am to 1pm, poster #545. Come learn more about the paper, my recent works, or just chat about anything (Montreal, Mila, etc.). Here's a teaser of my poster :)
0
5
7
Check out this new work on techniques for constructing Transformers for algorithmic tasks! Excited to have been part of this project!
We present The Transformer Cookbook: a collection of recipes for programming algorithms directly into transformers! Hungry for an induction head? Craving a Dyck language recognizer? We show you step-by-step how to cook up transformers for these algorithms and many more!
0
3
8
Here's a list of recommendations for what to do in Montreal during @COLM_conf and beyond. Link: https://t.co/9ixyy2Y7Yl Many thanks to my co-authors @benno_krojer and @frisbeemortel.
github.com
A list of things to do in Montréal.
Who will be at @COLM_conf ? I'm preparing a list of recommendations for what to do in beautiful Montreal. Stay tuned.
6
18
64
Exciting news! We're thrilled to announce the appointment of Professor @hugo_larochelle as Mila's new Scientific Director! A deep learning pioneer and former head of Google's AI lab in Montreal, Hugo will be pivotal in advancing AI for the benefit of all. Read the …
12
29
257
Come by our #ACL2025 poster tomorrow to discuss the safety risks surrounding increasingly capable instruction-following retrievers (or anything safety related)! 16:00-17:30 on Tuesday in Hall 4/5
Come and visit our poster on the Safety of Retrievers @aclmeeting. Tuesday, Findings Posters, 16:00-17:30. Instruction-following retrievers will become increasingly good tools for searching for harmful or sensitive information.
0
4
16
Come and visit our poster on the Safety of Retrievers @aclmeeting. Tuesday, Findings Posters, 16:00-17:30. Instruction-following retrievers will become increasingly good tools for searching for harmful or sensitive information.
Instruction-following retrievers can efficiently and accurately search for harmful and sensitive information on the internet! Retrievers need to be aligned too! Work done with the wonderful @ncmeade and @sivareddyg. Link: https://t.co/yLJPiy1d0j Thread below.
1
7
19
@aryopg Nice work! We observed a similar trend on certain math tasks in our work: https://t.co/hNlFcjKauc Section 4.1 has a discussion of our findings. You might want to consider citing it :) cc @saraveramarjano @arkil_patel @sivareddyg
0
5
13
If you're at ICML and you work on interpretability or causality, go talk to @_shruti_joshi_; she has a fantastic paper!
I will be at the Actionable Interpretability Workshop (@ActInterp, #ICML) presenting *SSAEs* in the East Ballroom A from 1-2pm. Drop by (or send a DM) to chat about (actionable) interpretability, (actionable) identifiability, and everything in between!
0
0
3
SafeArena is being presented at #ICML2025!! Check out our poster and talk to @ncmeade for all things "safety ∪ agents ∪ LLMs"!
I'll be at #ICML2025 this week presenting SafeArena (Wednesday 11AM - 1:30PM in East Exhibition Hall E-701). Come by to chat with me about web agent safety (or anything else safety-related)!
0
1
11
Congrats @vernadankers!! Weโre lucky to have you join our lab!
Congratulations Verna! This was one of the best theses I've ever read, I highly recommend checking out Verna's work on the tradeoffs between memorization and generalization in language models!
0
0
5
I miss Edinburgh and its wonderful people already!! Thanks to @tallinzen and @PontiEdoardo for inspiring discussions during the viva! I'm now exchanging Arthur's Seat for Mont Royal to join @sivareddyg's wonderful lab @Mila_Quebec
Huge congratulations to Dr. @vernadankers for passing her viva today! It's been an honour sharing the PhD journey with you. I wasn't ready for the void your sudden departure left (in the office and in my life!). Your new colleagues are lucky to have you! @Edin_CDT_NLP
11
11
100
"Build the web for agents, not agents for the web" This position paper argues that rather than forcing web agents to adapt to UIs designed for humans, we should develop a new interface optimized for web agents, which we call Agentic Web Interface (AWI).
9
59
197
Do LLMs hallucinate randomly? Not quite. Our #ACL2025 (Main) paper shows that hallucinations under irrelevant contexts follow a systematic failure mode, revealing how LLMs generalize using abstract classes + context cues, albeit unreliably. Paper: https://t.co/YEK4TaI7pq 1/n
6
26
44
New Paper! Tired of reasoning benchmarks full of math & code? In our work we consider the problem of reasoning about plot holes in stories -- inconsistencies in a storyline that break the internal logic or rules of a story's world. W/ @melaniesclar and @tsvetshop 1/n
3
54
262