Jwala Dhamala
@jwaladhamala
Followers: 258 · Following: 592 · Media: 3 · Statuses: 154
Scientist at Alexa AI-NU (she/her) #MachineLearning #NLProc #Fairness #RobustAI #DeepLearning #UncertaintyQuantification
Boston, MA
Joined June 2011
Gen AI is reshaping how software is built. The 2026 #AmazonNovaAIChallenge invites university teams to advance trusted, agentic AI: systems that code, test & deploy safely. Applications open Nov 10, 2025:
amazon.science
The challenge pushes teams to demonstrate measurable gains in secure-coding performance while building AI agents that advance real-world utility and reliability at scale.
Excited to share our collaborative work on “simulating and understanding deceptive behaviors of LLMs in long-horizon interactions” https://t.co/cew9GdZ3bj
#ResponsibleAI #AIResearch #Deception @AmazonScience
arxiv.org
Deception is a pervasive feature of human communication and an emerging concern in large language models (LLMs). While recent studies document instances of LLM deception under pressure, most...
Deception is one of the most concerning behaviors that advanced AI systems can display. If you are not concerned yet, this paper might change your view. We built a multi-agent framework to study: 👉 How deceptive behaviors can emerge and evolve in LLM agents during realistic
Rigorous agent evaluation starts with robust benchmarks that can accurately measure an agent's capabilities and flaws. Check out our recent work led by @daniel_d_kang, @maxYuxuanZhu on best practices for building rigorous agentic benchmarks:
As AI agents near real-world use, how do we know what they can actually do? Reliable benchmarks are critical but agentic benchmarks are broken! Example: WebArena marks "45+8 minutes" on a duration calculation task as correct (real answer: "63 minutes"). Other benchmarks
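The "45+8 minutes" example above comes down to how a benchmark's answer checker grades free-form agent output. The toy Python sketch below is only an illustration (it is neither WebArena's evaluator nor the paper's code; the function names are made up): a lenient keyword-style check lets the unevaluated expression through, while a stricter check parses and evaluates the arithmetic before comparing it with the gold duration.

```python
# Toy illustration only: not WebArena's actual evaluator or the paper's code.
# It shows why lenient answer checking can mark a wrong duration as correct.
import re

def lenient_check(prediction: str) -> bool:
    # Accepts anything that mentions minutes and contains digits, which is
    # roughly how overly fuzzy matching lets "45+8 minutes" slip through.
    return "minute" in prediction.lower() and bool(re.search(r"\d", prediction))

def strict_check(prediction: str, gold_minutes: int) -> bool:
    # Pull out the arithmetic expression (e.g. "45+8"), evaluate it, and
    # compare the result against the gold duration.
    match = re.search(r"[\d+\-* ]+", prediction)
    if not match:
        return False
    try:
        value = eval(match.group(0))  # safe here: digits and operators only
    except SyntaxError:
        return False
    return value == gold_minutes

prediction = "45+8 minutes"
print(lenient_check(prediction))     # True  -> false positive (45+8 = 53)
print(strict_check(prediction, 63))  # False -> correctly rejected
```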
✨Interested in becoming a reviewer for the fifth TrustNLP workshop at #NAACL2025? Sign up using the form 👇 https://t.co/f6ULyoHu5B
docs.google.com
Thanks for your interest in being a reviewer for the Trustworthy NLP Workshop @ NAACL 2025! Important Dates February 7, 2025: Workshop Paper Due Date (Direct Submission via OpenReview) February 20,...
We are proud to announce the AMI Dataset, a benchmark for fine-grained insect identification in the wild - published this week at ECCV and the product of a global consortium of computer scientists and entomologists. https://t.co/4d70PnWtWx 🧵
📢Please join us tomorrow for the #NAACL2024 @trustnlp workshop, starting at 9AM in Don Alberto 4 ~ 📅 Workshop schedule found below! 🌐But if you can't make it, don't worry - TrustNLP accepted papers are now up on our website, check it out 😊 https://t.co/qXIaevtR9K
📢 NAACL Findings are welcome to submit (non-archival) to the #NAACL2024 @trustnlp workshop! Accepted papers will be presented alongside archival submissions. If interested, please fill out the form below ⬇️ ! 📝 Form: https://t.co/2nlxuxPQoB 🗓️ Deadline: May 1 2024
I will be at ICML co-organizing this workshop. Feel free to reach out if you want to chat or make new friends 😊
There are only 10 days left until our workshop! We can't wait to see you all there 🤩 Check our website for the latest schedule and details of the talks: https://t.co/TJV69acHBA Don't miss out on our amazing lineup of speakers who will share with you the latest in conversational AI!
@WilliamBarrHeld and I will be presenting Multi-VALUE in-person today at #ACL2023 with @JingfengY, @jwaladhamala, @Diyi_Yang. Time: TODAY (July 12th) 11:00-12:30 ET MiniConf: https://t.co/uvpgpbTlFQ Use Multi-VALUE to eval dialect performance gaps + increase dialect robustness!
Multi-VALUE is a toolkit to evaluate and mitigate performance gaps in NLP systems for multiple English dialects. We release scalable tools for introducing language variation, which you can use to stress test your models and increase their robustness https://t.co/WpXtpsiyT2 🧵
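As a rough sketch of the stress-testing pattern described above (this is not Multi-VALUE's actual API; `transform_to_dialect`, `predict_label`, and the toy data are placeholders), one can compare a model's accuracy on original inputs against dialect-transformed versions to surface a performance gap:

```python
# Hypothetical sketch of dialect stress testing; transform_to_dialect and
# predict_label stand in for a real perturbation toolkit and a real model.
from typing import Callable, List, Tuple

def dialect_gap(
    examples: List[Tuple[str, str]],             # (sentence, gold label)
    predict_label: Callable[[str], str],         # model under test
    transform_to_dialect: Callable[[str], str],  # dialect rewriter
) -> Tuple[float, float]:
    # Accuracy on the original sentences vs. on dialect-transformed ones.
    orig_hits = sum(predict_label(s) == y for s, y in examples)
    dial_hits = sum(predict_label(transform_to_dialect(s)) == y for s, y in examples)
    n = len(examples)
    return orig_hits / n, dial_hits / n

if __name__ == "__main__":
    toy = [("she is going home", "statement")]
    dummy_model = lambda s: "statement"      # dummy classifier
    identity = lambda s: s                   # dummy "transformation"
    print(dialect_gap(toy, dummy_model, identity))  # (1.0, 1.0)
```

A large difference between the two accuracies signals that the model is not robust to the language variation introduced by the transformation.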
Introducing our second speaker at @trustnlp #ACL2023NLP! Ramprasaath is a Scientist at Apple TDG (Technology Development Group). Prior to this, he was a Sr. Research Scientist at Salesforce.
Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models https://t.co/XiMAdWiYXe by @NinarehMehrabi et al. including @aram_galstyan, @Rahul_tragu
#Computation #Language
deepai.org
11/17/22 - Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambig...
I’m at #ACL2023 this week. @uclanlp members 🐻 and my collaborators at @AmazonScience will present the following papers at the conference on topics around trustworthy NLP, vision-language, and language+reasoning. See details at https://t.co/j6qjI67ljj 🧵👇
Excited to announce our third speaker at @trustnlp 2023 #ACL2023NLP! Rachel Rudinger (@rachelrudinger) is an Assistant Professor in the Department of Computer Science at the University of Maryland, College Park.
Revealing our first speaker at TrustNLP #ACL2023! Hal Daumé III (@haldaume3) is a Volpi-Cupal Endowed Professor in Computer Science and Language Science at the University of Maryland; he has a joint appointment as a Senior Principal Researcher at Microsoft Research.
TrustNLP 2023 @ACL is providing grants this year too. Don't forget to apply.
We have extended the deadline to June 22. Please apply if you haven't already.
We are excited to be back this year at ACL for the Third Edition of the Workshop on Trustworthy NLP (TrustNLP). Stay tuned for more updates!