Tahmid Rahman
@tahmedge
Followers: 238
Following: 8K
Media: 36
Statuses: 963
Senior Applied Scientist (NLP & ML) @ Dialpad
Toronto, Canada
Joined November 2016
Introducing RL Visualizer. See PPO and GRPO mentioned everywhere but don't know what actually makes them different? Visualize and compare these algorithms in a simple online maze environment! 🚀
LLM as a judge has become a dominant way to evaluate how good a model is at solving a task, since it works without a test set and handles cases where answers are not unique. But despite how widely this is used, almost all reported results are highly biased. Excited to share our
If you're at EMNLP 2025, do catch the poster presentation of one of my first-author papers on Friday at 7 pm. I'm missing EMNLP this year since I'm presenting at IEEE VIS.
@emnlpmeeting / #EMNLP2025 Accepted Paper: From Charts to Fair Narratives: Uncovering and Mitigating Geo-Economic Biases in Chart-to-Text 📝 Paper: https://t.co/pi7XC1djQx This paper presents the first large-scale investigation of geo-economic biases in Vision-Language Models
Not attending #EMNLP2025 in person this time. Those who are interested in LLM research (from training to evaluation to application) can check out our papers.
I'm in Vienna to present our paper at IEEE VIS 2025. If you're attending, be sure to catch it tomorrow at Hall E, 11.45 am. Check out the paper at: https://t.co/svn4p6Omzy
#IEEEVIS2025
arxiv.org
Information visualizations are powerful tools that help users quickly identify patterns, trends, and outliers, facilitating informed decision-making. However, when visualizations incorporate...
Excited to announce that our paper has been selected for a Best Paper Award at IEEE VIS 🏆 I would like to extend my gratitude to my co-authors, especially my supervisor Dr. @Enamul_Hoque. This achievement would not have been possible without their support. #IEEEVIS2025
🚀 Introducing LongCat-Flash-Omni — a 560B-parameter (27B activated) open-source omni-modal MoE model, excelling at real-time audio-visual interaction. Built on LongCat-Flash’s high-performance shortcut-connected MoE architecture with zero-computation experts, plus efficient
Excited about multimodal LLMs for visualization? Join my #MLLM4Vis tutorial at @ieeevis — Mon, Nov 3 · 09:00–12:30 (Room 1.61 + 1.62)! We’ll explore vision-language models, chart reasoning & agentic systems and more. 🔗 https://t.co/iuNeKkehIE
#IEEEVIS
🚀We are excited to introduce the Tool Decathlon (Toolathlon), a benchmark for language agents on diverse, complex, and realistic tool use. ⭐️32 applications and 600+ tools based on real-world software environments ⭐️Execution-based, reliable evaluation ⭐️Realistic, covering
This paper asks when LLMs can be trusted to judge mental health replies. It found that LLMs systematically overrate replies, especially on empathy and helpfulness. Even when the ranking order matched human experts, the actual scores were too high, which means models look better
New #NeurIPS2025 paper: how should we evaluate machine learning models without a large, labeled dataset? We introduce Semi-Supervised Model Evaluation (SSME), which uses labeled and unlabeled data to estimate performance! We find SSME is far more accurate than standard methods.
Our paper “Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models” has been accepted to EMNLP 2025 (Industry Track)! 🎉
ChartJudge-2B accepted to EMNLP 2025 Industry Track! Deploying Tiny LVLM Judges for Real-World Evaluation of Chart Models: Lessons Learned and Best Practices Paper: https://t.co/oit8t2gmqw ChartJudge-2B: a 2B-parameter model fine-tuned on synthetic judgments that matches 7B
Today, @GoogleResearch announced DeepSomatic, a new machine learning model developed with our partners, including @ucscgenomics and @ChildrensMercy, that accurately identifies genetic variants in cancer cells — a critical step for delivering more precise treatments for patients.
blog.google
An overview of DeepSomatic, a new AI tool that helps identify complex genetic variants in cancer cells.
3. DACP (EMNLP 2025 NewSumm Workshop) — Domain-adaptive continual pre-training for summarizing phone conversations at scale.
2. DACIP-RC (EMNLP 2025 Industry Track) — Domain-adaptive continual instruction pre-training via reading comprehension on real business conversations.
1. AI Knowledge Assist (EMNLP 2025 Industry Track) — An automated pipeline to build high-quality knowledge bases for conversational AI agents.
We’re pushing the frontier of LLMs at Dialpad, from pre-training to real-world agentic AI systems. Check out the preprints of our team's recently accepted #EMNLP2025 papers on LLM pre-training and agentic AI for real-world use cases. #GenerativeAI #LLM #AgenticAI #NLP #Dialpad
I spoke to Saleh yesterday. We were working on getting more fuel for bulldozers to clear out the streets. Israel has now funded and armed actual terrorist groups to cause havoc across Gaza. And they are acting on it. They likely ordered this assassination.
🚨 BREAKING: Palestinian journalist and activist Saleh al-Ja’frawi has been confirmed killed in Gaza. Al-Ja’frawi had previously received direct Israeli threats as part of a campaign targeting journalists who exposed the Israeli army’s crimes during the war. Initial reports
If true/confirmed, horrific and heartbreaking. He documented the genocide for two long years, while being smeared and lied about by pro-Israel people. To be killed now, on the verge of a possible end to the genocide, so awful.
Journalist Saleh Al-Ja'frawi has reportedly been killed in the Al-Sabra neighborhood of Gaza City. He was known for documenting Gaza's pain and resilience through his powerful visuals and words - a voice of truth in one of the world's most dangerous places for journalists.