Somnath Banerjee
@Somnath49347260
Followers: 0
Following: 3
Media: 0
Statuses: 3
IITian, Music Lover. “Dream is not that which you see while sleeping; it is something that does not let you sleep.” ― A P J Abdul Kalam
India
Joined May 2020
New paper in #AAAI2025 #AlignmentTrack: "SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models". Link: https://t.co/uHiq55OfmN Outcome of a Microsoft Academic Partnership Grant. @RimaHazra2 @Somnath49347260 @cnerg
arxiv.org
Safety-aligned language models often exhibit fragile and imbalanced safety mechanisms, increasing the likelihood of generating unsafe content. In addition, incorporating new knowledge through...
New paper in @icwsm'25 on "How (un)ethical are instruction-centric responses of LLMs?" Link: https://t.co/FT6R1QvsIR Among 131 submissions, 12 were accepted. Ours is one of these! Dataset: https://t.co/elVgp517ej
@RimaHazra2 @Somnath49347260 Sayan Layek.
arxiv.org
In this study, we tackle a growing concern around the safety and ethical use of large language models (LLMs). Despite their potential, these models can be tricked into producing harmful or...
🎉 Thrilled to announce that I have a paper accepted at EMNLP 2024 Main! "Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations" (https://t.co/9mYl3hR7j6)
#EMNLP2024 #NLProc #AISafety #sutdsg
arxiv.org
Ensuring the safe alignment of large language models (LLMs) with human values is critical as they become integral to applications like translation and question answering. Current alignment methods...