
Anuj Diwan
@anuj_diwan
Followers: 761
Following: 2K
Media: 21
Statuses: 240
PhD Student @UTCompSci. Prev. Student Researcher @GoogleDeepmind, FAIR (@metaai), @AdobeResearch. 2021 BTech CSE @iitbombay. Interests: NLP, ASR, ML. 🇮🇳🇺🇸
Austin + Mumbai
Joined May 2014
RT @ZEYULIU10: LLMs trained to memorize new facts can’t use those facts well. 🤔 We apply a hypernetwork to ✏️edit✏️ the gradients for fact…
0
61
0
RT @ForbesIndia: A pioneer in machine learning, Sunita Sarawagi has transformed how computers process unstructured data through innovations…
0
5
0
RT @ForbesIndia: Preethi Jyothi is advancing speech and language technologies to make AI more inclusive for low-resource Indian languages…
0
5
0
RT @EliasEskin: Extremely excited to announce that I will be joining @UTAustin @UTCompSci in August 2025 as an Assistant Professor! 🎉 I’m…
0
66
0
RT @ManyaWadhwa1: Evaluating language model responses on open-ended tasks is hard! 🤔 We introduce EvalAgent, a framework that identifies n…
0
35
0
RT @ramya_namuduri: Have that eerie feeling of déjà vu when reading model-generated text 👀, but can’t pinpoint the specific words or phrase…
0
17
0
RT @Jess_Riedel: Scott Aaronson announces he's building an Open-Phil backed AI alignment group at UT Austin. (🔗 below.) Prospective postd…
0
44
0
RT @PuyuanPeng: Announcing the new SotA voice-cloning TTS model: 𝗩𝗼𝗶𝗰𝗲𝗦𝘁𝗮𝗿 ⭐️. VoiceStar is autoregressive, voice-cloning, robu…
0
61
0
RT @mina1004h: Recent AI models can suggest endless video edits, offering many alternatives to video creators. But how can we easily sift t…
0
20
0
If you'd like an open-source text-to-speech model that follows your style instructions, consider using our ParaSpeechCaps-based model! Model: Paper:
Three new state-of-the-art audio models in the API: 🗣️ Two speech-to-text models, outperforming Whisper. 💬 A new TTS model: you can instruct it *how* to speak. 🤖 And the Agents SDK now supports audio, making it easy to build voice agents. Try TTS now at
1
5
42
RT @ai4bharat: 🚀 AI4Bharat: Advancing Indian Language AI - Open & Scalable! 🇮🇳✨ Over the past 4 years, we at AI4Bharat have been on a miss…
0
90
0
RT @berraksismann: Exciting News! 😊 INTERSPEECH 2028 will take place at the River Walk in San Antonio, Texas! ✨ I’m honored to serve as one o…
0
10
0
RT @ArxivSound: "Scaling Rich Style-Prompted Text-to-Speech Datasets," Anuj Diwan, Zhisheng Zheng, David Harwath, Eunsol Choi, https://t.…
0
3
0
Thanks to my amazing collaborators @zszheng147, @eunsolc and David Harwath! Paper: Code: Dataset: Model: Demo: HF Space:
0
0
6
RT @brunchavecmoi: Can we generate long text from compressed KV cache? We find existing KV cache compression methods (e.g., SnapKV) degrade…
0
27
0