
Benjamin Van Durme (@ben_vandurme)
Followers: 1K · Following: 212 · Media: 5 · Statuses: 165
From now on in my advising meetings, any negative result will be met with my response of "think deeper".
We significantly increased the rate limits for the reasoning model by popular demand. If correctness is really important for you, ask the model to “think deeper” or select “GPT-5 Thinking” in the model picker; this uses a higher reasoning effort than when you are auto-switched.
Ettin: a two-headed giant … language model.
en.wikipedia.org
Special thanks to @jhuclsp for amazing collaborators Kathryn Ricci @ruyimarone @ben_vandurme @lawrie_dawn, and LightOn with @antoine_chaffin! And this project wouldn't exist without the efforts of ModernBERT (@benjamin_warner @bclavie @jeremyphoward, many more), so 🙏 them also.
Will continues to drive great work on the modular use of adapters: from the security benefits of AdapterSwap, to RE-Adapting, to the COLM '25 SpectR that enables this new result, LAG.
arxiv.org
Training large, general-purpose language models poses significant challenges. The growing availability of specialized expert models, fine-tuned from pretrained models for specific tasks or...
Check out the paper w/ @ben_vandurme, now on arXiv.
RT @JohnCLangford: A new opening for multimodal model research. Please apply if interested.
RT @EYangTW: 🚨Wouldn’t it be nice if your agentic search system could reason over all your docs? ✨Introducing Rank-K, a listwise reranker…
RT @satyanadella: 2. Copilot Tuning: Copilot can now learn your company’s unique tone and language. It is all about taking that expertise y…
RT @willcfleshman: 🚨 Our latest paper is now on arXiv! 👻 (w/ @ben_vandurme). SpectR: Dynamically Composing LM Experts with Spectral Routing…
RT @alexdmartin314: Wish you could get a Wikipedia-style article for unfolding events? Introducing WikiVideo: a new multimodal task and be…
RT @mustafasuleyman: You can't just be right, you have to know you're right. Good advice for LLMs, according to new Johns Hopkins research…
See here for more details. Code and models to be released soon as part of a further announcement. w/ Vivek Chari and @hiaxui.
arxiv.org
Sequence-to-sequence tasks often benefit from long contexts, but the quadratic complexity of self-attention in standard Transformers renders this non-trivial. During generation, temporary...
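The abstract above attributes the difficulty of long contexts to the quadratic complexity of self-attention. A minimal arithmetic sketch of that scaling (illustrative only, not the linked paper's method):

```python
def attention_scores(n: int) -> int:
    """Every query attends to every key, so the score matrix has n * n entries."""
    return n * n

# Doubling the context length quadruples the attention cost:
print(attention_scores(1024))                            # 1048576
print(attention_scores(2048) // attention_scores(1024))  # 4
```

This is why compressing the context (as in the work announced above) pays off superlinearly: halving the effective sequence length cuts the attention cost by a factor of four.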
This follows our earlier compression work: • CCoT • Speech Decoders • Text Decoders • Text Encoders • Propositions!
aclanthology.org
Rachel Rudinger, Kevin Duh, Benjamin Van Durme. Proceedings of the 12th International Conference on Computational Semantics (IWCS) — Short papers. 2017.
RT @zhengping_jiang: 1/ 🚨LLMs can still be factual even when they don’t know the full answer!🚨 Introducing Conformal Linguistic Calibration…
RT @orionweller: Ever wonder how test-time compute would do in retrieval? 🤔 Introducing ✨rank1✨. rank1 is distilled from R1 & designed for…
RT @harsh_jhamtani: With PeopleJoin, our new benchmark, we study LM agents as coordinators to gather distributed insights and empower colla…