Shubhendu Trivedi (@_onionesque)
9K followers · 25K following · 2K media · 23K statuses
Cultivated Abandon. Twitter interests: Machine learning research, applied mathematics, mathematical miscellany, ML for physics/chemistry, books.
New York, NY / Cambridge, MA · Joined October 2008
Results seem reminiscent of those in this excellent thesis (quite a bit of it seems to have been motivated originally by studying misspecification in SBI).
0 replies · 0 retweets · 2 likes
Haven't dug into it carefully, but curious about this.
arxiv.org
This paper introduces a novel methodology for constructing multiclass ROC curves using the multidimensional Gini index. The proposed methodology leverages the established relationship between the...
0 replies · 0 retweets · 3 likes
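For context on what that abstract is gesturing at: in the binary case the Gini index and the ROC AUC are linked by Gini = 2·AUC − 1, which is the relationship the paper generalizes to the multiclass setting. A minimal sketch of the standard one-vs-rest version on toy data (plain scikit-learn usage, not the paper's multidimensional construction):

```python
# Toy illustration of Gini = 2*AUC - 1, via one-vs-rest multiclass AUC in
# scikit-learn. Labels and scores below are random placeholders.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 3, size=300)          # three toy classes
y_score = rng.dirichlet(np.ones(3), size=300)  # toy class probabilities (rows sum to 1)

auc = roc_auc_score(y_true, y_score, multi_class="ovr")  # macro-averaged OvR AUC
gini = 2 * auc - 1                                       # ~0 here, since scores are random
print(f"OvR AUC: {auc:.3f} -> Gini: {gini:.3f}")
```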
It's emblematic of that entire school of combinatorics. "Every graph w/ min degree ≥3 contains a cycle whose length is a power of 2" seems elementary, but also maps to a v. specific structure for a minimal hypothesis on the graph. It looks checkable, but the search space grows explosively.
0 replies · 0 retweets · 1 like
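To make the "looks checkable, but grows explosively" point concrete, here is a brute-force sketch: enumerate every simple cycle of a small graph and test whether some length is a power of 2. The graph and helper names are illustrative, and the enumeration is exponential in graph size, which is exactly the explosion in question.

```python
# Brute-force check of the cycle-length statement on one small graph:
# does some simple cycle have length 2^k? Exponential-time by construction.
def has_power_of_two_cycle(adj):
    """adj: dict vertex -> set of neighbours (undirected simple graph)."""
    def is_pow2(n):
        return n >= 2 and (n & (n - 1)) == 0

    lengths = []

    def dfs(start, v, path, visited):
        for w in adj[v]:
            if w == start and len(path) >= 3:
                lengths.append(len(path))         # closed a simple cycle
            elif w not in visited and w > start:  # canonical start = min vertex
                visited.add(w)
                dfs(start, w, path + [w], visited)
                visited.remove(w)

    for s in sorted(adj):
        dfs(s, s, [s], {s})
    return any(is_pow2(n) for n in lengths)

# Toy check: K4 has minimum degree 3 and contains 4-cycles (4 = 2^2).
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(has_power_of_two_cycle(k4))  # True
```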
Talking of Erdős conjectures, my favourite is actually this one ( https://t.co/pohocFUOUm ; it's a trap for computer searches), despite its humble and insignificant-seeming (i.e., stamp-collecting) appearance. It's not the one you might think (the Erdős conjecture on arithmetic progressions).
en.wikipedia.org
2 replies · 0 retweets · 2 likes
We discovered that language models leave a natural "signature" on their API outputs that's extremely hard to fake. Here's how it works 🔍 📄 https://t.co/Yc7mnhZS96 1/
arxiv.org
The ubiquity of closed-weight language models with public-facing APIs has generated interest in forensic methods, both for extracting hidden model details (e.g., parameters) and for identifying...
3 replies · 22 retweets · 115 likes
To be clear, I think this might already be possible to some extent with creative searching (+ expertise to guide the search and recognize matches). This is because, on going through the Erdős list, some of the matches, even though roughly from the same area, already seem quite non-trivial.
0 replies · 0 retweets · 2 likes
While the search around Erdős problems is already very impressive, one could say that a clear indicator of a step up (or many steps up) in model capabilities would be when we start hearing of them being able to search for similar solutions across sub-fields of mathematics.
1 reply · 0 retweets · 5 likes
Suggests a target for an ML version of mathematical creativity: an LLM capable of locating “analogies between analogies” (for example, reducing problems in graph theory to problems in topology, and in doing so, uncovering existing but unrecognized solutions across fields).
Update: Mehtaab and I pushed further on this. Using thousands of GPT5 queries, we found solutions to 10 Erdős problems that were listed as open: 223, 339, 494, 515, 621, 822, 883 (part 2/2), 903, 1043, 1079. Additionally for 11 other problems, GPT5 found significant partial
2 replies · 0 retweets · 6 likes
A brief thread on said book:
To motivate the book further: Niyogi here studies language dynamics at a macroscopic level, and uses tools from learning theory and statistical physics for the purpose. Language acquisition is seen as a mapping onto some "grammar", which uses some specific learning mechanism. >>
0 replies · 0 retweets · 1 like
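For a flavour of what "language dynamics at a macroscopic level" means operationally, here is a toy iterated map in that spirit. The specific majority-rule learner below is an illustrative assumption, not Niyogi's actual model: a fraction x of the population speaks grammar G1, each child hears n utterances and adopts whichever grammar supplied the majority, and the fraction evolves generation by generation.

```python
# Toy two-grammar population dynamic: x_{t+1} = P(majority of n utterances
# heard by a child came from G1 speakers). Illustrative, not Niyogi's model.
from math import comb

def next_fraction(x, n=7):
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) for k in range(n // 2 + 1, n + 1))

x = 0.55                      # slight initial majority for G1
for _ in range(20):
    x = next_fraction(x)
print(round(x, 4))            # -> 1.0: the majority grammar goes to fixation
```

The map has stable fixed points at 0 and 1 and an unstable one at 1/2; that kind of bifurcation structure is what tools from statistical physics get applied to here.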
A paper Mircea Petrache reminded me of. This thread is a bit too neat, but neat regardless. Partha Niyogi had a short but perceptive discussion of this stuff in his language evolution book, a lot of which hasn't quite been followed through.
1 reply · 0 retweets · 3 likes
The question, however, is how to scale them efficiently. This is quite an interesting technical challenge.
0 replies · 0 retweets · 1 like
Interesting study around scaling and equivariance
arxiv.org
We present an empirical study in the geometric task of learning interatomic potentials, which shows equivariance matters even more at larger scales; we show a clear power-law scaling behaviour...
1 reply · 6 retweets · 28 likes
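On "clear power-law scaling behaviour": the standard read-off is a straight line in log-log space, i.e. loss ≈ a·N^(−α). A minimal sketch on synthetic numbers (the exponent and data are placeholders, not the paper's):

```python
# Fit loss ~ a * N^(-alpha) by linear regression in log-log space.
import numpy as np

n_params = np.array([1e5, 1e6, 1e7, 1e8, 1e9])                # synthetic model sizes
noise = np.exp(np.random.default_rng(0).normal(0, 0.01, 5))
loss = 3.2 * n_params ** -0.12 * noise                        # synthetic losses

alpha, log_a = np.polyfit(np.log(n_params), np.log(loss), 1)  # slope = -alpha
print(f"fitted exponent: {alpha:.3f} (generating value: -0.12)")
```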
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,
663 replies · 3K retweets · 24K likes
A new library for Kohonen self-organizing maps
github.com
TorchSOM is a PyTorch-based library for training Self-Organizing Maps (SOMs), a model trained in an unsupervised manner that can be used for clustering, dimensionality reduction and data visualiza...
0 replies · 0 retweets · 7 likes
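For anyone who hasn't met SOMs: the core training loop is short. The sketch below is a generic from-scratch SOM in PyTorch, not TorchSOM's actual API (which I haven't checked): find each sample's best-matching unit, then pull that unit and its grid neighbours toward the sample with a shrinking neighbourhood radius and learning rate.

```python
# Minimal from-scratch self-organizing map in PyTorch (illustrative only).
import torch

def train_som(data, rows=10, cols=10, epochs=20, lr0=0.5, sigma0=3.0):
    d = data.shape[1]
    weights = torch.rand(rows * cols, d)
    # (row, col) coordinate of every unit, for grid-distance computations
    grid = torch.stack(torch.meshgrid(torch.arange(rows), torch.arange(cols),
                                      indexing="ij"), dim=-1).reshape(-1, 2).float()
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)              # decaying learning rate
        sigma = sigma0 * (1 - epoch / epochs) + 1e-3 # shrinking neighbourhood
        for x in data[torch.randperm(len(data))]:
            bmu = torch.argmin(((weights - x) ** 2).sum(dim=1))  # best-matching unit
            g = torch.exp(-((grid - grid[bmu]) ** 2).sum(dim=1) / (2 * sigma ** 2))
            weights += lr * g.unsqueeze(1) * (x - weights)       # neighbourhood update
    return weights.reshape(rows, cols, d)

som = train_som(torch.rand(500, 3))  # toy: organize random RGB colours on a 10x10 map
```

The 10×10 grid of 3-vectors ends up as a smooth colour map: nearby units encode nearby inputs, which is what makes SOMs useful for visualization.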
🚨 Attention aspiring PhD students 🚨 Meta / FAIR is looking for candidates for a joint academic/industry PhD! Keywords: AI for Math & Code. LLMs, RL, formal and informal reasoning. You will be co-advised by Prof. @Amaury_Hayat from École des Ponts and yours truly. You'll have
24 replies · 120 retweets · 898 likes
Proof that the above is not just talk: I used it as a baseline in a 2014 UAI paper. It somehow never managed to escape containment, perhaps due to poor integration into popular libraries, somewhat weird conceptual positioning, and sociological factors around its origin and diffusion.
1 reply · 0 retweets · 3 likes
The Relief algorithm (and its variants like ReliefF and RReliefF) for feature selection and importance is quite powerful (read: useful), compared to the usual filter/wrapper methods anyway, and yet I've met exactly one person who had heard of it beforehand.
3 replies · 0 retweets · 8 likes
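Since so few people seem to have met it, the core Relief update is worth sketching. This is a minimal version of the original binary-class Relief for numeric features (not the full ReliefF/RReliefF machinery); the function names and toy data are illustrative.

```python
# Minimal original Relief (binary classes, numeric features): for each
# sampled instance, reward features that separate it from its nearest miss
# and penalize features that differ from its nearest hit.
import numpy as np

def relief(X, y, n_samples=100, rng=None):
    rng = np.random.default_rng(rng)
    X = (X - X.min(0)) / (X.max(0) - X.min(0) + 1e-12)  # scale diffs to [0, 1]
    w = np.zeros(X.shape[1])
    for i in rng.choice(len(X), size=n_samples):
        dists = np.abs(X - X[i]).sum(1)
        dists[i] = np.inf                                # exclude the instance itself
        same, diff = (y == y[i]), (y != y[i])
        hit = np.argmin(np.where(same, dists, np.inf))   # nearest same-class neighbour
        miss = np.argmin(np.where(diff, dists, np.inf))  # nearest other-class neighbour
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / n_samples
    return w  # larger weight = more relevant feature

# Toy check: only feature 0 carries the label; it should get the top weight.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.c_[y + rng.normal(0, 0.1, 200), rng.normal(size=(200, 3))]
print(relief(X, y).round(3))
```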
🧵Our paper "A Compressive-Expressive Communication Framework for Compositional Representations" was accepted at NeurIPS! @fidelrio @MirceaSci @denisparra
https://t.co/v4RtiSxujk
arxiv.org
Compositional generalization--the ability to interpret novel combinations of familiar elements--is a hallmark of human cognition and language. Despite recent advances, deep neural networks still...
1 reply · 3 retweets · 9 likes