Stéphane d'Ascoli Profile
Stéphane d'Ascoli (@stephanedascoli)

Followers: 1,048 · Following: 184 · Media: 19 · Statuses: 96

Research Scientist @AIatMeta working on neural decoding. Prev: AI4science fellow @EPFL, PhD in physics @ENS_ULM, astro @NASA.

Paris, France
Joined November 2018
Pinned Tweet
Stéphane d'Ascoli (@stephanedascoli) · 1 year
Think Transformers are terrible at logical reasoning? Think again 💥 In this collaboration with Samy Bengio, @jsusskin (Apple) & Emmanuel Abbé (EPFL), we show that when trained with Boolean inputs and symbolic outputs, they become very powerful 🧠 🧵⤵️
8 replies · 122 retweets · 612 likes
Stéphane d'Ascoli (@stephanedascoli) · 2 years
Today I met machine learning’s number one enemy #spuriouscorrelations
13 replies · 39 retweets · 606 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
Thrilled to announce that I will be joining @MetaAI next month as a Research Scientist 😍 I will be working in the Brain & AI team on decoding language from neural activity, to hopefully help those who have difficulty speaking or typing. Learn more here:
12 replies · 12 retweets · 274 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
🚨 ODEFormer is on arXiv! We show that Transformers can recover the differential equations governing dynamical systems from noisy & irregularly sampled trajectories. Very fun collaboration with @SorenBecker, @TrackingPlumes, @pschwllr & @k__niki! 🧵⤵️
3 replies · 52 retweets · 249 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
CNNs are more sample-efficient, ViTs are more powerful. Can we get the best of both worlds? Check out our paper accepted at @icmlconf. Thanks to my collaborators @HugoTouvron @leavittron @arimorcos @leventsagun @GiulioBiroli 🧵⤵️
3 replies · 28 retweets · 176 likes
Stéphane d'Ascoli (@stephanedascoli) · 2 years
After a couple of great years at @MetaAI and @ENS_ULM, I will be starting as an @AI4ScienceEPFL fellow next month 😍 Can’t wait to leverage modern AI tools with biologists, neuroscientists, chemists and physicists 🧬🧠🧪🔭 If you work at @EPFL and want to meet up, please reach out!
4 replies · 3 retweets · 85 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
1, 2, 3, 5, 8, 13… What is the next term? This kind of question is typical of IQ tests, but has received little attention in AI. We had great fun training Transformers to tackle this problem; check out our paper and our online demo:
Quoting Guillaume Lample @ ICLR 2024 (@GuillaumeLample) · 3 years
Deep Symbolic Regression for Recurrent Sequences -- We show that transformers are great at predicting symbolic functions from values, and can predict the recurrence relation of sequences better than Mathematica. You can try it here:
(quoted tweet: 25 replies · 162 retweets · 733 likes)
5 replies · 16 retweets · 74 likes
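To make the task concrete, here is a minimal sketch in plain Python (written for this page; the model itself is not involved) of what "predicting the recurrence relation" means: a candidate rule like u_n = u_{n-1} + u_{n-2} is checked against the observed terms and rolled forward to predict the next one.

```python
# A minimal sketch of the recurrence-prediction task: verify a candidate
# symbolic rule against observed terms, then use it to predict the next term.
seq = [1, 2, 3, 5, 8, 13]

# Candidate recurrence the model might output: u_n = u_{n-1} + u_{n-2}.
candidate = lambda prev1, prev2: prev1 + prev2

# Check the rule on every term that has two predecessors.
assert all(seq[i] == candidate(seq[i - 1], seq[i - 2]) for i in range(2, len(seq)))

print("next term:", candidate(seq[-1], seq[-2]))  # -> 21
```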
Stéphane d'Ascoli (@stephanedascoli) · 3 years
New preprint: when and how should you decay your learning rate? We give some theoretical insights on this crucial question in our latest work with @MariaRefinetti and @GiulioBiroli. (1/3)
3 replies · 11 retweets · 40 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
We hope this work can be applied to other fields in science and spark more research on symbolic reasoning in LLMs. We release our code & models publicly and provide a pip package & interactive Colab demo! A few attention maps for the pleasure of the eye:
0 replies · 2 retweets · 40 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
The so-called "Boolformer" takes as input a set of N (x, y) pairs in {0,1}^D × {0,1}, and tries to predict a Boolean formula which approximates these observations. Here are two very simple examples: addition and multiplication of 2-bit numbers.
1 reply · 6 retweets · 39 likes
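A minimal sketch of this input/output format in plain Python (written for this page, not taken from the released package): enumerate the (x, y) pairs for one output bit of 2-bit multiplication and check a candidate Boolean formula against them.

```python
# A minimal sketch of the Boolformer task setup: build the (x, y) pairs for
# 2-bit multiplication (the example from the tweet) and check a candidate
# Boolean formula against them. The formula below covers the lowest output
# bit only, which is simply the AND of the two low input bits.
from itertools import product

# Inputs live in {0,1}^D with D = 4: two 2-bit numbers (a1 a0) and (b1 b0).
pairs = []
for a1, a0, b1, b0 in product([0, 1], repeat=4):
    a = 2 * a1 + a0
    b = 2 * b1 + b0
    y = (a * b) & 1          # lowest bit of the product
    pairs.append(((a1, a0, b1, b0), y))

# Candidate formula the model might output for this bit: y = a0 AND b0.
formula = lambda a1, a0, b1, b0: a0 & b0

assert all(formula(*x) == y for x, y in pairs)
print(f"formula matches all {len(pairs)} observations")
```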
Stéphane d'Ascoli (@stephanedascoli) · 3 years
Great video by @ykilcher on our symbolic regression paper with @pa_kamienny @GuillaumeLample @f_charton. Watch until the end to discover his musical skills 😅
Quoting Yannic Kilcher 🇸🇨 (@ykilcher) · 3 years
📜Paper Video Time!📜 Today I'm talking to Stéphane d'Ascoli (@stephanedascoli) about Deep Symbolic Regression for Recurrent Sequences. This model is given a sequence of numbers, like 1, 2, 3, 5, 8 and it figures out the *rule behind* the sequence. Insane🤯
(quoted tweet: 4 replies · 50 retweets · 220 likes)
0 replies · 5 retweets · 31 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
This afternoon, my friend Arthur and I had the honour of being invited by Étienne Klein on France Culture for La Conversation Scientifique. An episode devoted entirely to our favourite subject, spacetime and its curvature. (Re)listen to it here!
1 reply · 0 retweets · 24 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
We apply the Boolformer to a set of classification tasks from PMLB, ranging from predicting chess moves to diagnosing horse colic. Our model achieves similar performance to classic ML methods, while outputting interpretable Boolean formulas!
1 reply · 1 retweet · 25 likes
Stéphane d'Ascoli (@stephanedascoli) · 4 years
Your overloaded weeks and tight end-of-month budgets are no longer an excuse not to take an interest in AI! With this new book, as concise as it is affordable, discover a new AI concept every day between two metro stops!
0 replies · 3 retweets · 20 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
We also applied the Boolformer to the task of gene regulatory network inference, which is central in biology. On a recent benchmark, our model is competitive with state-of-the-art genetic algorithms for Boolean modelling, while running several orders of magnitude faster!
1 reply · 1 retweet · 19 likes
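For readers unfamiliar with Boolean modelling of gene regulation, here is a minimal sketch (a toy network invented for this page, not taken from the benchmark): each gene is ON/OFF and updated by a Boolean function of its regulators, and recovering such update rules from observed state transitions is the inference task.

```python
# A minimal sketch of a Boolean gene regulatory network: genes are Boolean
# variables updated synchronously by Boolean functions of their regulators.
# The inference task is to recover these update rules from observed states.
def step(state):
    a, b, c = state["A"], state["B"], state["C"]
    return {
        "A": not c,          # C represses A
        "B": a and not c,    # A activates B, C represses it
        "C": a or b,         # either A or B activates C
    }

state = {"A": True, "B": False, "C": False}
for t in range(5):
    print(t, state)
    state = step(state)      # synchronous update; may settle into a cycle
```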
Stéphane d'Ascoli (@stephanedascoli) · 4 years
Double descent has recently become popular in deep learning, but a similar curve was observed in the 1990s for least squares. Wondering whether these kinds of overfitting are the same? Come see our Spotlight at #NeurIPS2020 and chat in the poster session!
0 replies · 4 retweets · 13 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
🧑‍🔬 We hope our method can guide the intuition of domain experts in many fields of the natural sciences. To facilitate this, we released ODEFormer & ODEBench publicly and built a pip package & interactive demo to help get started:
0 replies · 1 retweet · 13 likes
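For a feel of the problem setup before trying the package, here is a minimal sketch (the data generation is standard scipy; the fit/print usage in the final comment is an assumption about the interface, not the confirmed API): a noisy, irregularly sampled trajectory of a known ODE, which is the kind of input ODEFormer maps back to a symbolic equation.

```python
# A minimal sketch of the ODEFormer problem setup: generate a noisy,
# irregularly sampled trajectory from a known ODE; this (t, y) data is the
# kind of input the model maps back to a symbolic differential equation.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)

def logistic(t, y):
    return 0.5 * y * (1.0 - y)          # ground truth: y' = 0.5 y (1 - y)

t = np.sort(rng.uniform(0.0, 10.0, size=40))     # irregular sampling times
sol = solve_ivp(logistic, (0.0, 10.0), [0.1], t_eval=t)
y = sol.y[0] + 0.01 * rng.normal(size=t.shape)   # additive observation noise

# Hypothetical usage, mirroring the scikit-learn style the thread describes:
# model.fit(t, y); model.print() would ideally recover y' ≈ 0.5*y*(1 - y).
```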
Stéphane d'Ascoli (@stephanedascoli) · 2 years
Yeah, James Webb is nice, but did you know that you can produce these kinds of pictures using just… an iPhone (with a perfect night, long exposure and a bit of postprocessing)?! Taken in Pumalín National Park, southern Chile.
1 reply · 1 retweet · 9 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
📈 Given the limitations of the "Strogatz" benchmark for this task, we introduce ODEBench, a more extensive collection of dynamical systems curated from the literature. On both benchmarks, ODEFormer achieves SOTA, with fast inference and impressive robustness to noise!
1 reply · 0 retweets · 10 likes
Stéphane d'Ascoli (@stephanedascoli) · 2 years
Could Transformers become the gold standard for symbolic (and even non-symbolic) regression? Check out our latest paper to find out!
Quoting François Charton (@f_charton) · 2 years
Our new paper on Symbolic Regression with @pa_kamienny @stephanedascoli @GuillaumeLample is now on arXiv! We achieve performance comparable to SOTA genetic algorithms on SRBench with Transformers, whose inference time is orders of magnitude lower! 1/4
(quoted tweet: 3 replies · 27 retweets · 117 likes)
0 replies · 1 retweet · 7 likes
Stéphane d'Ascoli (@stephanedascoli) · 4 years
The book on relativity that Arthur Touati and I wrote is finally in our hands! If you loved Interstellar and want to plunge your head back into the stars, don't hesitate to pre-order it here: Official release on March 25 🚀🛰🧑‍🚀
1 reply · 0 retweets · 7 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
In convex problems, the best strategy is to decay as 1/time. What about non-convex problems? For random Gaussian losses on the sphere, we show that the optimal decay rate is smaller than one (0.5 in our setting). This could explain why the inverse square root schedule is so popular! (2/3)
1 reply · 1 retweet · 7 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
🚀 The ConViT benefits from vastly increased sample efficiency, without any sacrifice in maximal performance. We hope this model will spark more exploration of "soft" inductive biases, which make learning easier but vanish when not needed!
1 reply · 0 retweets · 5 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
We then study inference problems, where two phases emerge: a search phase, followed by a convergence phase once the signal is detected. Here, the optimal schedule is to keep a large constant learning rate to speed up the search, then decay as 1/time once in a convex basin. (3/3)
1 reply · 1 retweet · 5 likes
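A minimal sketch of the schedules this thread compares (the switch point in the search-then-converge variant is an arbitrary illustration, not a value from the paper): power-law decay eta_t = eta_0 / t^alpha with alpha = 1 (the classic convex rate) or alpha = 0.5 (inverse square root), versus a constant rate followed by 1/time decay.

```python
# A minimal sketch of the learning rate schedules discussed in this thread:
# power-law decay eta_t = eta_0 / t**alpha, and the search-then-converge
# variant (constant rate, then 1/time decay once "the signal is detected").
def power_law(eta0, alpha):
    return lambda t: eta0 / (t + 1) ** alpha

def search_then_converge(eta0, t_switch):
    # Constant during the search phase, 1/time once in the convex basin.
    return lambda t: eta0 if t < t_switch else eta0 * t_switch / t

schedules = {
    "1/t": power_law(0.1, 1.0),
    "1/sqrt(t)": power_law(0.1, 0.5),
    "constant-then-1/t": search_then_converge(0.1, 100),
}
for name, sched in schedules.items():
    print(name, [round(sched(t), 4) for t in (1, 10, 100, 1000)])
```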
Stéphane d'Ascoli (@stephanedascoli) · 4 years
[image]
0 replies · 0 retweets · 5 likes
Stéphane d'Ascoli (@stephanedascoli) · 4 years
@NicolasToueille @Aurelie_JEAN @ylecun Thank you very much! An honour to be among such big names 😇
1 reply · 0 retweets · 4 likes
Stéphane d'Ascoli (@stephanedascoli) · 2 years
Very honoured that our Voyage au Cœur de l'Atome has been recognised like this! 🤩 Thank you @PrixROBERVAL, @badry96 and @editionsfirst
Quoting Prix ROBERVAL (@PrixROBERVAL) · 2 years
Looking back at the Prix Roberval ceremony: congratulations to the Grand Public category laureate Aline Richard Zivohlava for her work "La Saga CRISPR", and to the media favourite in the Grand Public category, "Voyage au cœur de l'atome" by Adrien Bouscal and Stéphane d'Ascoli!
(quoted tweet: 0 replies · 2 retweets · 8 likes)
1 reply · 1 retweet · 4 likes
Stéphane d'Ascoli (@stephanedascoli) · 3 years
💡 The ConViT uses Gated Positional Self-Attention (GPSA) layers, which are initialized to mimic convolutions, then let each attention head learn more complex relationships through a learnable gating parameter.
1 reply · 0 retweets · 4 likes
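A minimal sketch of that gating idea (simplified for this page: the tensor shapes and the construction of the positional scores are illustrative assumptions, not the exact ConViT implementation): each head mixes convolution-like positional attention with content attention through a per-head learnable gate.

```python
# A minimal sketch of GPSA-style gating: per head, mix fixed positional
# attention (conv-like at init) with content attention via a learnable gate.
# Values/output projection are omitted; only attention weights are returned.
import torch
import torch.nn as nn

class GatedAttention(nn.Module):
    def __init__(self, dim, n_heads, n_tokens):
        super().__init__()
        self.n_heads = n_heads                 # assumes dim % n_heads == 0
        self.qk = nn.Linear(dim, 2 * dim)
        # Fixed positional scores (in ConViT these encode relative patch
        # positions so each head starts out attending like a conv kernel).
        self.pos_scores = nn.Parameter(torch.randn(n_heads, n_tokens, n_tokens))
        # Gate lambda, one per head, initialised so sigmoid(lambda) is small
        # and the positional (convolutional) term dominates early training.
        self.gate = nn.Parameter(torch.full((n_heads,), -2.0))

    def forward(self, x):                      # x: (batch, tokens, dim)
        b, n, d = x.shape
        q, k = self.qk(x).chunk(2, dim=-1)
        q = q.view(b, n, self.n_heads, -1).transpose(1, 2)
        k = k.view(b, n, self.n_heads, -1).transpose(1, 2)
        scale = q.shape[-1] ** 0.5
        content = (q @ k.transpose(-2, -1) / scale).softmax(dim=-1)
        positional = self.pos_scores.softmax(dim=-1)
        g = torch.sigmoid(self.gate).view(1, -1, 1, 1)
        # Each head learns its own trade-off between the two attention modes.
        return g * content + (1 - g) * positional
```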
Stéphane d'Ascoli (@stephanedascoli) · 3 years
🧑‍🔬 Hybrid models are a good compromise, but the optimal architecture is very task-dependent. What if we let each layer decide whether to perform convolutions or self-attention? This is the idea behind the ConViT, an “adaptive” hybrid model!
1 reply · 0 retweets · 3 likes
Stéphane d'Ascoli (@stephanedascoli) · 2 years
@np_hard Yes it is, Chiloé Island! How on earth did you recognise it?
1 reply · 0 retweets · 2 likes
Stéphane d'Ascoli (@stephanedascoli) · 1 year
@sirbayes @jsusskin Not directly with this model (it doesn’t have numbers in its vocabulary), but we considered real-valued inputs in previous work on SR, both for 1D recurrent sequences () and multidimensional point clouds () 🙂
1 reply · 0 retweets · 2 likes
Stéphane d'Ascoli (@stephanedascoli) · 2 years
@KrzakalaF Haha thanks, that’s the next level after forgetting email attachments 😂
0 replies · 0 retweets · 1 like
Stéphane d'Ascoli (@stephanedascoli) · 1 year
@sirbayes @jsusskin Would be cool to try and build a multimodal symbolic Transformer!
0 replies · 0 retweets · 1 like
Stéphane d'Ascoli (@stephanedascoli) · 2 years
Come visit us at @icmlconf on Wednesday evening!
Quoting François Charton (@f_charton) · 2 years
The source code for our ICML 2022 paper Deep Learning for Recurrent Sequences () is now available on . Spotlight: Wednesday 20, 16:50 ET. Poster session: Wednesday 20, 18:30 ET. @stephanedascoli @pa_kamienny @GuillaumeLample
(quoted tweet: 1 reply · 2 retweets · 17 likes)
0 replies · 0 retweets · 1 like
Stéphane d'Ascoli (@stephanedascoli) · 10 months
@francoisfleuret Stochastic method: pick a learning rate eps and initialise m = x_0. Then for each x_i: if x_i > m, m += eps, otherwise m -= eps. You can decay the learning rate, etc.
0 replies · 0 retweets · 1 like
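That update rule is a streaming estimator of the median (the 0.5 quantile). A minimal sketch, with an illustrative 1/sqrt(i) decay for eps (the tweet leaves the decay unspecified):

```python
# A minimal sketch of the streaming median estimator described above: nudge
# the estimate m up by eps when a sample exceeds it, down otherwise. With a
# decaying eps this is a stochastic approximation of the 0.5 quantile.
import random

def streaming_median(stream, eps0=1.0):
    it = iter(stream)
    m = next(it)                      # initialise m = x_0
    for i, x in enumerate(it, start=1):
        eps = eps0 / i ** 0.5         # decay the learning rate
        m += eps if x > m else -eps
    return m

random.seed(0)
data = [random.gauss(10.0, 3.0) for _ in range(100_000)]
print(streaming_median(data))         # ≈ 10.0, the true median
```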
Stéphane d'Ascoli (@stephanedascoli) · 4 years
@YaniKhezzar Glad you enjoyed it, thank you very much ☺️
0 replies · 0 retweets · 1 like
Stéphane d'Ascoli (@stephanedascoli) · 1 year
@mariabrbic I’ll miss you all too!
0 replies · 0 retweets · 1 like