Samuel Vaiter

@vaiter

Followers
3K
Following
2K
Media
225
Statuses
884

@CNRS Researcher in maths & computer science. My (current) focus is machine learning and optimization.

Nice, France
Joined September 2012
@vaiter
Samuel Vaiter
9 days
At the same time, I cannot write this message without saying that I am **very** worried about the current administrative evolution of our profession, which is quickly moving towards a system absolutely not calibrated to our duties, driven by a purely accounting point of view.
0
0
2
@vaiter
Samuel Vaiter
9 days
(*) directeur de recherche is roughly equivalent to full prof w/o mandatory teaching duty in France. This is an incredible tenured civil servant position, and I am deeply grateful for it. DM me if you have questions about the process to enter CNRS.
1
0
1
@vaiter
Samuel Vaiter
9 days
Happy to share that I have been promoted to senior research scientist (directeur de recherche (*)) at @cnrs effective today! Grateful to my amazing students, collaborators and mentors who made this journey possible. I will continue my research at the math lab of UniCA (LJAD).
4
0
23
@n_keriven
Nicolas Keriven
2 months
I'm BEYOND THRILLED to tell you that our first original song is now available on all platforms! Plz share if you like it πŸ™πŸΌπŸ™πŸΌπŸ˜Š Cheers
0
1
3
@AmbroiseOdonnat
Ambroise Odonnat
3 months
πŸš€ NeurIPS@Paris is back for a 5th edition at the SCAI.Β Meet us in central Paris to discuss recent advances in AI! πŸ“† 25th & 26th Nov. 2025 🌐 https://t.co/tgZJ1h2hIF πŸŽ“ Committee: ChloΓ©-Agathe Azencott, @BachFrancis, Claire Boyer, @gerardbiau , @VianneyPerchet , @jeanphi_vert
0
10
34
@vaiter
Samuel Vaiter
3 months
Described by Philip Gage https://t.co/CQcfVjERt6 as a compression mechanism in 1994 in The C Users Journal
0
0
1
@vaiter
Samuel Vaiter
3 months
Byte Pair Encoding is a tokenization method that starts with all characters as the initial tokens. It iteratively merges the most frequent adjacent byte pair in the text, adding each merged token to the vocabulary until a predefined size is reached. The output is a sequence of subword tokens.
1
1
21
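The merge loop described above can be sketched in a few lines. This is a toy illustration of the idea, not an actual tokenizer implementation; the function name and stopping rule are my own choices:

```python
from collections import Counter

def byte_pair_encoding(text, vocab_size):
    """Toy BPE: start from characters, repeatedly merge the most
    frequent adjacent pair until the vocabulary reaches vocab_size."""
    tokens = list(text)                     # initial tokens = characters
    vocab = set(tokens)
    while len(vocab) < vocab_size:
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:                       # nothing left worth merging
            break
        merged, i = [], 0
        while i < len(tokens):              # greedy left-to-right merge
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (a, b):
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
        vocab.add(a + b)
    return tokens, vocab

tokens, vocab = byte_pair_encoding("aaabdaaabac", vocab_size=10)
# tokens → ["aaab", "d", "aaab", "a", "c"]
```

The example string is the one Gage used in the original article; the repeated block "aaab" ends up as a single token.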
@vaiter
Samuel Vaiter
3 months
q: can we find somewhere the missed attempts to solve pb6 from either google or openai?
0
0
2
@vaiter
Samuel Vaiter
3 months
SIAM Review on this algorithm (and variants) by Bauschke & Borwein (1996)
1
0
5
@vaiter
Samuel Vaiter
3 months
Cyclic projection onto convex sets is a method to find a point in the intersection of convex sets. It iteratively projects onto each set in a cycle, and it converges to a point in the intersection under restrictive assumptions.
2
8
81
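The method above is easy to demo numerically. A minimal sketch, assuming two simple convex sets (a unit ball and a half-space) with nonempty intersection; the projection helpers are standard closed-form formulas, not code from any particular library:

```python
import numpy as np

def project_ball(x, radius=1.0):
    # Euclidean projection onto the ball {x : ||x|| <= radius}
    n = np.linalg.norm(x)
    return x if n <= radius else radius * x / n

def project_halfspace(x, a, b):
    # Euclidean projection onto the half-space {x : <a, x> <= b}
    v = a @ x - b
    return x if v <= 0 else x - v * a / (a @ a)

# Cyclically project onto each set in turn; the iterates converge
# to a point in the (nonempty) intersection.
a, b = np.array([1.0, 1.0]), 0.5
x = np.array([3.0, -2.0])
for _ in range(200):
    x = project_halfspace(project_ball(x), a, b)

assert np.linalg.norm(x) <= 1 + 1e-6   # inside the ball
assert a @ x <= b + 1e-6               # inside the half-space
```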
@vaiter
Samuel Vaiter
3 months
Introduced by Vaswani et al. "Attention is all you need" https://t.co/xdNmX4lAoU Early occurrence for NLP in Bahdanau et al. https://t.co/N928Oo47v0 *Incredible* explanation by @3blue1brown https://t.co/u1gjBicyE7 Annotated (PyTorch) version of the paper
0
1
10
@vaiter
Samuel Vaiter
3 months
A single-head unmasked attention layer processes a sequence by assigning relevance scores between elements. It computes these scores to "decide" how much focus each part of the sequence should receive and combines the information.
2
46
267
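The scoring-and-combining step described above amounts to a softmax over scaled dot products. A minimal numpy sketch (random weights, purely illustrative; the shapes and names are my own):

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def single_head_attention(X, Wq, Wk, Wv):
    """Unmasked single-head attention: scores = Q K^T / sqrt(d),
    weights = row-softmax(scores), output = weights @ V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d))
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                  # sequence of 5 tokens, dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out, w = single_head_attention(X, Wq, Wk, Wv)
assert out.shape == (5, 4)
assert np.allclose(w.sum(axis=1), 1.0)       # each row is a distribution
```

Each row of `w` says how much the corresponding token "attends" to every other token in the sequence.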
@vaiter
Samuel Vaiter
3 months
Congratulations to Dr. Sophie Jaffard @SophieJaffard for a brilliant PhD defense! Patricia and I were very lucky to have you as a PhD student, and I am sure you have a brilliant career ahead of you.
1
6
72
@vaiter
Samuel Vaiter
4 months
Paper by Hammersley (1956)
projecteuclid.org
0
0
7
@vaiter
Samuel Vaiter
4 months
Zeros of random polynomials with i.i.d. coefficients exhibit interesting convergence properties. In particular, Gaussian coefficients lead to weak convergence of the empirical measure of the roots to the uniform measure on the unit circle.
5
36
352
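The clustering of roots near the unit circle is easy to observe numerically. A quick sketch with `numpy.roots` (the degree and the median-based check are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
deg = 200
coeffs = rng.normal(size=deg + 1)   # i.i.d. standard Gaussian coefficients
roots = np.roots(coeffs)

# Most roots concentrate near the unit circle: |z| ≈ 1
median_modulus = np.median(np.abs(roots))
assert 0.9 < median_modulus < 1.1
```

Plotting `roots` in the complex plane makes the ring-shaped concentration immediately visible.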
@vaiter
Samuel Vaiter
4 months
Paper:
0
2
9
@vaiter
Samuel Vaiter
4 months
The Neural Probabilistic Language Model (Bengio et al., 2000) introduced the joint use of word embeddings and neural networks for next-token prediction. It replaced n-gram models by mapping words to a continuous vector space and using a standard neural network architecture.
10
66
401
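The embed-then-predict pipeline can be sketched with random, untrained parameters. This is an NPLM-style toy in the spirit of the paper, not a reproduction of its exact architecture; all sizes and names are my own:

```python
import numpy as np

rng = np.random.default_rng(0)
V, d, ctx, h = 50, 16, 3, 32        # vocab, embed dim, context, hidden

# Parameters of a tiny NPLM-style model (illustrative, untrained)
C = rng.normal(size=(V, d))          # word embedding table
W1 = rng.normal(size=(ctx * d, h))   # hidden layer
W2 = rng.normal(size=(h, V))         # output layer over the vocabulary

def next_token_probs(context_ids):
    # Look up embeddings, concatenate, pass through a one-hidden-layer MLP
    x = C[context_ids].reshape(-1)           # (ctx * d,)
    z = np.tanh(x @ W1) @ W2                 # logits over the vocabulary
    e = np.exp(z - z.max())                  # stable softmax
    return e / e.sum()

p = next_token_probs([3, 17, 42])
assert p.shape == (V,) and np.isclose(p.sum(), 1.0)
```

Training the embedding table `C` jointly with the MLP weights is what lets similar words end up with similar vectors.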
@vaiter
Samuel Vaiter
4 months
ResNet and Neural ODEs are closely related: ResNet uses discrete residual/skip connections, while Neural ODEs generalize this to continuous transformations using ODEs. Neural ODEs *can* be seen as the limit of ResNet as the number of layers approaches infinity.
3
45
399
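The ResNet/Neural-ODE correspondence above is just explicit Euler: a residual block x ← x + h·f(x) with step h = T/L is the Euler scheme for dx/dt = f(x) on [0, T]. A minimal sketch with a toy vector field of my choosing:

```python
import numpy as np

def f(x):
    # Shared residual function / ODE vector field (toy smooth map)
    return -np.tanh(x)

def resnet_forward(x, n_layers, T=1.0):
    """n_layers residual blocks x <- x + h * f(x) with h = T / n_layers:
    exactly the explicit Euler scheme for dx/dt = f(x) on [0, T]."""
    h = T / n_layers
    for _ in range(n_layers):
        x = x + h * f(x)
    return x

x0 = np.array([2.0, -1.0])
coarse = resnet_forward(x0, 4)       # shallow ResNet
deep = resnet_forward(x0, 2048)      # very deep ResNet
fine = resnet_forward(x0, 4096)      # proxy for the continuous ODE flow

# As depth grows, the discrete iterates approach the continuous flow
assert np.linalg.norm(deep - fine) < np.linalg.norm(coarse - fine)
```

Increasing `n_layers` shrinks the step size, which is precisely the "infinite depth" limit mentioned in the tweet.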