Shenghao Yang Profile
Shenghao Yang

@shenghao_yang

Followers: 181
Following: 283
Media: 6
Statuses: 50

Postdoc @UCBStatistics and International Computer Science Institute (ICSI). Prev @UWCheritonCS @AmazonScience

Berkeley, CA
Joined February 2021
@VITAGroupUT
VITA Group
2 months
🎉 Huge congratulations to PhD student Peihao Wang (@peihao_wang ) on two major honors: 🏆 2025 Google PhD Fellowship in Machine Learning & ML Foundations 🌟 Stanford Rising Star in Data Science Incredibly proud of Peihao's outstanding achievements! 🔶⚡
2
1
16
@kfountou
Kimon Fountoulakis
7 months
Positional Attention is accepted at ICML 2025! Thanks to all co-authors for the hard work (64 pages). If you’d like to read the paper, check the quoted post. That's a comprehensive study on the expressivity for parallel algorithms, their in- and out-of-distribution learnability,
@kfountou
Kimon Fountoulakis
10 months
Positional Attention: Expressivity and Learnability of Algorithmic Computation (v2) We study the effect of using only fixed positional encodings (referred to as positional attention) in the Transformer architecture for computational tasks. These positional encodings remain the
1
10
46
@tydsh
Yuandong Tian
10 months
Our new work Spectral Journey https://t.co/1C4Hrxb2Ig shows a surprising finding: when a 2-layer Transformer is learned to predict the shortest path of a given graph, 1️⃣it first implicitly computes the spectral embedding for each edge, i.e. eigenvectors of Normalized Graph
arxiv.org
Decoder-only transformers lead to a step-change in capability of large language models. However, opinions are mixed as to whether they are really planning or reasoning. A path to making progress...
8
91
468
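The "spectral embedding" referred to in the tweet above means eigenvectors of the normalized graph Laplacian. As a point of reference only (this is my own toy illustration of that quantity, not the authors' probing code, and the function name and toy graph are assumptions), a minimal sketch:

```python
import numpy as np

def normalized_laplacian_embedding(A, k=2):
    """Return the k eigenvectors of L = I - D^{-1/2} A D^{-1/2}
    with the smallest nonzero eigenvalues (a standard spectral embedding)."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]             # skip the trivial eigenvector

# Toy example: a 5-node path graph; an edge embedding can be formed by
# combining the embeddings of the edge's two endpoints.
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0
print(normalized_laplacian_embedding(A, k=2))
```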
@aseemrb
Aseem Baranwal
1 year
My PhD thesis is now available on UWspace: https://t.co/YrdI3Nupjq. Thanks to my advisors @kfountou and Aukosh Jagannath for their support throughout my PhD. We introduce a statistical perspective for node classification problems. Brief details are below.
3
2
8
@PetarV_93
Petar Veličković
1 year
"Energy continuously flows from being concentrated, to becoming dispersed, spread out, wasted and useless." ⚡➡️🌬️ Sharing our work on the inability of softmax in Transformers to _robustly_ learn sharp functions out-of-distribution. Together w/ @cperivol_ @fedzbar & Razvan!
11
81
476
@kfountou
Kimon Fountoulakis
1 year
Positional Attention: Out-of-Distribution Generalization and Expressivity for Neural Algorithmic Reasoning We propose calculating the attention weights in Transformers using only fixed positional encodings (referred to as positional attention). These positional encodings remain
10
61
308
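A minimal sketch of the positional-attention idea described above, under my own assumptions about names and shapes (not the authors' code): attention weights are computed only from fixed positional encodings, while the values still come from the input features.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def positional_attention_layer(X, P, Wq, Wk, Wv):
    """One attention layer whose scores depend only on the fixed
    positional encodings P, never on the input features X.

    X : (n, d) node/token features (used only for the values)
    P : (n, p) fixed positional encodings, identical for every input X
    """
    scores = (P @ Wq) @ (P @ Wk).T / np.sqrt(Wq.shape[1])  # positions only
    A = softmax(scores, axis=-1)                            # attention weights
    return A @ (X @ Wv)                                     # values from features

rng = np.random.default_rng(0)
n, d, p, h = 8, 4, 6, 4
X = rng.normal(size=(n, d))
P = rng.normal(size=(n, p))
Wq, Wk, Wv = rng.normal(size=(p, h)), rng.normal(size=(p, h)), rng.normal(size=(d, h))
print(positional_attention_layer(X, P, Wq, Wk, Wv).shape)  # (8, 4)
```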
@kfountou
Kimon Fountoulakis
1 year
I wrote a blog @Medium on "Random Data and Graph Neural Networks" Link: https://t.co/4rq6vkQFm0 I cover a range of topics: 1. How a single averaging graph convolution changes the mean and variance of the data. 2. How it improves linear classification. 3. How multiple
medium.com
The purpose of this blog is to summarize our understanding of the effects of the graph convolution operation(s) on a simple random data…
0
26
149
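As a rough numerical illustration of point 1 in the blog outline above (how a single averaging graph convolution changes the mean and variance of the data), here is a small sketch; the random-graph setup and all names are my own assumptions, not the blog's code.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000        # number of nodes
deg = 20        # expected degree of the random graph

# One-dimensional Gaussian features with mean 1 and variance 1
X = rng.normal(loc=1.0, scale=1.0, size=(n, 1))

# Erdos-Renyi-style adjacency without self-loops
A = (rng.random((n, n)) < deg / n).astype(float)
A = np.triu(A, 1); A = A + A.T
deg_vec = np.maximum(A.sum(axis=1, keepdims=True), 1.0)

# One averaging graph convolution: D^{-1} A X
X_conv = (A @ X) / deg_vec

print("mean before/after:", X.mean(), X_conv.mean())  # mean is roughly preserved
print("var  before/after:", X.var(),  X_conv.var())   # variance shrinks roughly by 1/deg
```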
@backdeluca
Artur
1 year
For those participating in the Complex Networks in Banking and Finance Workshop, I’ll be presenting our work on Local Graph Clustering with Noisy Labels tomorrow at 9:20 AM EDT at the Fields Institute. Hope to see you there :) https://t.co/hzXIlTyKWt
arxiv.org
The growing interest in machine learning problems over graphs with additional node information such as texts, images, or labels has popularized methods that require the costly operation of...
0
4
4
@kfountou
Kimon Fountoulakis
2 years
Paper: Analysis of Corrected Graph Convolutions We study the performance of a vanilla graph convolution from which we remove the principal eigenvector to avoid oversmoothing. 1) We perform a spectral analysis for k rounds of corrected graph convolutions, and we provide results
0
4
22
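A minimal sketch, under my own assumptions, of the "corrected" convolution described above: the component along the principal eigenvector of the normalized adjacency is projected out, and the resulting operator is applied for k rounds. This is only an illustration of the idea, not the paper's exact operator or analysis.

```python
import numpy as np

def corrected_graph_convolution(A, X, k=2):
    """Apply k rounds of a normalized-adjacency convolution with the
    principal eigenvector removed, to mitigate oversmoothing."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(d), 0.0)
    M = d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]   # D^{-1/2} A D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(M)                 # ascending eigenvalues
    v = eigvecs[:, -1:]                                  # principal eigenvector
    M_corr = M - eigvals[-1] * (v @ v.T)                 # remove its component
    for _ in range(k):
        X = M_corr @ X
    return X

rng = np.random.default_rng(0)
A = (rng.random((30, 30)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T
X = rng.normal(size=(30, 4))
print(corrected_graph_convolution(A, X, k=3).shape)  # (30, 4)
```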
@kfountou
Kimon Fountoulakis
2 years
.@backdeluca is at ICLR and he will present his joint work with @shenghao_yang on "Local Graph Clustering with Noisy Labels". Date: Friday 10th of May. Time: 4:30pm - 6:30pm CEST. Place: Halle B #175.
0
3
16
@zdeborova
Lenka Zdeborova
2 years
Emergence in LLMs is a mystery. Emergence in physics is linked to phase transitions. We identify a phase transition between semantic and positional learning in a toy model of dot-product attention. Very excited about this one! https://t.co/ALb9D8YdfP
16
258
1K
@TheSIAMNews
SIAM
2 years
The November issue of SIAM News is now available! In this month's edition, @n_veldt finds that even a seemingly minor generalization of the standard #hypergraph cut penalty yields a rich space of theoretical questions and #complexity results. Check it out! https://t.co/Tv00MWd0B3
0
7
16
@kfountou
Kimon Fountoulakis
2 years
Graph Attention Retrospective is live at JMLR https://t.co/sB7nHPIp9F. The revised version has additional results: 1) Beyond perfect node classification, we provide a positive result on graph attention’s robustness against structural noise in the graph. In particular, our
@kfountou
Kimon Fountoulakis
4 years
New paper "Graph Attention Retrospective". One of the most popular type of models is graph attention networks. These models were introduced to allow a node to aggregate information from the features of neighbor nodes in a non-uniform way https://t.co/6atbcVaEDs
0
12
81
@aseemrb
Aseem Baranwal
3 years
Here's our new work on the optimality of message-passing architectures for node classification on sparse feature-decorated graphs! Thanks to my advisors and co-authors @kfountou and Aukosh Jagannath. Details within the quoted tweet.
@kfountou
Kimon Fountoulakis
3 years
Paper: Optimality of Message-Passing Architectures for Sparse Graphs. Work by @aseemrb. arXiv link: https://t.co/99Isy4Ul1n. I have been teaching a graduate course on graph neural networks this year. Close to the end of the course, many students noticed that all proposed
0
1
8
@kfountou
Kimon Fountoulakis
3 years
Alright, I have some important news (at least for me). Now there exists an accelerated personalized PageRank method which is strongly local!! Its running time does not depend on the size of the graph but rather only on the number of nonzeros at https://t.co/YIsYZy6msn
3
16
82
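For context on "strongly local": the classic, non-accelerated push procedure for approximate personalized PageRank already has this property, since its work is bounded by the residual mass it touches near the seed rather than by the size of the graph. Below is a minimal sketch of that well-known baseline (Andersen-Chung-Lang-style push), not the accelerated method announced above; the toy graph and names are my own.

```python
from collections import defaultdict

def ppr_push(adj, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank via local push.
    adj: dict node -> list of neighbors. Only nodes near the seed are touched."""
    p = defaultdict(float)                 # approximate PageRank vector
    r = defaultdict(float, {seed: 1.0})    # residual mass
    queue = [seed]
    while queue:
        u = queue.pop()
        deg = len(adj[u])
        if r[u] < eps * deg:
            continue
        p[u] += alpha * r[u]
        push = (1 - alpha) * r[u] / (2 * deg)
        r[u] = (1 - alpha) * r[u] / 2      # lazy-walk variant keeps half at u
        if r[u] >= eps * deg:
            queue.append(u)
        for v in adj[u]:
            old = r[v]
            r[v] += push
            if old < eps * len(adj[v]) <= r[v]:
                queue.append(v)
    return dict(p)

# Toy graph: a triangle attached to a short path
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
print(ppr_push(adj, seed=0))
```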
@siam_acda
SIAM ACDA
3 years
SIAM Conference on Applied and Computational Discrete Algorithms (ACDA23) May 31 – June 2, 2023 Seattle, Washington, U.S. New submission due dates: Registering a submission: Jan 16; Paper submission deadline: Jan 23.
0
4
4
@siam_acda
SIAM ACDA
3 years
SIAM Conference on Applied and Computational Discrete Algorithms (ACDA23), May 31 – June 2, 2023 https://t.co/a89KZh1IKn Important dates: Short Abstract and Submission Registration: Jan 9, 2023 Papers and Presentations-without-papers: Jan 16, 2023 #SIAMACDA23
0
6
7
@kfountou
Kimon Fountoulakis
3 years
Open problem: accelerated methods for l1-regularized PageRank. https://t.co/UQjajkD9v8
0
3
18
@kfountou
Kimon Fountoulakis
4 years
Does it matter where you place the graph convolutions (GCs) in a deep network? How much better is a deep GCN vs an MLP? When are 2 or 3 GCs better than 1 GC? We answer those for node classification in a nonlinearly separable contextual stochastic block model. https://t.co/LVXHk2xDX9.
2
11
71
@HannesStaerk
Hannes Stark @ NeurIPS
4 years
New video with Prof. @kfountou explaining his paper "Graph Attention Retrospective" is now available! https://t.co/ZEPOIOvLkJ Check it out to learn what GATs can and cannot learn for node classification in a stochastic block model setting!
0
7
32