Hongkang Li

@LiHongkang_jntm

Followers: 38
Following: 87
Media: 5
Statuses: 24

Ph.D. student at Rensselaer Polytechnic Institute

Troy, NY
Joined October 2019
@LiHongkang_jntm
Hongkang Li
4 months
🔥Our #ICLR2025 Oral paper "When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers" will be presented on 04/26, 4:18 p.m. to 4:30 p.m., at Garnet 216-218. The poster presentation will be on 04/26, 10:00 a.m. to 12:30 p.m., at board #341.
Tweet media one
2
1
13
@LiHongkang_jntm
Hongkang Li
4 months
🔥Our #ICLR2025 poster paper "Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis" will be presented on 04/25, 3:00 p.m. to 5:30 p.m., at board #342. Link to the paper:
Tweet media one
0
0
8
@LiHongkang_jntm
Hongkang Li
4 months
Here is the link to the paper. For Chinese readers, here is another summary.
0
1
1
@LiHongkang_jntm
Hongkang Li
4 months
This work: 1) characterizes the effectiveness of task addition and negation in multi-task learning and unlearning, respectively; 2) proves out-of-domain generalization guarantees with task arithmetic; 3) justifies low-rank approximation and model pruning for task vectors.
0
0
1
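As a rough illustration of the setting above (not the paper's code; the variable names, dimensions, and scaling factors are made up for this sketch), here is a minimal NumPy example of task arithmetic: a task vector is the difference between fine-tuned and pre-trained weights, addition merges tasks, negation unlearns one, and a magnitude-pruned task vector stands in for the sparse/low-rank approximations discussed in the paper.

```python
import numpy as np

# Toy sketch of task arithmetic on flat weight vectors (illustrative only;
# `pretrained`, `finetuned_a`, `finetuned_b` are hypothetical models).
rng = np.random.default_rng(0)
dim = 8
pretrained = rng.normal(size=dim)
finetuned_a = pretrained + rng.normal(scale=0.1, size=dim)  # tuned on task A
finetuned_b = pretrained + rng.normal(scale=0.1, size=dim)  # tuned on task B

# Task vector: fine-tuned weights minus pre-trained weights.
tau_a = finetuned_a - pretrained
tau_b = finetuned_b - pretrained

alpha = 0.5
multi_task_model = pretrained + alpha * (tau_a + tau_b)  # task addition (multi-task learning)
unlearned_model = pretrained - alpha * tau_a             # task negation (unlearning task A)

# Pruned approximation of a task vector: keep only the largest-magnitude entries.
k = 4
mask = np.zeros(dim)
mask[np.argsort(np.abs(tau_a))[-k:]] = 1.0
pruned_model = pretrained + alpha * (tau_a * mask)
```

For weight matrices rather than flat vectors, the analogous low-rank approximation would truncate the SVD of the task vector instead of masking entries.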
@LiHongkang_jntm
Hongkang Li
7 months
This work has been accepted by #ICLR2025. Please see this link. We will update our final version soon.
openreview.net
Chain-of-Thought (CoT) is an efficient prompting method that enables the reasoning ability of large language models by augmenting the query using multiple examples with multiple intermediate steps....
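As a toy illustration of the prompting setup described in that summary (the example questions below are made up, not from the paper), here is a short sketch of how a query is augmented with worked examples that each contain intermediate steps:

```python
# Toy illustration of few-shot chain-of-thought prompting: the query is
# augmented with examples that spell out intermediate reasoning steps.
examples = [
    {
        "question": "A farm has 3 pens with 4 hens each. How many hens?",
        "steps": ["3 pens * 4 hens = 12 hens"],
        "answer": "12",
    },
    {
        "question": "Tom reads 5 pages a day for 6 days. How many pages?",
        "steps": ["5 pages/day * 6 days = 30 pages"],
        "answer": "30",
    },
]

def build_cot_prompt(query: str) -> str:
    """Concatenate worked examples (with intermediate steps) before the query."""
    parts = []
    for ex in examples:
        steps = "\n".join(ex["steps"])
        parts.append(f"Q: {ex['question']}\n{steps}\nA: {ex['answer']}")
    # The model is expected to continue with its own steps, then the answer.
    parts.append(f"Q: {query}\n")
    return "\n\n".join(parts)

print(build_cot_prompt("A box holds 2 rows of 7 eggs. How many eggs?"))
```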
@LiHongkang_jntm
Hongkang Li
10 months
Q3: When and why is CoT better than ICL? A3: By studying inference with (contained) erroneous steps, we show that successful ICL needs one additional condition, namely that the correct input-output examples are dominant, while CoT does not require it. [4/n]
0
0
0
@LiHongkang_jntm
Hongkang Li
10 months
Q2: What is the mechanism of CoT? A2: Under a simplified model and data formulation, we prove that Transformers implement few-shot CoT by attending from the current query to the most similar inference step in the previous examples. [3/n]
1
0
0
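A small numerical sketch of this claimed mechanism, using assumed toy embeddings rather than the paper's data model: dot-product attention from the current query concentrates on the context step most similar to it.

```python
import numpy as np

# Toy sketch: the current query attends to the most similar inference step
# among the context examples (embeddings are random stand-ins).
rng = np.random.default_rng(1)
d = 16
context_steps = rng.normal(size=(6, d))                  # steps from previous examples
query = context_steps[3] + 0.05 * rng.normal(size=d)     # query closest to step 3

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

# Dot-product attention scores between the query and each context step.
scores = context_steps @ query / np.sqrt(d)
weights = softmax(scores)
attended = weights @ context_steps   # output is dominated by the most similar step

print("attention weights:", np.round(weights, 3))
print("most attended step:", int(weights.argmax()))      # expected: 3
```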
@LiHongkang_jntm
Hongkang Li
10 months
Q1: Can a Transformer be trained to implement CoT provably? A1: Yes. Under a certain data and task formulation, we provide a theoretical analysis of training a one-layer nonlinear Transformer to implement chain-of-thought on out-of-domain tasks with a generalization guarantee. [2/n]
1
0
0
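For intuition only, here is a minimal PyTorch sketch of the kind of model referred to above: a one-layer Transformer with softmax attention and a ReLU nonlinearity, trained by SGD on a synthetic task where the label depends on the context step most similar to the query. The data generation and hyperparameters are assumptions for illustration, not the paper's construction.

```python
import torch

# Toy sketch: one-layer Transformer (softmax attention + ReLU readout).
torch.manual_seed(0)
d, n_ctx, n_samples = 8, 5, 256

class OneLayerTransformer(torch.nn.Module):
    def __init__(self, d):
        super().__init__()
        self.WQ = torch.nn.Linear(d, d, bias=False)
        self.WK = torch.nn.Linear(d, d, bias=False)
        self.WV = torch.nn.Linear(d, d, bias=False)
        self.out = torch.nn.Linear(d, 1)

    def forward(self, ctx, query):
        # ctx: (batch, n_ctx, d), query: (batch, d)
        q = self.WQ(query).unsqueeze(1)                        # (batch, 1, d)
        scores = (q @ self.WK(ctx).transpose(1, 2)) / d ** 0.5  # (batch, 1, n_ctx)
        attn = torch.softmax(scores, dim=-1)
        h = (attn @ self.WV(ctx)).squeeze(1)                   # (batch, d)
        return self.out(torch.relu(h)).squeeze(-1)

# Synthetic data: the label is a fixed linear function of the context step
# closest to the query (loosely mimicking the CoT-style setting).
ctx = torch.randn(n_samples, n_ctx, d)
idx = torch.randint(n_ctx, (n_samples,))
query = ctx[torch.arange(n_samples), idx] + 0.05 * torch.randn(n_samples, d)
w_star = torch.randn(d)
y = ctx[torch.arange(n_samples), idx] @ w_star

model = OneLayerTransformer(d)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(ctx, query), y)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.4f}")
```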
@LiHongkang_jntm
Hongkang Li
10 months
🚀Excited to share our new preprint on the theoretical analysis of the training and generalization of chain-of-thought. The arXiv link can be found at [link]. We have the following results. [1/n]
1
1
1
@LiHongkang_jntm
Hongkang Li
1 year
Our follow-up work on LLM theory, on the learning and generalization mechanism of Chain-of-Thought (CoT), will be presented over the next two days at the @icmlconf workshops. 1. Fri 26 Jul., Straus 2, HiLD Workshop. 2. Sat 27 Jul., Straus 2, TF2M Workshop.
Tweet media one
0
2
4
@LiHongkang_jntm
Hongkang Li
1 year
Thanks to @IBMResearch for publishing a blog post about our work on in-context learning. Please see this link:
Tweet card summary image
research.ibm.com
A team at IBM Research and RPI figured out why in-context learning improves foundation model predictions, adding transparency to machine learning.
@LiHongkang_jntm
Hongkang Li
1 year
RT @sijialiu17: The 3rd AdvML-Frontiers Workshop (@AdvMLFrontiers) is set for #NeurIPS 2024 (@NeurIPSConf)! This ye….
0
8
0
@LiHongkang_jntm
Hongkang Li
1 year
🔥Excited to share our poster at #ICML2024. This work studies the training dynamics of nonlinear Transformers, as well as the model's In-Context Learning generalization capability. Time: Jul 23rd, Tuesday, 1:30-3:00 pm. Location: Hall C 4-9 #403.
Tweet media one
4
2
10
@LiHongkang_jntm
Hongkang Li
1 year
Another work at #ICML2024. This work theoretically studies the training and generalization of a one-layer Graph Transformer with trainable positional encoding. Time: Jul. 24th, Wednesday, 1:30-3:00 pm. Location: Hall C 4-9 #506
Tweet media one
5
4
8
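For readers unfamiliar with the architecture named here, a rough PyTorch sketch of a one-layer Graph Transformer with a trainable positional-encoding term added to the attention scores; the specific form of the positional encoding is an assumption for illustration, not necessarily the one analyzed in the paper.

```python
import torch

# Toy sketch: one-layer Graph Transformer whose attention scores are biased
# by a trainable positional term gated by the adjacency matrix.
class OneLayerGraphTransformer(torch.nn.Module):
    def __init__(self, d, n_nodes):
        super().__init__()
        self.WQ = torch.nn.Linear(d, d, bias=False)
        self.WK = torch.nn.Linear(d, d, bias=False)
        self.WV = torch.nn.Linear(d, d, bias=False)
        # Trainable positional encoding: one learned bias per node pair.
        self.pos_bias = torch.nn.Parameter(torch.zeros(n_nodes, n_nodes))
        self.readout = torch.nn.Linear(d, 1)

    def forward(self, x, adj):
        # x: (n_nodes, d) node features; adj: (n_nodes, n_nodes) adjacency.
        scores = self.WQ(x) @ self.WK(x).T / x.shape[-1] ** 0.5
        scores = scores + self.pos_bias * adj    # graph-aware, trainable positional term
        attn = torch.softmax(scores, dim=-1)
        h = torch.relu(attn @ self.WV(x))
        return self.readout(h.mean(dim=0))       # graph-level prediction

# Usage on a random 6-node graph with 4-dimensional node features.
n, d = 6, 4
x = torch.randn(n, d)
adj = (torch.rand(n, n) < 0.4).float()
model = OneLayerGraphTransformer(d, n)
print(model(x, adj))
```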