Hongkang Li

@LiHongkang_jntm

Followers: 38
Following: 87
Media: 5
Statuses: 24

Ph.D. student at Rensselaer Polytechnic Institute

Troy, NY
Joined October 2019
@LiHongkang_jntm
Hongkang Li
4 months
🔥Our #ICLR2025 Oral paper "When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers" will be presented on 04/26, 4:18 p.m. to 4:30 p.m., at Garnet 216-218. The poster presentation will be on 04/26, 10:00 a.m. to 12:30 p.m., at board #341.
Tweet media one
2
1
13
@LiHongkang_jntm
Hongkang Li
4 months
🔥Our #ICLR2025 poster paper "Training Nonlinear Transformers for Chain-of-Thought Inference: A Theoretical Generalization Analysis" will be presented on 04/25, 3:00 p.m. to 5:30 p.m., at board #342. Link to the paper:
Tweet media one
0
0
8
@LiHongkang_jntm
Hongkang Li
4 months
Here is the link to the paper. For Chinese readers, here is another summary.
0
1
1
@LiHongkang_jntm
Hongkang Li
4 months
This work: 1) characterizes the effectiveness of task addition and negation in multi-task learning and unlearning, respectively; 2) proves out-of-domain generalization guarantees with task arithmetic; 3) justifies low-rank approximation and model pruning for task vectors.
0
0
1
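As a rough illustration of the setting above (not the paper's code; the variable names, dimensions, and scaling factors are made up for this sketch), here is a minimal NumPy example of task arithmetic: a task vector is the difference between fine-tuned and pre-trained weights, addition merges tasks, negation unlearns one, and a magnitude-pruned task vector stands in for the sparse/low-rank approximations discussed in the paper.

```python
import numpy as np

# Toy sketch of task arithmetic on flat weight vectors (illustrative only;
# `pretrained`, `finetuned_a`, `finetuned_b` are hypothetical models).
rng = np.random.default_rng(0)
dim = 8
pretrained = rng.normal(size=dim)
finetuned_a = pretrained + rng.normal(scale=0.1, size=dim)  # tuned on task A
finetuned_b = pretrained + rng.normal(scale=0.1, size=dim)  # tuned on task B

# Task vector: fine-tuned weights minus pre-trained weights.
tau_a = finetuned_a - pretrained
tau_b = finetuned_b - pretrained

alpha = 0.5
multi_task_model = pretrained + alpha * (tau_a + tau_b)  # task addition (multi-task learning)
unlearned_model = pretrained - alpha * tau_a             # task negation (unlearning task A)

# Pruned approximation of a task vector: keep only the largest-magnitude entries.
k = 4
mask = np.zeros(dim)
mask[np.argsort(np.abs(tau_a))[-k:]] = 1.0
pruned_model = pretrained + alpha * (tau_a * mask)
```

For weight matrices rather than flat vectors, the analogous low-rank approximation would truncate the SVD of the task vector instead of masking entries.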
@LiHongkang_jntm
Hongkang Li
7 months
This work has been accepted by #ICLR2025. Please see this link. We will update our final version soon.
openreview.net
Chain-of-Thought (CoT) is an efficient prompting method that enables the reasoning ability of large language models by augmenting the query using multiple examples with multiple intermediate steps....
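As a toy illustration of the prompting setup described in that summary (the example questions below are made up, not from the paper), here is a short sketch of how a query is augmented with worked examples that each contain intermediate steps:

```python
# Toy illustration of few-shot chain-of-thought prompting: the query is
# augmented with examples that spell out intermediate reasoning steps.
examples = [
    {
        "question": "A farm has 3 pens with 4 hens each. How many hens?",
        "steps": ["3 pens * 4 hens = 12 hens"],
        "answer": "12",
    },
    {
        "question": "Tom reads 5 pages a day for 6 days. How many pages?",
        "steps": ["5 pages/day * 6 days = 30 pages"],
        "answer": "30",
    },
]

def build_cot_prompt(query: str) -> str:
    """Concatenate worked examples (with intermediate steps) before the query."""
    parts = []
    for ex in examples:
        steps = "\n".join(ex["steps"])
        parts.append(f"Q: {ex['question']}\n{steps}\nA: {ex['answer']}")
    # The model is expected to continue with its own steps, then the answer.
    parts.append(f"Q: {query}\n")
    return "\n\n".join(parts)

print(build_cot_prompt("A box holds 2 rows of 7 eggs. How many eggs?"))
```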
@LiHongkang_jntm
Hongkang Li
10 months
Q3: When and why is CoT better than ICL? A3: By studying inference with (contained) erroneous steps, we show that successful ICL needs one additional condition, namely that the correct input-output examples are dominant, while CoT does not require it. [4/n]
0
0
0
@LiHongkang_jntm
Hongkang Li
10 months
Q2: What is the mechanism of CoT? A2: Under a simplified model and data formulation, we prove that Transformers implement few-shot CoT by attending from the current query to the most similar inference step in the previous examples. [3/n]
1
0
0
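A small numerical sketch of this claimed mechanism, using assumed toy embeddings rather than the paper's data model: dot-product attention from the current query concentrates on the context step most similar to it.

```python
import numpy as np

# Toy sketch: the current query attends to the most similar inference step
# among the context examples (embeddings are random stand-ins).
rng = np.random.default_rng(1)
d = 16
context_steps = rng.normal(size=(6, d))                  # steps from previous examples
query = context_steps[3] + 0.05 * rng.normal(size=d)     # query closest to step 3

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

# Dot-product attention scores between the query and each context step.
scores = context_steps @ query / np.sqrt(d)
weights = softmax(scores)
attended = weights @ context_steps   # output is dominated by the most similar step

print("attention weights:", np.round(weights, 3))
print("most attended step:", int(weights.argmax()))      # expected: 3
```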
@LiHongkang_jntm
Hongkang Li
10 months
Q1: Can a Transformer be trained to implement CoT provably? A1: Yes. Under a certain data and task formulation, we provide a theoretical analysis of training a one-layer nonlinear Transformer to implement chain-of-thought on out-of-domain tasks with a generalization guarantee. [2/n]
1
0
0
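For intuition only, here is a minimal PyTorch sketch of the kind of model referred to above: a one-layer Transformer with softmax attention and a ReLU nonlinearity, trained by SGD on a synthetic task where the label depends on the context step most similar to the query. The data generation and hyperparameters are assumptions for illustration, not the paper's construction.

```python
import torch

# Toy sketch: one-layer Transformer (softmax attention + ReLU readout).
torch.manual_seed(0)
d, n_ctx, n_samples = 8, 5, 256

class OneLayerTransformer(torch.nn.Module):
    def __init__(self, d):
        super().__init__()
        self.WQ = torch.nn.Linear(d, d, bias=False)
        self.WK = torch.nn.Linear(d, d, bias=False)
        self.WV = torch.nn.Linear(d, d, bias=False)
        self.out = torch.nn.Linear(d, 1)

    def forward(self, ctx, query):
        # ctx: (batch, n_ctx, d), query: (batch, d)
        q = self.WQ(query).unsqueeze(1)                        # (batch, 1, d)
        scores = (q @ self.WK(ctx).transpose(1, 2)) / d ** 0.5  # (batch, 1, n_ctx)
        attn = torch.softmax(scores, dim=-1)
        h = (attn @ self.WV(ctx)).squeeze(1)                   # (batch, d)
        return self.out(torch.relu(h)).squeeze(-1)

# Synthetic data: the label is a fixed linear function of the context step
# closest to the query (loosely mimicking the CoT-style setting).
ctx = torch.randn(n_samples, n_ctx, d)
idx = torch.randint(n_ctx, (n_samples,))
query = ctx[torch.arange(n_samples), idx] + 0.05 * torch.randn(n_samples, d)
w_star = torch.randn(d)
y = ctx[torch.arange(n_samples), idx] @ w_star

model = OneLayerTransformer(d)
opt = torch.optim.SGD(model.parameters(), lr=0.05)
for step in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(ctx, query), y)
    loss.backward()
    opt.step()
print(f"final training loss: {loss.item():.4f}")
```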
@LiHongkang_jntm
Hongkang Li
10 months
🚀Excited to share our new preprint on the theoretical analysis of the training and generalization of chain-of-thought. The arXiv link can be found at [link]. We have the following results. [1/n]
1
1
1
@LiHongkang_jntm
Hongkang Li
1 year
Our follow-up work on LLM theory, on the learning and generalization mechanism of Chain-of-Thought (CoT), will be presented over the next two days at the @icmlconf workshops. 1. Fri 26 Jul., Straus 2, HiLD Workshop. 2. Sat 27 Jul., Straus 2, TF2M Workshop.
Tweet media one
0
2
4
@LiHongkang_jntm
Hongkang Li
1 year
Thanks to @IBMResearch for publishing a blog post about our work on in-context learning. Please see this link:
Tweet card summary image
research.ibm.com
A team at IBM Research and RPI figured out why in-context learning improves foundation model predictions, adding transparency to machine learning.
@LiHongkang_jntm
Hongkang Li
1 year
RT @sijialiu17: The 3rd AdvML-Frontiers Workshop (@AdvMLFrontiers) is set for #NeurIPS 2024 (@NeurIPSConf)! This ye….
0
8
0
@LiHongkang_jntm
Hongkang Li
1 year
🔥Excited to share our poster at #ICML2024. This work studies the training dynamics of nonlinear Transformers, as well as the model's In-Context Learning generalization capability. Time: Jul 23rd, Tuesday, 1:30-3:00 pm. Location: Hall C 4-9 #403.
Tweet media one
4
2
10
@LiHongkang_jntm
Hongkang Li
1 year
Another work at #ICML2024. This work theoretically studies the training and generalization of a one-layer Graph Transformer with trainable positional encoding. Time: Jul. 24th, Wednesday, 1:30-3:00 pm. Location: Hall C 4-9 #506
Tweet media one
5
4
8
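For readers unfamiliar with the architecture named here, a rough PyTorch sketch of a one-layer Graph Transformer with a trainable positional-encoding term added to the attention scores; the specific form of the positional encoding is an assumption for illustration, not necessarily the one analyzed in the paper.

```python
import torch

# Toy sketch: one-layer Graph Transformer whose attention scores are biased
# by a trainable positional term gated by the adjacency matrix.
class OneLayerGraphTransformer(torch.nn.Module):
    def __init__(self, d, n_nodes):
        super().__init__()
        self.WQ = torch.nn.Linear(d, d, bias=False)
        self.WK = torch.nn.Linear(d, d, bias=False)
        self.WV = torch.nn.Linear(d, d, bias=False)
        # Trainable positional encoding: one learned bias per node pair.
        self.pos_bias = torch.nn.Parameter(torch.zeros(n_nodes, n_nodes))
        self.readout = torch.nn.Linear(d, 1)

    def forward(self, x, adj):
        # x: (n_nodes, d) node features; adj: (n_nodes, n_nodes) adjacency.
        scores = self.WQ(x) @ self.WK(x).T / x.shape[-1] ** 0.5
        scores = scores + self.pos_bias * adj    # graph-aware, trainable positional term
        attn = torch.softmax(scores, dim=-1)
        h = torch.relu(attn @ self.WV(x))
        return self.readout(h.mean(dim=0))       # graph-level prediction

# Usage on a random 6-node graph with 4-dimensional node features.
n, d = 6, 4
x = torch.randn(n, d)
adj = (torch.rand(n, n) < 0.4).float()
model = OneLayerGraphTransformer(d, n)
print(model(x, adj))
```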