Bolin Lai

@bryanislucky

133 Followers · 138 Following · 20 Media · 60 Statuses

PhD student @GeorgiaTech with research interests in multimodal learning, generative models, and video understanding. I'm currently a visiting student @CSL_Ill at UIUC.

Atlanta, GA
Joined April 2017
@bryanislucky
Bolin Lai
11 months
Our paper was selected as a Best Paper Finalist at #ECCV2024. I sincerely thank all co-authors. Our work was also covered by Georgia Tech @ICatGT. My advisor @RehgJim will present it on Oct 2 at 1:30pm in Oral Session 4B, and at 4:30pm at poster #240 in the poster session. @eccvconf
@ICatGT
Georgia Tech School of Interactive Computing
11 months
LEGO can show you how it's done! New @eccvconf work from @bryanislucky: a generative tool that produces images to accompany step-by-step instructions from just a single first-person photo uploaded with the prompt. #wecandothat🐝 @GTResearchNews
0
6
39
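The quoted tweet above describes LEGO's interface at a high level: one first-person photo plus one step of textual instructions in, one guidance image out. As a rough sketch of that same input/output contract, the snippet below uses an off-the-shelf instruction-conditioned editor (InstructPix2Pix via the diffusers library) purely as a stand-in; it is not the LEGO model, whose actual code is in the BolinLai/LEGO repository linked further down this page, and the image path is a placeholder.

```python
# Stand-in sketch: image + text instruction -> generated guidance image.
# InstructPix2Pix here only illustrates the contract the tweet describes;
# it is NOT the LEGO model.
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix"
).to(device)

photo = Image.open("egocentric_kitchen.jpg").convert("RGB")  # placeholder path for the user's first-person photo
step = "Crack the eggs into the bowl."                       # one step of the written instructions

frame = pipe(prompt=step, image=photo, num_inference_steps=20).images[0]
frame.save("step_visualization.png")
```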
@bryanislucky
Bolin Lai
3 months
RT @RehgJim: Very happy to be in Music City for #CVPR2025 My lab is presenting 7 papers, 4 selected as highlights. My amazing students @Iro….
0
10
0
@bryanislucky
Bolin Lai
3 months
RT @gtcomputing: Howdy from Nashville, y'all! 🎸🤠 Check out our stars at #CVPR2025, a top @IEEEorg research venue for computer vision expe…
0
2
0
@bryanislucky
Bolin Lai
5 months
The full Llama 4 will contain 2T parameters. It's quite amazing to learn that "billion" is no longer sufficient to describe the scale of LLMs.
@AIatMeta
AI at Meta
5 months
Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout: • 17B-active-parameter model
0
0
1
@bryanislucky
Bolin Lai
5 months
💻The work was done at GenAI, Meta. Thanks to all collaborators at Meta and my advisor @RehgJim for their strong support. 🍻 [8/8] 📄Paper: ⌨️Code: ▶️Video:
0
0
0
@bryanislucky
Bolin Lai
5 months
🔎In addition, when different exemplar image pairs are used with the same textual instruction, InstaManip can capture the distinct visual patterns and apply them when editing query images. [7/8]
1
0
0
@bryanislucky
Bolin Lai
5 months
📈It’s easy to scale up our model by increasing the number of exemplar images or improving the diversity of visual examples. [6/8]
1
0
0
@bryanislucky
Bolin Lai
5 months
🏆Our model effectively learns the underlying image transformation from the textual instruction and visual examples, and edits the new query image accordingly. [5/8]
1
0
0
@bryanislucky
Bolin Lai
5 months
💡To avoid learning misleading visual patterns in exemplar images, we introduce an innovative relation regularization strategy, which enforces embedding similarity for similar editing instructions within a batch and encourages separation between different operations. [4/8]
1
0
0
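A rough sketch of how the relation regularization above could be realized, assuming each batch comes with labels indicating which samples share the same editing operation; the paper's exact formulation may differ, and relation_regularization and its arguments are illustrative names.

```python
# One plausible reading of the relation regularization in the tweet above:
# pull embeddings of same-operation instructions together within a batch
# and push different operations apart. Not the paper's exact loss.
import torch
import torch.nn.functional as F

def relation_regularization(op_embeddings: torch.Tensor,
                            same_operation: torch.Tensor) -> torch.Tensor:
    """
    op_embeddings : (B, D) transformation embeddings, one per batch sample.
    same_operation: (B, B) target matrix, 1.0 where two samples share the
                    same editing operation and 0.0 otherwise (assumed to be
                    derived from the instruction labels).
    """
    z = F.normalize(op_embeddings, dim=-1)
    sim = z @ z.t()                         # pairwise cosine similarity, (B, B)
    return F.mse_loss(sim, same_operation)  # high for same-op pairs, low otherwise

if __name__ == "__main__":
    emb = torch.randn(4, 256, requires_grad=True)
    target = torch.tensor([[1., 1., 0., 0.],
                           [1., 1., 0., 0.],
                           [0., 0., 1., 1.],
                           [0., 0., 1., 1.]])
    loss = relation_regularization(emb, target)
    loss.backward()
    print(loss.item())
```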
@bryanislucky
Bolin Lai
5 months
💡We propose Group Self-Attention to decompose the in-context learning process into two separate stages, learning and applying, which simplifies the problem into two easier tasks. The model learns a transferable embedding of the desired transformation via next-token prediction. [3/8]
1
0
1
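A sketch of how a grouped attention mask could separate the learning stage from the applying stage, in the spirit of the Group Self-Attention described above. The token layout, group boundaries, and masking rule are assumptions made for illustration, not the paper's implementation.

```python
# Assumed token layout for one example:
#   [exemplar tokens | instruction tokens | transformation ("state") tokens | query tokens]
# Learning stage: exemplars + instruction -> state tokens (causal within the group).
# Applying stage: query tokens attend to the state tokens and to each other,
# but not to the raw exemplar/instruction tokens.
import torch

n_exemplar, n_instr, n_state, n_query = 6, 4, 2, 6
n = n_exemplar + n_instr + n_state + n_query

groups = torch.tensor([0] * (n_exemplar + n_instr + n_state) + [1] * n_query)
is_state = torch.zeros(n, dtype=torch.bool)
is_state[n_exemplar + n_instr: n_exemplar + n_instr + n_state] = True

causal = torch.tril(torch.ones(n, n, dtype=torch.bool))   # autoregressive order
same_group = groups.unsqueeze(1) == groups.unsqueeze(0)   # within-stage attention
allowed = causal & (same_group | is_state.unsqueeze(0))   # plus the shared state tokens

# Additive mask usable with torch.nn.functional.scaled_dot_product_attention.
attn_mask = torch.zeros(n, n).masked_fill(~allowed, float("-inf"))
print(attn_mask.shape)  # torch.Size([18, 18])
```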
@bryanislucky
Bolin Lai
5 months
📌Learning from exemplar images requires strong reasoning capability. Diffusion models are good at generation, yet still weak in reasoning. We leverage and enhance the in-context learning capability of autoregressive architectures to achieve SOTA performance in few-shot image editing. [2/8]
1
0
0
@bryanislucky
Bolin Lai
5 months
📢#CVPR2025 Introducing InstaManip, a novel multimodal autoregressive model for few-shot image editing. 🎯InstaManip can learn a new image editing operation from textual and visual guidance via in-context learning and apply it to new query images. [1/8]
1
5
11
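For readers skimming the thread, the few-shot prompt implied by the tweet above can be pictured roughly as follows; the class and its fields are hypothetical names for illustration, not the released InstaManip API.

```python
# Hypothetical container for a few-shot image-editing prompt: exemplar
# (source, edited) pairs plus a textual instruction, to be applied to a
# new query image. Names are illustrative only.
from dataclasses import dataclass
from typing import List, Tuple
from PIL import Image

@dataclass
class FewShotEditPrompt:
    instruction: str                                   # e.g. "turn the photo into a watercolor painting"
    exemplars: List[Tuple[Image.Image, Image.Image]]   # (source, edited) pairs demonstrating the operation
    query: Image.Image                                 # new image the operation should be applied to

    def as_sequence(self) -> list:
        """Flatten into the interleaved order an autoregressive editor might consume."""
        seq: list = [self.instruction]
        for src, tgt in self.exemplars:
            seq += [src, tgt]
        seq.append(self.query)   # the edited result would be generated after this element
        return seq
```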
@bryanislucky
Bolin Lai
6 months
RT @_akhaliq: Alibaba just dropped Wan2.1, open AI video generation. #1 on the VBench leaderboard, outperforming SOTA open-source & commercial…
0
146
0
@bryanislucky
Bolin Lai
9 months
An awesome gaze model from my labmate @fionakryan!
@fionakryan
Fiona Ryan
9 months
Introducing Gaze-LLE, a new model for gaze target estimation built on top of a frozen visual foundation model! Gaze-LLE achieves SOTA results on multiple benchmarks while learning minimal parameters, and shows strong generalization. paper:
0
0
4
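The quoted tweet's key design point is a frozen visual foundation model with only a small learned module on top. Below is a generic sketch of that recipe, using a torchvision ResNet-50 purely as a stand-in backbone (not the encoder Gaze-LLE actually uses) and a hypothetical heatmap head; see the Gaze-LLE paper and code for the real architecture.

```python
# Generic "frozen backbone + small trainable head" sketch, not Gaze-LLE itself.
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50(weights=None)   # stand-in; imagine pretrained foundation-model features here
backbone.fc = nn.Identity()         # expose the 2048-d pooled features
backbone.requires_grad_(False)      # freeze: no gradients flow into the backbone
backbone.eval()

head = nn.Sequential(               # the only trainable part
    nn.Linear(2048, 256),
    nn.ReLU(),
    nn.Linear(256, 64 * 64),        # e.g. a coarse 64x64 gaze-target heatmap
)

x = torch.randn(2, 3, 224, 224)     # dummy batch of frames
with torch.no_grad():
    feats = backbone(x)             # (2, 2048) frozen features
heatmaps = head(feats).view(-1, 64, 64)

trainable = sum(p.numel() for p in head.parameters())
print(f"trainable params: {trainable:,}")  # small compared to the frozen backbone
```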
@bryanislucky
Bolin Lai
11 months
RT @gtcomputing: #ECCV2024 has honored this computer vision research as one of 15 Best Paper Award candidates 🎉! Congrats to the team and l….
0
2
0
@bryanislucky
Bolin Lai
11 months
RT @RehgJim: Super excited to be in Milan for #ECCV2024. I have an opening for a Postdoc in my lab at UIUC, in the areas of egocentric comp….
0
5
0
@bryanislucky
Bolin Lai
1 year
I also thank @sangminlee777 for the valuable discussions during paper writing and the rebuttal!
0
0
2
@bryanislucky
Bolin Lai
1 year
Our ECCV paper has been accepted as an oral presentation! Thanks to all co-authors (@aptx4869ml, Xiaoliang Dai, Lawrence Chen, Guan Pang, @RehgJim) for your awesome contributions. Our dataset and code have been released. Project: Code:
github.com
[ECCV2024, Oral, Best Paper Finalist] This is the official implementation of the paper "LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning". - BolinLai/LEGO
@bryanislucky
Bolin Lai
1 year
While learning new skills, have you ever grown tired of reading a verbose manual or been annoyed by unclear instructions? Check out our #ECCV2024 work on generating egocentric (first-person) visual guidance tailored to the user's situation! [1/7] Page:
2
2
14