Ivan Lee Profile
Ivan Lee

@ivn1e

Followers: 12 · Following: 11 · Media: 6 · Statuses: 14

Joined January 2018
@zacknovack
Zachary Novack
2 months
We're organizing the AI for Music workshop at @NeurIPSConf in San Diego! We'll be accepting both papers + demos w/ an initial deadline of August 22, well timed for early visibility on your ICASSP/ICLR drafts 👀 Check out the website for more:
aiformusicworkshop.github.io
NeurIPS 2025 Workshop on AI for Music
@hermanhwdong
Hao-Wen (Herman) Dong 董皓文
2 months
🔥Happy to announce that the AI for Music Workshop is coming to #NeurIPS2025! We have an amazing lineup of speakers! We call for papers & demos (due on August 22)! See you in San Diego!🏖️ @chrisdonahuey @Ilaria__Manco @zawazaw @huangcza @McAuleyLabUCSD @zacknovack @NeurIPSConf
[image]
3 replies · 12 reposts · 53 likes
@yongyi_zang
Yongyi Zang
3 months
🚨New Audio Benchmark 🚨We find standard LLMs can solve Music-QA benchmarks by just guessing from text only, + LALMs can still answer well when given noise instead of music! Presenting RUListening: A fully automated pipeline for making Audio-QA benchmarks *actually* assess …
1 reply · 9 reposts · 29 likes
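The two sanity checks described in the tweet above — answering from the question text alone, and answering with noise in place of the music — can be sketched as follows. This is an illustrative probe, not the RUListening pipeline itself; `ask_text_llm` and `ask_audio_lm` are hypothetical callables standing in for a text-only LLM and a large audio-language model, and the item format is assumed.

```python
# Illustrative probe (not the RUListening pipeline): can a benchmark be solved
# without actually listening? `ask_text_llm` / `ask_audio_lm` are hypothetical
# callables; each benchmark item is assumed to be a dict with "question",
# "choices", and "answer" keys.
import numpy as np

def text_only_accuracy(items, ask_text_llm):
    """Accuracy when the model sees only the question and choices, no audio."""
    hits = sum(ask_text_llm(it["question"], it["choices"]) == it["answer"] for it in items)
    return hits / len(items)

def noise_audio_accuracy(items, ask_audio_lm, sr=16000, seconds=10.0):
    """Accuracy when the real clip is replaced with Gaussian noise."""
    hits = 0
    for it in items:
        noise = np.random.randn(int(sr * seconds)).astype(np.float32)
        hits += ask_audio_lm(noise, it["question"], it["choices"]) == it["answer"]
    return hits / len(items)

# If either score sits far above chance, the benchmark is not really testing listening.
```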
@danlu_ai
Danlu Chen
1 year
Can ancient (logographic) languages from 5,000 years ago be processed like modern ones using NLP? We found that a visual representation-based system for NLP on ancient logographic languages outperforms conventional Latin transliteration! Join us at Poster s3 - Mon 4pm #ACL2024 #NLProc
[image]
6 replies · 37 reposts · 160 likes
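A minimal sketch of what the "visual representation" contrasted with Latin transliteration above might look like in practice: render each glyph as a small grayscale bitmap and feed pixels to the model instead of a romanized token. The font path is a placeholder assumption, and the paper's actual visual encoder is not reproduced here.

```python
# Sketch only: render a glyph to a small grayscale image with Pillow. The font
# path is a placeholder; the actual system's visual encoder is not shown.
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def render_glyph(char: str, font_path: str = "glyph-font.ttf", size: int = 32) -> np.ndarray:
    font = ImageFont.truetype(font_path, size=size - 4)
    img = Image.new("L", (size, size), color=255)           # white canvas
    ImageDraw.Draw(img).text((2, 2), char, font=font, fill=0)
    return np.asarray(img, dtype=np.float32) / 255.0        # (size, size) array in [0, 1]

# A sentence then becomes a sequence of glyph images rather than Latin-transliterated tokens.
```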
@zacknovack
Zachary Novack
11 months
Ultra-fast text-to-music generation w/o degrading quality? Introducing Presto! Distilling Steps and Layers for Accelerating Music Generation 🎹: https://t.co/kTTAYKKtTU 📖: https://t.co/Newhxe6lI6 w/@__gzhu__ @CasebeerJonah @BergKirkpatrick @McAuleyLabUCSD @NicholasJBryan 🧵
1 reply · 23 reposts · 90 likes
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick Since most of our chosen architectures are attention-free, what mechanism, analogous to induction heads, plays a similar role in ICL? We hope to explore such questions in future work. Thanks for reading! Code: https://t.co/8L8GSZpn5L Paper: https://t.co/WE3gnWSjM1 10/10
0 replies · 0 reposts · 0 likes
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick Finally, we evaluate architectures on language modeling. Mamba was the only one to reach parity with transformers, followed by Hyena and RWKV. Most exhibit an abrupt improvement in ICL score, a behavior associated with the formation of induction heads (Olsson et al., 2022). 9/10
1 reply · 0 reposts · 3 likes
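For reference, the "ICL score" mentioned above is commonly operationalized (following Olsson et al., 2022) as the difference in per-token loss between a late and an early context position; the positions below are common defaults, not necessarily the paper's exact choice.

```python
# Common operationalization of an ICL score (after Olsson et al., 2022): loss at a
# late token position minus loss at an early one. A more negative score means the
# model benefits more from extra context, i.e. it is learning in-context.
import torch

def icl_score(per_token_loss: torch.Tensor, early: int = 50, late: int = 500) -> float:
    """per_token_loss: (batch, seq_len) cross-entropy loss at each position."""
    return (per_token_loss[:, late] - per_token_loss[:, early]).mean().item()
```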
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick We also study a setting where models are given the option to either memorize or perform ICL. We find that transformers with rotary embeddings and Hyena strongly prefer ICL over memorization. Surprisingly, RetNet almost always chooses to memorize. 8/10
1 reply · 1 repost · 3 likes
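One way such a memorize-or-ICL probe could be constructed (a hypothetical sketch, not necessarily the paper's exact design): a particular query always carries a fixed label during training, while the evaluation prompt relabels it through the context, so the model's prediction reveals which strategy it uses.

```python
# Hypothetical sketch of a memorize-vs-ICL probe. During training, FIXED_TOKEN is
# always labeled TRAIN_LABEL; at evaluation the context consistently assigns it a
# different label. Predicting the context label => ICL; TRAIN_LABEL => memorization.
import random

FIXED_TOKEN, TRAIN_LABEL = 7, 1

def eval_prompt(num_examples: int = 8, vocab: int = 32, num_labels: int = 4):
    context_label = random.choice([l for l in range(num_labels) if l != TRAIN_LABEL])
    pairs = [(FIXED_TOKEN, context_label)]
    distractors = [t for t in range(vocab) if t != FIXED_TOKEN]
    pairs += [(random.choice(distractors), random.randrange(num_labels))
              for _ in range(num_examples - 1)]
    random.shuffle(pairs)
    return pairs, FIXED_TOKEN, context_label
```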
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick While all architectures are capable of multiclass classification, all except for RNNs and CNNs perform better than logistic regression (black) in the most difficult setting. However, performance degrades quickly as we extrapolate beyond training lengths. 7/10
1 reply · 0 reposts · 1 like
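The logistic regression reference (the "black" curve mentioned above) is typically fit per prompt on just the in-context examples and then asked to label the query; a minimal sketch, assuming numeric feature vectors:

```python
# Per-prompt logistic regression baseline (sketch): fit on the prompt's labeled
# (x, y) pairs, then predict the query point. Assumes context_y has >= 2 classes.
import numpy as np
from sklearn.linear_model import LogisticRegression

def logreg_baseline(context_x: np.ndarray, context_y: np.ndarray, query_x: np.ndarray):
    """context_x: (n, d), context_y: (n,), query_x: (d,) -> predicted class label."""
    clf = LogisticRegression(max_iter=1000).fit(context_x, context_y)
    return clf.predict(query_x.reshape(1, -1))[0]
```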
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick Again, all are capable of linear regression. Specifically, Mamba, RetNet, and transformers achieved performance comparable to that of ridge regression (black) for context lengths seen during training. While no architecture extrapolated well, RetNet proved the most stable. 6/10
1 reply · 0 reposts · 3 likes
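The ridge regression reference works the same way: solve for weights on the prompt's (x, y) pairs in closed form and predict the query. A minimal sketch; the regularization constant is an assumption, not the paper's setting.

```python
# Closed-form ridge regression baseline (sketch): w = (X^T X + lam*I)^{-1} X^T y,
# fit on each prompt's in-context examples. lam is illustrative.
import numpy as np

def ridge_baseline(context_x: np.ndarray, context_y: np.ndarray,
                   query_x: np.ndarray, lam: float = 0.1) -> float:
    """context_x: (n, d), context_y: (n,), query_x: (d,) -> scalar prediction."""
    d = context_x.shape[1]
    w = np.linalg.solve(context_x.T @ context_x + lam * np.eye(d),
                        context_x.T @ context_y)
    return float(query_x @ w)
```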
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick All are capable of associative recall, with transformers, RetNet, Hyena, Mamba, and RWKV performing best as the difficulty increases. The latter three, in particular, excel when extrapolating beyond the number of examples seen during training (right of vertical line). 5/10
1 reply · 0 reposts · 1 like
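For concreteness, an associative recall prompt of the kind evaluated above can be generated like this (vocabulary sizes are illustrative): a sequence of random key–value pairs followed by a query key, with the previously paired value as the target. Extrapolation means querying prompts with more pairs than were seen in training.

```python
# Sketch of an associative recall example: key–value pairs followed by a query key;
# the target is the value that key was paired with earlier. Sizes are illustrative.
import random

def associative_recall_example(num_pairs: int = 16, key_vocab: int = 64, value_vocab: int = 64):
    keys = random.sample(range(key_vocab), num_pairs)            # distinct keys
    values = [random.randrange(value_vocab) for _ in keys]
    sequence = [tok for kv in zip(keys, values) for tok in kv]   # k1 v1 k2 v2 ...
    query = random.randrange(num_pairs)
    sequence.append(keys[query])                                 # query key at the end
    return sequence, values[query]                               # model should emit this value
```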
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick In short, we find that all architectures are capable of ICL, even RNNs and CNNs. Not surprisingly, transformers are strong in-context learners. However, a number of alternatives such as RWKV, RetNet, Mamba, and Hyena prove to be equally, and sometimes more, capable. 4/10
1 reply · 1 repost · 5 likes
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick To address this, we study ICL in controlled, synthetic environments that eliminate the possibility of memorization: we train models from scratch to take a labeled dataset as input and predict the result of learning from this data in the forward-pass. 3/10
[image]
1 reply · 0 reposts · 2 likes
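The setup described above can be sketched for the linear regression case: every training example carries a freshly sampled task, its labeled points serialized into the input, and a query whose label is the training target, so the only way to predict correctly is to learn from the prompt in the forward pass. Dimensions below are illustrative, not the paper's configuration.

```python
# Sketch of one synthetic ICL training example (linear-regression variant):
# sample a new task w, serialize its labeled (x, y) pairs plus an unlabeled
# query x into the input, and use the query's y as the target.
import numpy as np

def synthetic_icl_example(num_examples: int = 16, dim: int = 8):
    w = np.random.randn(dim)                         # a brand-new task each time
    xs = np.random.randn(num_examples + 1, dim)      # context points + one query
    ys = xs @ w
    prompt = np.concatenate([np.column_stack([xs[:-1], ys[:-1]]).ravel(),  # (x, y) pairs
                             xs[-1]])                # query x, label withheld
    return prompt, ys[-1]                            # target: what ICL should recover
```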
@ivn1e
Ivan Lee
1 year
@nanjiangwill @BergKirkpatrick Studying ICL in LLMs is challenging. Are these models truly learning new predictors during the forward-pass (ICL), or do in-context examples simply focus the model on aspects of knowledge already acquired during gradient-based pretraining (memorization)? 2/10
1 reply · 0 reposts · 2 likes
@ivn1e
Ivan Lee
1 year
Is attention required for ICL? We explore this question in our #ICLR2024 paper Exploring the Relationship Between Model Architecture and In-Context Learning Ability. Code: https://t.co/8L8GSZpn5L Paper: https://t.co/WE3gnWSjM1 with @nanjiangwill and @BergKirkpatrick 1/10
1 reply · 3 reposts · 27 likes