Keivan Alizadeh Profile
Keivan Alizadeh

@KeivanAlizadeh2

Followers: 569 · Following: 218 · Media: 2 · Statuses: 33

Apple

Joined February 2020
@KeivanAlizadeh2
Keivan Alizadeh
1 month
RT @MFarajtabar: 🧵 1/8 The Illusion of Thinking: Are reasoning models like o1/o3, DeepSeek-R1, and Claude 3.7 Sonnet really "thinking"? 🤔 O….
0
585
0
@KeivanAlizadeh2
Keivan Alizadeh
7 months
RT @i_mirzadeh: We have open-sourced GSM-Symbolic templates and generated data! 🎉 - Github: - Hugging Face: https:….
0
4
0
@KeivanAlizadeh2
Keivan Alizadeh
8 months
How can we make the process of RLHF more robust? Using a simple trick: instead of limiting the KL divergence to a single SFT model, we can search around a model soup, which resides in a higher-reward space. Please check out our intern's great work!
@AtoosaChegini
Atoosa Chegini
8 months
1/🔔Excited to share my internship work, SALSA: Soup-based Alignment Learning for Stronger Adaptation (NeurIPS workshop paper)! 🎉 Proximal Policy Optimization (PPO) often limits exploration by keeping models tethered to a single reference model. SALSA, however, breaks free.
0
0
8
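If I read SALSA right, the trick above amounts to swapping the single SFT reference in the PPO KL penalty for a weight-averaged soup of several SFT checkpoints. A minimal sketch of that idea in PyTorch; the helper names (soup_reference, shaped_reward) and the beta value are my assumptions, not the SALSA implementation:

```python
# Sketch: KL-regularized RLHF where the reference policy is a weight-averaged
# "model soup" of several SFT checkpoints instead of a single SFT model.
# Illustrative only; function names and shapes are assumptions, not SALSA's code.
import copy
import torch
import torch.nn.functional as F

def soup_reference(sft_models):
    """Average the parameters of architecturally identical SFT checkpoints into one soup."""
    soup = copy.deepcopy(sft_models[0])
    with torch.no_grad():
        for name, param in soup.named_parameters():
            stacked = torch.stack([dict(m.named_parameters())[name] for m in sft_models])
            param.copy_(stacked.mean(dim=0))
    return soup

def shaped_reward(policy_logits, ref_logits, task_reward, beta=0.1):
    """PPO-style shaped reward: task reward minus beta * per-token KL(policy || reference).

    policy_logits, ref_logits: (batch, seq_len, vocab);
    task_reward: broadcastable against the per-token KL, e.g. (batch, 1).
    """
    logp = F.log_softmax(policy_logits, dim=-1)
    ref_logp = F.log_softmax(ref_logits, dim=-1)
    kl = (logp.exp() * (logp - ref_logp)).sum(dim=-1)  # (batch, seq_len)
    return task_reward - beta * kl
```

The only change from vanilla KL-regularized PPO here is that `ref_logits` come from the averaged soup model, which (per the tweet) sits in a higher-reward region than any single SFT checkpoint, so the KL tether constrains exploration less harshly.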
@KeivanAlizadeh2
Keivan Alizadeh
9 months
RT @MFarajtabar: 1/ LLM inference is very expensive; and LLMs don't necessarily use their full capacity to respond to a specific prompt. Th….
0
56
0
@KeivanAlizadeh2
Keivan Alizadeh
9 months
RT @MFarajtabar: ** Intern position on LLM reasoning ** @mchorton1991, @i_mirzadeh, @KeivanAlizadeh2, and I are co-hosting an intern positi….
0
25
0
@KeivanAlizadeh2
Keivan Alizadeh
9 months
RT @andrewglynch: nobody will remember: - your salary - how “busy you were” - how many hours you worked. people will remember: - nothing. Y….
0
1K
0
@KeivanAlizadeh2
Keivan Alizadeh
9 months
RT @MFarajtabar: 1/ Can Large Language Models (LLMs) truly reason? Or are they just sophisticated pattern matchers? In our latest preprint,….
0
1K
0
@KeivanAlizadeh2
Keivan Alizadeh
11 months
Hey guys, I'm gonna present LLM in a Flash at ACL 2024. Hit me up if you are in Bangkok. Updates from the previous version: - Llama 2 results - Some results on Apple GPUs (Metal) - Speculative decoding - Memory/latency tradeoff - Impact of longer generation.
0
6
45
@KeivanAlizadeh2
Keivan Alizadeh
3 years
RT @adityakusupati: Introducing 🪆Matryoshka Representations for Adaptive Deployment🪆. TL;DR: up to 14× lower real-world classification & re….
0
99
0
@KeivanAlizadeh2
Keivan Alizadeh
4 years
RT @gabriel_ilharco: Instead of a single neural network, why not train lines, curves and simplexes in parameter space? Fantastic work by @….
0
44
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @gabriel_ilharco: I've been seeing a lot of talk around the recent Vision Transformer (ViT) paper, so I thought I'd highlight some of m….
0
51
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @RAIVNLab: Catch us at this ECCV where @JamesPa91074457 and @sarahmhpratt from our lab present their works as spotlights!!! VisualCOME….
0
3
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @sacmehtauw: Excited to share our work on diagnosing breast cancer. We extend the self-attention mechanism to learn representations on 100s….
0
2
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
NED allows us to compare methods across different tasks. It gives insights about SOTA methods (in few-shot, unsupervised learning, etc.) vs. simple baselines.
0
0
4
@KeivanAlizadeh2
Keivan Alizadeh
5 years
Glad to be part of the NED team. A simple framework toward more realistic ML systems. NED doesn't separate train and test. Just go iN thE wilD, collect data, and evaluate. NED extends ML models to ML systems, which contain both the model and the training strategy.
@RAIVNLab
UW RAIVN Lab
5 years
Sharing In The Wild: From ML Models to Pragmatic ML systems. In The Wild (NED) is a learning and evaluation framework designed to further progress towards general ML systems capable of excelling in the real world. Paper: Site:
1
3
14
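A rough sketch of the evaluation loop implied by "doesn't separate train and test": the ML system (model plus its own training strategy) is scored on every example before it gets to learn from it. The function and method names here are illustrative assumptions, not the NED API:

```python
# Sketch of streaming "in the wild" evaluation: predict first, then learn.
# `system` is any object exposing predict(x) and update(x, y); names are assumptions.
def evaluate_in_the_wild(system, stream):
    correct, total = 0, 0
    for x, y in stream:               # data arrives sequentially, no held-out test split
        pred = system.predict(x)      # scored before the system ever sees the label
        correct += int(pred == y)
        total += 1
        system.update(x, y)           # the system's own training strategy decides how to adapt
    return correct / max(total, 1)    # online accuracy over the whole stream
```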
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @RAIVNLab: Hello, world! This is the twitter handle of the Reasoning, AI and VisioN lab, RAIVN Lab - like the bird, @uwcse led by Prof.….
0
19
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @Mitchnw: sharing Supermasks in Superposition (SupSup): A model that sequentially learns thousands of tasks with negligible forgetting-….
0
56
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @sarahmhpratt: Excited to share Grounded Situation Recognition -- our work (with @yatskar, @LucaWeihs, Ali Farhadi, and @anikembhavi) on….
0
18
0
@KeivanAlizadeh2
Keivan Alizadeh
5 years
Check out our work on the Butterfly Transform (BFT), a new building block for convolutional neural networks. BFT fuses information among channels more efficiently than standard 1×1 convolutions. PDF: Code: To appear at #CVPR2020
2
29
80
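A toy sketch of the butterfly idea: instead of a dense 1×1 convolution that mixes C channels with C² weights, stack log₂(C) butterfly stages of 2×2 mixes, roughly 2·C·log₂(C) weights, after which every output channel still depends on every input channel. This is only an illustration of the structure, not the released BFT code; the class name, init scale, and shapes are assumptions:

```python
# Illustrative butterfly-structured channel mixing; stands in for a 1x1 convolution.
# Not the official BFT implementation.
import math
import torch
import torch.nn as nn

class ButterflyChannelMix(nn.Module):
    def __init__(self, channels):
        super().__init__()
        assert channels & (channels - 1) == 0, "channels must be a power of two"
        self.stages = int(math.log2(channels))
        # one learnable 2x2 mixing matrix per channel pair per stage: 2*C*log2(C) weights total
        self.weights = nn.Parameter(0.1 * torch.randn(self.stages, channels // 2, 2, 2))

    def forward(self, x):                                # x: (batch, C, H, W)
        b, c, h, w = x.shape
        for s in range(self.stages):
            stride = 1 << s                              # pair channels whose indices differ in bit s
            groups = c // (2 * stride)
            xs = x.reshape(b, groups, 2, stride, h, w)
            a, d = xs[:, :, 0], xs[:, :, 1]              # the two halves of each butterfly pair
            wm = self.weights[s].reshape(groups, stride, 2, 2)
            w00 = wm[:, :, 0, 0][None, :, :, None, None]
            w01 = wm[:, :, 0, 1][None, :, :, None, None]
            w10 = wm[:, :, 1, 0][None, :, :, None, None]
            w11 = wm[:, :, 1, 1][None, :, :, None, None]
            x = torch.stack([w00 * a + w01 * d,
                             w10 * a + w11 * d], dim=2).reshape(b, c, h, w)
        return x

# Example: mixes 64 channels with 2*64*6 = 768 weights vs 64*64 = 4096 for a dense 1x1 conv.
mix = ButterflyChannelMix(64)
out = mix(torch.randn(2, 64, 8, 8))                      # same shape in and out: (2, 64, 8, 8)
```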
@KeivanAlizadeh2
Keivan Alizadeh
5 years
RT @ehsanik: Check out our new work (with Daniel Gordon, Dieter Fox and Ali Farhadi) on unsupervised representation learning from unlabeled….
0
17
0