Yulu Gan

@yule_gan

Followers: 154
Following: 599
Media: 6
Statuses: 35

PhD student @MITEECS @MIT_CSAIL @MIT_CBMM / ex @PKU1898 @MSFTResearch

Cambridge, MA
Joined October 2022
@yule_gan
Yulu Gan
8 months
New paper at #NeurIPS2024! In which we try to make a *small yet interpretable* model work. We use decision trees, which offer a fully transparent decision-making process, in an autoregressive manner to do language tasks. Paper: (1/n)
7
37
223
@yule_gan
Yulu Gan
20 days
RT @phillip_isola: Our computer vision textbook is now available for free online here: We are working on adding so…
0
623
0
@yule_gan
Yulu Gan
3 months
RT @RichardSSutton: I’ve changed so little. From my 1978 Bachelor’s thesis: “The adult human mind is very complex, but the question remain…
0
64
0
@yule_gan
Yulu Gan
5 months
RT @sainingxie: When I first saw diffusion models, I was blown away by how naturally they scale during inference: you train them with fixed…
0
70
0
@yule_gan
Yulu Gan
6 months
RT @deepseek_ai: 🚀 DeepSeek-R1 is here! ⚡ Performance on par with OpenAI-o1. 📖 Fully open-source model & technical report. 🏆 MIT licensed: D…
0
7K
0
@yule_gan
Yulu Gan
6 months
RT @DrJimFan: Introducing NVIDIA Cosmos, an open-source, open-weight Video World Model. It's trained on 20M hours of videos and weighs from…
0
753
0
@yule_gan
Yulu Gan
6 months
RT @kenneth0stanley: @daniel_mac8 @nickcammarata @jeffclune I think my former OpenAI colleague @nickcammarata makes a good point here that…
0
8
0
@yule_gan
Yulu Gan
6 months
RT @haotiant1998: Personal update: I am excited to share that I will join @GoogleDeepMind next week after defending my PhD thesis @MITEECS…
0
55
0
@yule_gan
Yulu Gan
6 months
RT @akarshkumar0101: Very excited to share ASAL! Artificial Life aims to recreate natural evolution, but is severely bottlenecked by hand-…
0
28
0
@yule_gan
Yulu Gan
7 months
RT @JeffDean: I and other members of the Gemini team are looking forward to chatting with @NeurIPS attendees tomorrow at the @GoogleDeepMin…
0
1
0
@yule_gan
Yulu Gan
7 months
RT @jeffclune: Likewise! Welcome to Vancouver and please come say hi if you want to meet and/or chat about any of the below topics (or any…
0
8
0
@yule_gan
Yulu Gan
7 months
I'll be presenting our poster on Thu, Dec 12, from 4:30 p.m. to 7:30 p.m. PST at East Exhibit Hall A-C, booth #4807. Come and say hi if you're around!
@yule_gan
Yulu Gan
8 months
New paper at #NeurIPS2024! In which we try to make a *small yet interpretable* model work. We use decision trees, which offer a fully transparent decision-making process, in an autoregressive manner to do language tasks. Paper: (1/n)
0
0
12
@yule_gan
Yulu Gan
8 months
RT @ShivamDuggal4: Current vision systems use fixed-length representations for all images. In contrast, human intelligence or LLMs (eg: Ope…
0
66
0
@yule_gan
Yulu Gan
8 months
RT @GuangyuRobert: What will a world look like with 100 billion digital human beings? Today we share our tech report on Project Sid – a gl…
0
250
0
@yule_gan
Yulu Gan
8 months
I like this reaction. An interesting theoretical question is: which is more powerful, ARDTs or multi-layer Transformers? Also, can it be scaled up?
@adamnohejl
Adam Nohejl
8 months
Oh, wow! It doesn't have to be a transformer, it doesn't even have to be a “neural” model. Decision trees can also model language and solve tasks!
1
0
3
@yule_gan
Yulu Gan
8 months
This project came out of an amazing collaboration with @GalantiTomer, Tomaso Poggio, and @EranMalach! Check out our paper for more details! (n/n)
0
0
4
@yule_gan
Yulu Gan
8 months
We use the input sentence “Lily and Tom loved to play together and they found” to visualize part of the decision-making process in the first decision tree of the ensemble, revealing how the model's internal process unfolds. (6/n)
0
0
6
@yule_gan
Yulu Gan
8 months
We also evaluate the model's reasoning abilities with the Big-Bench-Hard dataset, which involves assessing the truthfulness of propositions. This process mirrors a Turing machine, determining a definitive true or false outcome from the inputs. (5/n)
0
0
5
@yule_gan
Yulu Gan
8 months
Our model can continue stories in the TinyStories dataset by extending narratives in a manner similar to a finite state automaton. (4/n)
0
0
3
@yule_gan
Yulu Gan
8 months
To train the model, we (1) use Word2Vec to encode words, (2) build a dataset using a sliding window, with the previous n tokens as x and the (n+1)-th token as y, and (3) train decision trees on this dataset as a regression problem. (3/n)
0
0
6
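A rough sketch of step (2) in the tweet above, the sliding-window dataset, may make the setup easier to picture. It is not the authors' code. The tiny `TOY_EMBEDDINGS` table is a hypothetical stand-in for trained Word2Vec vectors, concatenating the n context vectors into one input row is an assumption, and the helper names `build_sliding_window_dataset` and `nearest_token` are made up for illustration.

```python
from typing import Dict, List, Tuple

Vector = List[float]

# Hypothetical 2-d embeddings; a real setup would use trained Word2Vec vectors.
TOY_EMBEDDINGS: Dict[str, Vector] = {
    "lily": [1.0, 0.0],
    "and": [0.0, 1.0],
    "tom": [1.0, 1.0],
    "played": [0.5, 0.5],
}


def build_sliding_window_dataset(
    tokens: List[str], embeddings: Dict[str, Vector], n: int
) -> Tuple[List[Vector], List[Vector]]:
    """Pair each window of n tokens (as x) with the following token (as y)."""
    xs: List[Vector] = []
    ys: List[Vector] = []
    for i in range(len(tokens) - n):
        context = tokens[i : i + n]
        # Assumption: the n context embeddings are concatenated into one row.
        xs.append([value for tok in context for value in embeddings[tok]])
        # The regression target is the embedding of the (n+1)-th token.
        ys.append(list(embeddings[tokens[i + n]]))
    return xs, ys


def nearest_token(vector: Vector, embeddings: Dict[str, Vector]) -> str:
    """Map a predicted vector back to the closest known token (squared L2)."""
    return min(
        embeddings,
        key=lambda tok: sum((a - b) ** 2 for a, b in zip(embeddings[tok], vector)),
    )


tokens = ["lily", "and", "tom", "played"]
X, Y = build_sliding_window_dataset(tokens, TOY_EMBEDDINGS, n=2)
```

For step (3), any multi-output tree regressor could be fit on `X` and `Y` (scikit-learn's `DecisionTreeRegressor` is one option, though the tweet does not say which implementation was used). A predicted vector can then be decoded with something like `nearest_token` and appended to the context to generate the next token autoregressively.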
@yule_gan
Yulu Gan
8 months
We show our model can compute complex functions, such as automata (Thm. 3), Turing machines (Thm. 6), and sparse circuits (Thm. 7), using “chain-of-thought” computations. Our analysis covers the size, depth, and efficiency of these models, highlighting their impressive computational power. (2/n)
0
0
10