Roger Waleffe Profile

Roger Waleffe (@RWaleffe)

Followers: 77 · Following: 23 · Media: 2 · Statuses: 26

Computer Sciences PhD student at the University of Wisconsin-Madison

Joined June 2020
Roger Waleffe (@RWaleffe) · 5 months
RT @ctnzr: Nemotron-H: A family of Hybrid Mamba-Transformer LLMs. * Hybrid architecture means up to 3X faster at the same accuracy. * Traine…
Roger Waleffe (@RWaleffe) · 1 year
RT @ctnzr: An 8B-3.5T hybrid SSM model gets better accuracy than an 8B-3.5T transformer trained on the same dataset: * 7% attention, the res…
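For intuition, a minimal sketch of the layer layout such a hybrid implies. This is illustrative only: the function name and the even-spacing heuristic are assumptions, and the actual Nemotron-H layer pattern is given in the paper. The idea is a small fraction of self-attention layers spread through an otherwise all-Mamba stack.

```python
def hybrid_layer_schedule(n_layers: int, attn_frac: float = 0.07) -> list:
    """Sketch: place a few attention layers, evenly spaced, in an
    otherwise all-SSM (Mamba-style) stack. attn_frac=0.07 mirrors the
    "7% attention" figure quoted above; the spacing rule is a guess."""
    n_attn = max(1, round(attn_frac * n_layers))
    stride = n_layers / n_attn
    attn_at = {round(stride / 2 + i * stride) for i in range(n_attn)}
    return ["attention" if i in attn_at else "mamba" for i in range(n_layers)]

# e.g., a 48-layer model gets 3 attention layers and 45 Mamba layers
print(hybrid_layer_schedule(48))
```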
Roger Waleffe (@RWaleffe) · 1 year
RT @thodrek: Data pruning to reduce pretraining costs is hot, but fancy pruning can take just as long to select data as to train on all of i…
Roger Waleffe (@RWaleffe) · 2 years
RT @DisseminatePod: 🚨 "MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks" with @RWaleffe is available now! 🎧…
Roger Waleffe (@RWaleffe) · 2 years
Joint work with Patrik Okanovic, @vmageirakos, Kostis Nikolakakis, @aminkarbasi, @DKalogerias, @nmervegurel, @thodrek.
Roger Waleffe (@RWaleffe) · 2 years
See the preprint here for extensive evaluations, together with the convergence analysis and a discussion of its generalization.
Roger Waleffe (@RWaleffe) · 2 years
Performance? *The* reduction in time-to-accuracy! Take ImageNet as an example and use less than 10% of the data each epoch: accuracy improves by up to 29% over competing pruning methods while offering a 7x runtime reduction!
Roger Waleffe (@RWaleffe) · 2 years
Not convinced about using random sampling for data pruning? Think twice! In our recent work, we introduce Repeated Sampling of Random Subsets, where we sample a subset of the data at each epoch of training instead of only once at the beginning!
Roger Waleffe (@RWaleffe) · 3 years
RT @ImmanuelTrummer: Roger Waleffe (@RWaleffe) from @wiscdb introduces the Marius++ system! Check out the talk: @W…
Roger Waleffe (@RWaleffe) · 3 years
RT @ImmanuelTrummer: Roger Waleffe (@RWaleffe) shows how to train over billion-scale graphs on a single machine! Join us at 1 PM ET via Zo…
Roger Waleffe (@RWaleffe) · 3 years
RT @thodrek: Scalability is a key factor limiting the use of Graph Neural Networks (GNNs) over large graphs; w/ @RWaleffe, @JasonMohoney, …
Roger Waleffe (@RWaleffe) · 4 years
RT @thodrek: Accepted to #OSDI21: @JasonMohoney & @RWaleffe show how to train massive graph embeddings on a 𝘀𝗶𝗻𝗴𝗹𝗲 𝗺𝗮𝗰𝗵𝗶𝗻𝗲; don't burn $$$$…
Roger Waleffe (@RWaleffe) · 5 years
RT @StatMLPapers: Principal Component Networks: Parameter Reduction Early in Training (arXiv:2006.13347v1 [cs.LG])
Roger Waleffe (@RWaleffe) · 5 years
RT @thodrek: 3/3 We term these networks Principal Component Networks (PCNs). Practical results: We show that converting wide networks to…
Roger Waleffe (@RWaleffe) · 5 years
RT @thodrek: 2/3 The secret sauce: Hidden layer activations in wide networks live in small subspaces! Train your wide-net for a few epochs,…
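A minimal sketch of that subspace trick for a single linear layer. The function name and the weight-folding details are illustrative assumptions; the actual PCN construction is in the paper (arXiv:2006.13347). The idea: estimate the principal subspace of the layer's input activations after a few epochs of training, then fold the projection into a smaller layer.

```python
import torch

def pcn_compress_linear(layer: torch.nn.Linear, acts: torch.Tensor, k: int):
    """Illustrative PCA-based compression of one linear layer.
    acts: (N, in_features) inputs to the layer, collected after a few
    epochs of training the wide network. Uses the approximation
    x ≈ mean + U Uᵀ (x − mean) with U the top-k principal directions,
    so that y = W x + b ≈ (W U) z + b' with z = Uᵀ x."""
    mean = acts.mean(dim=0)
    U = torch.pca_lowrank(acts - mean, q=k)[2]       # (in_features, k)
    small = torch.nn.Linear(k, layer.out_features)
    with torch.no_grad():
        small.weight.copy_(layer.weight @ U)         # fold projection into W
        small.bias.copy_(layer.bias + layer.weight @ (mean - U @ (U.T @ mean)))
    return U, small

# Usage: for inputs x of shape (batch, in_features), small(x @ U) ≈ layer(x)
```

Applying this layer by layer yields the smaller network the thread describes; per tweet 2/3, training then continues in the compressed parameterization.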
Roger Waleffe (@RWaleffe) · 5 years
RT @thodrek: 1/3 Super exciting new result by Roger (@RWaleffe) on how to find small networks that exhibit the same performance as overpara…