rohan anil

@_arohan_

Followers
26K
Following
27K
Media
948
Statuses
9K

aspiring to understand deep learning

Joined December 2017
@_arohan_
rohan anil
1 day
The question someone should ask is: who has the most risk appetite?
1
0
4
@_arohan_
rohan anil
1 day
Number of VCs liking this from SF makes me think … there is space to raise money for a startup to make frontier roast dosas.
@_arohan_
rohan anil
2 days
Near the office. SF has stepped up its dosa game.
16
0
244
@itspawsh
itspawsh
3 months
Pawsh is coming soon!! With the aim to create beautifully designed, eco-friendly, and ultra-absorbent disposable dog pads
0
6
18
@_arohan_
rohan anil
2 days
I didn’t know that MZ and PC have committed most of their wealth to good things.
@latentspacepod
Latent.Space
2 days
Priscilla Chan and Mark Zuckerberg co-founded the Chan Zuckerberg Initiative (CZI) in 2015, committing 99% of their Meta shares to advance science, education, and opportunity. As a pediatrician and CEO of Meta respectively, they've built CZI into one of the most ambitious
4
0
27
@_arohan_
rohan anil
2 days
Near the office. SF has stepped up its dosa game.
69
9
892
@_arohan_
rohan anil
2 days
I mean from meta.
0
0
9
@Sonic_Rumble
Sonic Rumble
2 days
He’s back ⚡ The Sonic Movie 3 crossover event is live! Defend Shibuya Crossing, unlock Movie Shadow, and discover the new Neon Suit Silver in the Red Star Ring Shop.
29
185
928
@_arohan_
rohan anil
2 days
PyTorch was a good contribution to accelerating deep learning progress.
6
1
86
@_arohan_
rohan anil
3 days
No other words hurt more.
@jordihays
Jordi Hays
3 days
@zachpogrob @johncoogan @tbpn shampoo is a lie
3
1
82
@_arohan_
rohan anil
3 days
There is a tbd joke somewhere here
@shaneguML
Shane Gu
3 days
No-op is an undervalued policy in life
0
1
21
@vllm_project
vLLM
4 days
Amazing work by @RidgerZhu and the ByteDance Seed team — Scaling Latent Reasoning via Looped LMs introduces looped reasoning as a new scaling dimension. 🔥 The Ouro model is now runnable on vLLM (nightly version) — bringing efficient inference to this new paradigm of latent
@RidgerZhu
Rui-Jie (Ridger) Zhu
9 days
Thrilled to release new paper: “Scaling Latent Reasoning via Looped Language Models.” TLDR: We scale up looped language models to 2.6 billion parameters and pretrain them on > 7 trillion tokens. The resulting model is on par with SOTA language models of 2 to 3x size.
3
37
234
@livekit
LiveKit
4 days
LiveKit is hosting our first DevDay on Nov 18th. Join us for new product announcements, live demos, and time to connect with the team and community as we shape the future of voice AI. If you’re in the Bay Area, meet us at the Frontier Tower in San Francisco. If not, we’ll also
luma.com
LiveKit DevDay Overview Join us for the first LiveKit Developer Day — a gathering for builders, developers, and partners who are shaping the future of voice…
2
12
45
@shiringhaffary
Shirin Ghaffary
4 days
NEW: Amazon sent a cease and desist to Perplexity demanding it stop allowing its AI browser agent, Comet, to make purchases online for users. Perplexity is pushing back, accusing Amazon of bullying a smaller competitor and limiting user choice. https://t.co/9eWKGycKib
bloomberg.com
Amazon.com Inc. is suing Perplexity AI Inc. to try and stop the startup from helping users buy items on the world’s largest online marketplace, setting up a showdown that may have implications for...
6
20
112
@kimmonismus
Chubby♨️
5 days
Geoffrey Hinton contradicts the economists’ claim that AI could also create new jobs; on the contrary, he believes there will be massive unemployment and sees nothing new being created.
135
115
719
@danijarh
Danijar Hafner
5 days
Today is my last day at @GoogleDeepMind. After almost exactly 10 years at Google including 12 internships and the last 2 1/2 years full time, it really feels like a chapter coming to an end. I'm grateful for all the experiences and friends I've made at Google and DeepMind. I
147
51
2K
@_arohan_
rohan anil
6 days
People took this tweet seriously.
@agihippo
yi
7 days
ablations are for the weak. just yolo your runs. (ok, do some small amount of ablations, but don't overdo it). instinct is everything in ML and AI.
4
0
40
@_arohan_
rohan anil
7 days
Dropping a bit of Lore on this halloween that I got reminded of. Before the first TPU was taped out there was mostly async training of neural nets at Google production. The team was genuinely worried that sync training would be bad and there was a team considering figuring out
10
1
204
@_arohan_
rohan anil
8 days
Drinking wine and trick or treating is a good idea
0
0
16
@agarwl_
Rishabh Agarwal
8 days
I was puzzled by why their paper claims "bfloat16" training crashes -- since we trained for 100,000 GPU hours and 7K+ training steps for both dense and MoEs in the ScaleRL paper stably without any crashes. I think it matters what kind of GPUs they used -- they mention in the
@vwxyzjn
Costa Huang
8 days
This is amazing to see! Interestingly enough, I had a minimum repro on logprobs between bf16 / fp16 https://t.co/tSd5mitoX4, but I assumed bf16 is still needed for training. Really nice to see someone running end-to-end training and showing fp16 training is great!
19
29
446
@_arohan_
rohan anil
8 days
Credit attribution.
6
9
125
@_arohan_
rohan anil
9 days
Deciding between starting insanity hiit or just pushing on maximum weight lifting.
1
0
3