Soumith Chintala Profile
Soumith Chintala

@soumithchintala

Followers
186,905
Following
887
Media
180
Statuses
3,474

Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.

New York City
Joined September 2009
@soumithchintala
Soumith Chintala
9 months
No More GIL! the Python team has officially accepted the proposal. Congrats @colesbury on his multi-year brilliant effort to remove the GIL, and a heartfelt thanks to the Python Steering Council and Core team for a thoughtful plan to make this a reality.
70
1K
5K
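For context on what removing the GIL changes in practice, here is a minimal sketch (not from the tweet) of CPU-bound work spread across plain threads: on a conventional GIL build the threads serialize, while on a free-threaded CPython build they can run on separate cores. The prime-counting workload and thread count are arbitrary illustrations.

```python
# CPU-bound work in plain threads. With the GIL, this serializes; on a
# free-threaded build the four workers can actually run in parallel.
import time
from concurrent.futures import ThreadPoolExecutor

def count_primes(limit: int) -> int:
    # Deliberately CPU-bound, pure-Python work.
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(count_primes, [200_000] * 4))
    print(results, f"{time.perf_counter() - start:.2f}s")
```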
@soumithchintala
Soumith Chintala
8 days
Apparently Google laid off their entire Python Foundations team, WTF! (@SkyLi0n, one of the pybind11 maintainers, just informed me, asking how they can re-fund pybind11.) The team seems to have done substantial work that's critical for Google internally as well.…
Tweet media one
123
563
4K
@soumithchintala
Soumith Chintala
3 months
If you have questions about why Meta open-sources its AI, here's a clear answer in Meta's earnings call today from @finkd
Tweet media one
72
417
3K
@soumithchintala
Soumith Chintala
2 years
It’s been 5 years since we launched @pytorch . It’s much bigger than we expected -- usage, contributors, funding. We’re blessed with success, but not perfect. A thread (mirrored at ) about some of the interesting decisions and pivots we’ve had to make 👇
26
279
2K
@soumithchintala
Soumith Chintala
4 years
This is a new Microsoft.
- WSL CUDA/GPU Support
- native `winget` package manager
- VSCode, Edge, buying Github
They are listening. They are admitting failure. They are marching on. Considering the size and age of the company, that's really impressive.
35
256
2K
@soumithchintala
Soumith Chintala
11 months
I might have heard the same 😃 -- I guess info like this is passed around but no one wants to say it out loud. GPT-4: 8 x 220B experts trained with different data/task distributions and 16-iter inference. Glad that Geohot said it out loud. Though, at this point, GPT-4 is…
@pommedeterre33
Michaël Benesty
11 months
Unexpected description of GPT4 architecture from geohotz in a recent interview he gave. At least it’s plausible.
Tweet media one
14
92
544
57
387
2K
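For readers unfamiliar with the "N experts" phrasing in the rumor above, here is a toy mixture-of-experts layer in PyTorch. It only illustrates the routing idea; the dimensions, number of experts, and top-k routing below are illustrative assumptions, not GPT-4's actual design.

```python
# Toy mixture-of-experts layer: a router picks the top-k experts per token and
# mixes their outputs. Purely illustrative sizes and routing.
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    def __init__(self, d_model=64, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        ])
        self.top_k = top_k

    def forward(self, x):                              # x: (tokens, d_model)
        weights = self.router(x).softmax(dim=-1)       # (tokens, n_experts)
        top_w, top_idx = weights.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, k] == e              # tokens routed to expert e
                if mask.any():
                    out[mask] += top_w[mask, k].unsqueeze(1) * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(ToyMoE()(x).shape)   # torch.Size([10, 64])
```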
@soumithchintala
Soumith Chintala
4 months
Can finally talk some GPU numbers publicly 🙃 By the end of the year, Meta will have 600k H100-equivalent GPUs. Feel free to guess what's already deployed and being used 😉!
Tweet media one
86
190
2K
@soumithchintala
Soumith Chintala
2 years
Anyone else feel burned out by a new AI breakthrough every week? 🤯 Trying to keep up, but it goes by so fast; most of it is not easily or locally reproducible, which adds to the stress 😂😂😂
61
102
2K
@soumithchintala
Soumith Chintala
3 years
PyTorch co-author Sam Gross ( @colesbury ) has been working on removing the GIL from Python. Like...we can start using threads again instead of multiprocessing hacks! This was a multi-year project by Sam. Great article summarizing it:
13
321
2K
@soumithchintala
Soumith Chintala
2 months
Here are details on Meta's 24k H100 Cluster Pods that we use for Llama3 training.
* Network: two versions, RoCEv2 or Infiniband.
* Llama3 trains on RoCEv2
* Storage: NFS/FUSE based on Tectonic/Hammerspace
* Stock PyTorch: no real modifications that aren't upstreamed
* NCCL with…
91
198
1K
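A minimal sketch of what "stock PyTorch" plus NCCL looks like in practice: a DistributedDataParallel training skeleton launched with torchrun. This is generic illustrative boilerplate, not Meta's Llama training code; the model and hyperparameters are placeholders.

```python
# Launch with: torchrun --nproc_per_node=8 train.py
# Stock PyTorch data parallelism; gradients are all-reduced over NCCL.
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")     # NCCL for GPU collectives
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = DDP(torch.nn.Linear(4096, 4096).cuda(local_rank),
                device_ids=[local_rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for _ in range(10):
        x = torch.randn(8, 4096, device=f"cuda:{local_rank}")
        loss = model(x).pow(2).mean()           # placeholder objective
        loss.backward()                         # all-reduce happens here
        opt.step()
        opt.zero_grad()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```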
@soumithchintala
Soumith Chintala
5 months
PyTorch's design origins: its connection to Lua, its intertwined deep connection to JAX, its symbiotic connection to Chainer. The groundwork for PyTorch originally started in early 2016, online, among a band of Torch7's contributors (Torch7: ~2010-2017). These days, we also…
43
279
1K
@soumithchintala
Soumith Chintala
3 months
Based on all the user-request videos that @sama 's been posting, it looks like sora is powered by a Game Engine, and generates artifacts and parameters for the Game Engine. 🤔
@sama
Sam Altman
3 months
here is sora, our video generation model: today we are starting red-teaming and offering access to a limited number of creators. @_tim_brooks @billpeeb @model_mechanic are really incredible; amazing work by them and the team. remarkable moment.
2K
4K
26K
76
79
1K
@soumithchintala
Soumith Chintala
3 years
Deep Learning is not yet enough to be the singular solution to most real-world automation. You need significant prior-injection, post-processing and other engineering in addition. Hence, companies selling DL models as an API have slowly turned into consulting shops.
26
186
1K
@soumithchintala
Soumith Chintala
6 months
This weekend has been very sad. My friends at @OpenAI swore that it had become a magical place, with the talent density, velocity, research focus and (yet) a product fit that is really generational. For such a place to break down in the cringiest way possible is doubly sad.
34
61
1K
@soumithchintala
Soumith Chintala
6 months
* In 2016, I thought OpenAI was just shady, with highly unrealistic statements
* In 2020, I thought OpenAI was doing awesome work, but a bit too hypey, and the AGI bonds were weird
* In 2022, I fully changed my opinion and I think OpenAI is just phenomenal for changing the world.…
@sama
Sam Altman
6 months
i loved my time at openai. it was transformative for me personally, and hopefully the world a little bit. most of all i loved working with such talented people. will have more to say about what’s next later. 🫡
7K
10K
96K
17
76
1K
@soumithchintala
Soumith Chintala
3 years
Maybe I can finally use matplotlib now without spending half a day googling the exact syntax and options!
@OpenAI
OpenAI
3 years
Data science with OpenAI Codex. Full video:
37
443
2K
24
119
1K
@soumithchintala
Soumith Chintala
10 months
LLaMa-2 from @MetaAI is here! Open weights, free for research and commercial use. Pre-trained on 2T tokens. Fine-tuned too (unlike v1). 🔥🔥🔥 Lets gooo.... The paper lists the amazing authors who worked to make this happen night and day. Be sure to thank…
Tweet media one
31
186
1K
@soumithchintala
Soumith Chintala
25 days
Meta announces 2nd-gen inference chip MTIAv2.
* 708 TF/s Int8 / 353 TF/s BF16
* 256MB SRAM, 128GB memory
* 90W TDP. 24 chips per node, 3 nodes per rack.
* standard PyTorch stack (Dynamo, Inductor, Triton) for flexibility
Fabbed on TSMC's 5nm process, it's fully programmable via the…
23
145
1K
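The "standard PyTorch stack (Dynamo, Inductor, Triton)" mentioned above is exposed through a single entry point, torch.compile: Dynamo captures the graph and Inductor lowers it (emitting Triton kernels on GPUs). A minimal sketch on the default backend; MTIA-specific lowering is not shown here.

```python
# torch.compile = Dynamo graph capture + Inductor code generation.
import torch

model = torch.nn.Sequential(
    torch.nn.Linear(512, 512),
    torch.nn.ReLU(),
    torch.nn.Linear(512, 10),
)

compiled = torch.compile(model)   # default Inductor backend
x = torch.randn(32, 512)
print(compiled(x).shape)          # torch.Size([32, 10])
```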
@soumithchintala
Soumith Chintala
7 years
NIPS Conference Registrations 2002 thru 2019. [2018] War erupts for tickets [2019] AI researchers discover time travel
Tweet media one
20
393
1K
@soumithchintala
Soumith Chintala
6 years
Tensor Comprehensions: einstein-notation like language transpiles to CUDA, and autotuned via evolutionary search to maximize perf. Know nothing about GPU programming? Still write high-performance deep learning. @PyTorch integration coming in <3 weeks.
13
431
1K
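Tensor Comprehensions used an Einstein-notation-style DSL that was compiled to CUDA and autotuned. As a rough illustration of that notation style (without TC's code generation or autotuning), the same kind of contraction can be written with torch.einsum:

```python
# Einstein-notation contraction: a batched matrix multiply as an index expression.
import torch

A = torch.randn(16, 32, 64)    # (batch, i, k)
B = torch.randn(16, 64, 128)   # (batch, k, j)

C = torch.einsum("bik,bkj->bij", A, B)
print(C.shape)                                     # torch.Size([16, 32, 128])
print(torch.allclose(C, torch.bmm(A, B), atol=1e-5))
```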
@soumithchintala
Soumith Chintala
4 years
2013 me would not have had that loud GPU desktop from craigslist if Colab was around. Colab Pro at $9.99 / month to get prioritized 24-hour stints of fast GPUs is a steal. It is a product designed for individuals: login, create, go. Thanks Google :)
14
93
1K
@soumithchintala
Soumith Chintala
1 year
I'm fairly puzzled by $NVDA skyrocketing. GenAI inference and fine-tuning will significantly outweigh GenAI training in overall compute. When it comes to inference and fine-tuning, NVIDIA's advantage in software won't hold much significance. They will inevitably have to face…
78
140
1K
@soumithchintala
Soumith Chintala
8 months
CodeLlama -- a version of Llama2 fine-tuned for code tasks -- is live now. Available in 7B, 13B and 34B.
Tweet media one
17
210
962
@soumithchintala
Soumith Chintala
5 months
Seems to solidly compete with GPT-4 on benchmarks. Google has existing customers and surfaces to start the feedback loop, without worrying about adoption. And Google will use TPUs for inference, so doesn't have to pay NVIDIA their 70% margins (like @OpenAI and @Microsoft has to…
@GoogleDeepMind
Google DeepMind
5 months
We’re excited to announce 𝗚𝗲𝗺𝗶𝗻𝗶: @Google ’s largest and most capable AI model. Built to be natively multimodal, it can understand and operate across text, code, audio, image and video - and achieves state-of-the-art performance across many tasks. 🧵
174
2K
6K
18
83
958
@soumithchintala
Soumith Chintala
6 years
googled for an error message in pytorch, read my own answer from a year ago. full circle! Then dug around some stats. There are 34,300 posts on the PyTorch forums, viewed 7.6 million times. I wrote 1800 of them. So cool :D
13
57
917
@soumithchintala
Soumith Chintala
5 years
Rethinking floating point for deep learning - Jeff Johnson at FAIR
- proposes non-linear floating point math -- more energy efficient, accurate
- no retraining or quantization before deployment
- Verilog, C++, PyTorch implementations available
1
269
869
@soumithchintala
Soumith Chintala
1 year
People getting mad about @OpenAI not releasing GPT4's research details.... the only tangible way to get back is to surpass GPT-4's results and release the details of how it was done. literally any other kind of criticism is a mere expression of anger and a big distraction.
71
68
839
@soumithchintala
Soumith Chintala
5 months
10 crazy years -- pytorch, detectron, segment-anything, llama, faiss, wav2vec, biggraph, fasttext, the Cake below the cherry, and so much more. Can't say we didn't change AI and to an extent the world.
@AIatMeta
AI at Meta
5 months
10 years of FAIR. 10 years of advancing the state of the art in AI through open research. We're celebrating the 10th anniversary of Meta's Fundamental AI Research team and continuing that legacy by sharing our work on three exciting new research projects today. Details below 🧵
28
157
794
23
60
833
@soumithchintala
Soumith Chintala
4 years
Debates on PyTorch vs TensorFlow were fun in 2017. There was healthy competition to innovate, and philosophical differences like Theano vs Torch, Emacs vs vim, or android vs iOS. Now both products look exactly the same, the debates are nonsense and boring. Please stop.
19
105
819
@soumithchintala
Soumith Chintala
8 years
Linux kernel push confirms, Intel to add dedicated neural network hardware ops into their future processors
Tweet media one
9
786
795
@soumithchintala
Soumith Chintala
2 months
reading "AI News" (previously Smol Talk) is probably the highest-leverage 45 mins I spend everyday on catching up with what's going on in AI. So much alpha, organized hierarchically; an exceptionally well curated (and summarized) daily newslettter. Probably only suited for…
14
70
796
@soumithchintala
Soumith Chintala
4 months
Perplexity has become my most used AI app by the end of 2023. I use it for fact-seeking questions -- including recent news / facts, summarizing opinions and recommendations on products and much more. ChatGPT + Browsing can do similar stuff, but it's like 100x slower, and is often…
@perplexity_ai
Perplexity
4 months
We are happy to announce that we've raised $73.6 million in Series B funding led by IVP with participation from NVIDIA, NEA, Bessemer, Elad Gil, Jeff Bezos, Nat Friedman, Databricks, Tobi Lutke, Guillermo Rauch, Naval Ravikant, Balaji Srinivasan.
145
319
3K
17
56
786
@soumithchintala
Soumith Chintala
3 years
Getting @satyanadella to talk about PyTorch... ✅ (Satya and I went to the same high school, Hyderabad Public School)
Tweet media one
13
46
775
@soumithchintala
Soumith Chintala
5 years
cancel meetings on your calendar until it sparks joy!
14
132
765
@soumithchintala
Soumith Chintala
2 years
PyTorch M1 GPU support (still in alpha) already being put to good use:
15
115
767
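For anyone wanting to try the M1 GPU support mentioned above, here is a minimal sketch using the MPS backend, with a CPU fallback on machines that don't have it:

```python
# Use the Apple-silicon (MPS) backend when available, otherwise fall back to CPU.
import torch

device = "mps" if torch.backends.mps.is_available() else "cpu"
model = torch.nn.Linear(128, 64).to(device)
x = torch.randn(8, 128, device=device)
print(model(x).device)
```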
@soumithchintala
Soumith Chintala
1 year
@OpenAI everyday is a struggle to keep up. please slow down :)
17
34
746
@soumithchintala
Soumith Chintala
1 year
can @huggingface quickly start a Twitter clone for AI/ML folks 🤪
29
33
755
@soumithchintala
Soumith Chintala
2 years
We just released AITemplate -- a high-performance Inference Engine -- similar to TensorRT but open-source. It is really fast! On StableDiffusion, it is 2.5x faster than the XLA based version released last week.
@AIatMeta
AI at Meta
2 years
Get faster, more flexible inference on GPUs using our newly open-sourced AITemplate, a revolutionary new inference engine that delivers up to 12X performance improvements on NVIDIA GPUs & 4X on AMD GPUs compared to eager-mode within Pytorch. Learn more:
10
152
754
19
116
730
@soumithchintala
Soumith Chintala
6 months
In 270 days, the Department of Commerce will determine whether they will allow open-weights or not. if you support open model weights and want something actionable to do, then figure out how to lobby your opinion to them.
Tweet media one
35
171
717
@soumithchintala
Soumith Chintala
17 days
Llama3 8B and 70B are out, with pretty exciting results!
* The ~400B is still training but results already look promising.
* Meta's own Chat interface is also live at
* TorchTune integration is shortly going live:
@Ahmad_Al_Dahle
Ahmad Al-Dahle
17 days
It’s here! Meet Llama 3, our latest generation of models that is setting a new standard for state-of-the-art performance and efficiency for openly available LLMs. Key highlights
• 8B and 70B parameter openly available pre-trained and fine-tuned models.
• Trained on more…
Tweet media one
35
208
999
15
95
715
@soumithchintala
Soumith Chintala
4 years
me watching @PyTorch be dragged into the AGI tweetstorm
17
19
699
@soumithchintala
Soumith Chintala
6 months
to my researcher friends (at @openai and those who watched this saga unfold) -- if you want to focus on doing solid *open* and published research, have access to lots of hardware -- there are a few options: Meta, Mistral, etc. don't let your work get lost in corporate sagas!…
22
54
691
@soumithchintala
Soumith Chintala
13 days
Llama3-70B has settled at #5. With 405B still to come next... I remember when GPT-4 released in March 2023, it looked nearly impossible to get to the same performance. Since then, I've seen @Ahmad_Al_Dahle and the rest of the GenAI org in a chaotic rise to focus,…
@lmsysorg
lmsys.org
13 days
Exciting update -- Llama-3 full result is out, now reaching top-5 on the Arena leaderboard🔥 We've got stable enough CIs with over 12K votes. No question now Llama-3 70B is the new king of open model. Its powerful 8B variant has also surpassed many larger-size models. What an…
Tweet media one
30
165
1K
17
48
681
@soumithchintala
Soumith Chintala
4 years
ML Code Completeness Checklist: consistent and structured information in the README makes your code more popular and usable. Sensible advice, backed by data. Proposed by @paperswithcode and now part of the NeurIPS Code Submission process. Read more:
Tweet media one
6
152
653
@soumithchintala
Soumith Chintala
2 months
the PyTorch codebase is 27 million tokens.
39
27
637
@soumithchintala
Soumith Chintala
6 years
Use gradient magnitude as a signal for gradient importance. Sort your gradients, find a threshold, clip your gradients, exchange sparse gradients, win.
Tweet media one
10
236
642
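The recipe in the tweet above, sketched as code: keep only the largest-magnitude gradient entries for exchange and accumulate the rest locally as a residual. The helper below is a hypothetical illustration with an arbitrary keep ratio, not any particular paper's reference implementation.

```python
# Top-k gradient sparsification sketch with local residual accumulation.
import torch

def sparsify_gradient(grad: torch.Tensor, residual: torch.Tensor, keep_ratio=0.01):
    acc = grad + residual                          # add back what was skipped before
    k = max(1, int(acc.numel() * keep_ratio))
    # k-th largest magnitude = (numel - k + 1)-th smallest
    threshold = acc.abs().flatten().kthvalue(acc.numel() - k + 1).values
    mask = acc.abs() >= threshold                  # keep top-k by magnitude
    sparse = torch.where(mask, acc, torch.zeros_like(acc))
    new_residual = acc - sparse                    # exchange `sparse`, keep the rest
    return sparse, new_residual

g = torch.randn(1000)
sparse, res = sparsify_gradient(g, torch.zeros_like(g), keep_ratio=0.05)
print(int((sparse != 0).sum()))                    # roughly 50 entries survive
```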
@soumithchintala
Soumith Chintala
5 years
I find it really cool that today, you can:
- go to
- select NVIDIA's open-source WaveGlow model
- open it in Google Colab
- run the model and listen to it synthesize speech
You can take that as a starting point and do further research.
4
129
641
@soumithchintala
Soumith Chintala
4 years
Super excited to welcome the PFN team to the @PyTorch community. With Chainer, CuPy, Optuna, MNCore, their innovations need no introduction. The community is going to get even more fun! :)
@PreferredNet
Preferred Networks
4 years
[News] Preferred Networks (PFN) migrates its DL platform from Chainer to PyTorch. Chainer moves to maintenance support. PFN jointly works with Facebook and the OSS community to develop PyTorch. For more information, please look at the news release:
2
217
426
6
142
642
@soumithchintala
Soumith Chintala
2 years
Small perks of joining the Linux Foundation! We spoke about ML Accelerators and Linux driver-land issues :D
@PyTorch
PyTorch
2 years
Two creators of passion projects that transformed the landscape of how we code today — Linus Torvalds and @soumithchintala meet for the first time, sharing a smile and a love for the open source community. #PyTorchFoundation
Tweet media one
18
68
685
1
40
636
@soumithchintala
Soumith Chintala
2 years
Big announcement: PyTorch Foundation! PyTorch has large core investments from many companies. So, we're creating a neutral foundation for securing assets and interests. Technical Governance is separate & secure in a Maintainer model. Here's more context:
7
108
644
@soumithchintala
Soumith Chintala
2 years
Everyone's been waiting for it! Thanks to Apple, working in close collaboration with the core team, for making this happen!
@PyTorch
PyTorch
2 years
We’re excited to announce support for GPU-accelerated PyTorch training on Mac! Now you can take advantage of Apple silicon GPUs to perform ML workflows like prototyping and fine-tuning. Learn more:
Tweet media one
79
710
3K
7
47
630
@soumithchintala
Soumith Chintala
4 years
- Take FasterRCNN
- Remove clunky NMS, Proposals, ROIAlign, Refinement and their gazillion hyperparameters
- Replace with Transformer
- Win!
Simplifies code and improves performance. Nice work from @fvsmassa (torchvision maintainer) and his collaborators at FAIR.
@AIatMeta
AI at Meta
4 years
We are releasing Detection Transformers (DETR), an important new approach to object detection and panoptic segmentation. It’s the first object detection framework to successfully integrate Transformers as a central building block in the detection pipeline.
Tweet media one
12
501
2K
5
104
630
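A quick way to poke at DETR's "fixed set of boxes, no NMS" design, assuming the facebookresearch/detr torch.hub entry point is still available (weights download on first use); the input below is a random stand-in for a normalized image.

```python
# Load DETR via torch.hub and run one forward pass.
import torch

model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()

img = torch.randn(1, 3, 800, 800)          # stand-in for a preprocessed image
with torch.no_grad():
    out = model(img)

# DETR predicts a fixed set of boxes directly: no proposals, no NMS.
print(out["pred_logits"].shape, out["pred_boxes"].shape)
```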
@soumithchintala
Soumith Chintala
1 year
The race has publicly begun! - GPT still blows everyone else out of the water wrt coding abilities, with Claude coming a close second. - Bard can't do coding yet (according to their FAQ) - LLaMa / Alpaca, etc. is having a huge community-driven moment
28
53
632
@soumithchintala
Soumith Chintala
7 months
Open LLMs need to get organized and co-ordinated about sharing human feedback. It's the weakest link with Open LLMs right now. They don't have 100m+ people giving feedback like in the case of OpenAI/Anthropic/Bard. They can always progress with a Terms-of-Service arbitrage, but…
@bindureddy
Bindu Reddy
7 months
The pace of open-source LLM innovation and research is breath-taking
I suspect that open-source will soon become unbeatable for anyone except maybe OpenAI
Here's why
- Open-source community is way bigger than any specific company
- Safety lobotomy and fear of bad press will…
107
281
2K
27
82
626
@soumithchintala
Soumith Chintala
6 years
"Everybody Dance Now" from Caroline Chan, Alyosha Efros and team transfers dance moves from one subject to another The only way I'll ever dance well. Amazing work!!!
5
202
619
@soumithchintala
Soumith Chintala
6 years
Don't miss out on internships. Get as many under your belt as you can before joining full-time! I always wish I could go intern at Deepmind, Brain or NVIDIA for the summer just to know more about what it's like to work there, but it's no longer possible to do a 3-month stint.
21
71
608
@soumithchintala
Soumith Chintala
1 year
Had a practical search today that Google totally failed to answer, failed too, but ChatGPT gave an answer that sounds plausible. Now, the conundrum is that I don't know whether ChatGPT made stuff up or gave an accurate answer haha.
Tweet media one
49
36
570
@soumithchintala
Soumith Chintala
6 years
Cloud TPUs are out, we'll start sketching out @PyTorch integration. The cost is $6.50 per TPU-hour right now. Hopefully when they get affordable, we will be ready with PyTorch support :) Thanks @googleresearch who have been very open to the conversation of @PyTorch integration.
4
115
562
@soumithchintala
Soumith Chintala
7 years
NVIDIA releases an open-source Deep Learning Inference chip design (based on Xavier), with full verilog source:
Tweet media one
6
304
555
@soumithchintala
Soumith Chintala
10 months
No, GPT-3 wasn't trained in 11 minutes. The GPT-3 architecture was trained on the C4 dataset to 2.69 log-probability in 11 minutes on 3584 H100 GPUs. Don't focus on the "11 minutes" -- because it's like saying "ResNet-50 was trained in 5 seconds on MNIST to 80% accuracy"
18
47
553
@soumithchintala
Soumith Chintala
2 years
This is not a research paper, this is a real-world product. Wow! Been following @runwayml from their early days (and visited their offices last year). Great set of people, strong creative and product sense. Watch out for them.
@runwayml
Runway
2 years
Introducing Inpainting! Easily remove any object from a video with a few brush strokes. In real-time. Like magic 🪄 Get started for free:
33
262
2K
6
59
555
@soumithchintala
Soumith Chintala
5 years
very excited about the @GoogleAI office in Bangalore!
@sundarpichai
Sundar Pichai
5 years
At #GoogleForIndia today, we announced Google Research India - a new AI research team in Bangalore that will focus on advancing computer science & applying AI research to solve big problems in healthcare, agriculture, education, and more. #GoogleAI
263
2K
9K
4
38
545
@soumithchintala
Soumith Chintala
10 months
Here's @MosaicML showcasing their results with @PyTorch 2.0 + AMD for LLM Training. They made "Zero Code Changes" to run on AMD. MI250 is already trending well, and IMO MI300X will be very competitive.
@DbrxMosaicAI
Databricks Mosaic Research
10 months
Introducing training LLMs with AMD hardware! MosaicML + PyTorch 2.0 + ROCm 5.4+ = LLM training out of the box with zero code changes. With MosaicML, the ML community has additional hardware + software options to choose from. Read more:
Tweet media one
9
149
685
16
77
543
@soumithchintala
Soumith Chintala
7 months
AI is just matrix multiplications. Brains are just biochemical interactions. Must be simple.
28
51
535
@soumithchintala
Soumith Chintala
7 years
FAIR releases faiss. Many uses: text2image by searching through 1B or 100B images? RL Agent with VERY LARGE memory?
Tweet media one
5
248
515
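Canonical faiss usage in miniature, for readers who haven't used it: build an index over embedding vectors and query nearest neighbors. A billion-scale deployment like the ones imagined above would use quantized and sharded indexes rather than the flat one shown here.

```python
# Exact nearest-neighbor search with a flat L2 index.
import numpy as np
import faiss

d = 128
database = np.random.random((10_000, d)).astype("float32")   # stand-in embeddings
queries = np.random.random((5, d)).astype("float32")

index = faiss.IndexFlatL2(d)     # exact L2 search
index.add(database)
distances, ids = index.search(queries, 4)   # 4 nearest neighbors per query
print(ids.shape)                 # (5, 4)
```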
@soumithchintala
Soumith Chintala
6 years
We're launching a FAIR Residency Program: a 1yr fixed-term research training program where you will work closely with researchers at FAIR. Deadline for applications is January 26, 2018. Do apply :)
11
169
514
@soumithchintala
Soumith Chintala
2 years
Twitter, if you're listening --
1. get rid of the crypto bots
2. a feed that does some kind of Tweet TF-IDF. If someone tweets once a month, I want to see that as visibly as someone who tweets 10 times a day.
3. a feed API, so that we can build custom feeds again
18
30
509
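One toy reading of the "Tweet TF-IDF" idea in point 2 above: weight each tweet inversely to its author's posting frequency, so infrequent posters surface as prominently as prolific ones. Purely illustrative; the scoring function and data are assumptions, not anyone's actual ranking algorithm.

```python
# IDF-style weighting: rarer authors get larger weights, so their tweets rank higher.
import math
from collections import Counter

tweets = [("alice", "t1"), ("bob", "t2"), ("bob", "t3"), ("bob", "t4"),
          ("carol", "t5"), ("bob", "t6")]

counts = Counter(author for author, _ in tweets)
total = len(tweets)

def score(author: str) -> float:
    return math.log(total / counts[author]) + 1.0

ranked = sorted(tweets, key=lambda t: score(t[0]), reverse=True)
print([t[1] for t in ranked])    # alice's and carol's tweets float to the top
```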
@soumithchintala
Soumith Chintala
7 years
Congrats to the DenseNets authors for winning the CVPR best paper award. Elegant work, well deserved!
Tweet media one
1
182
501
@soumithchintala
Soumith Chintala
3 years
Two exciting pieces of news from our robotics research today. 1/ DIGIT: a vision-based touch-sensor. Projects light into a gel in the "finger-tip"; a camera + model rapidly estimates the changes in the image to compute localized pressure. Announcing that it is commercially available now!
@AIatMeta
AI at Meta
3 years
Today, as part of a larger tactile-sensing ecosystem, we’re announcing two major advances: DIGIT, a commercially available touch-sensing hardware produced in partnership with GelSight, and ReSkin, a replaceable, low-cost tactile skin. Read more about our work in touch sensing:
17
84
612
12
88
497
@soumithchintala
Soumith Chintala
5 years
The speech team @ FAIR released wav2letter++:
- a fully convolutional speech recognition system
- a C++ ML library on top of ArrayFire
They recognized very early that C++ was their best option and dove in well before the PyTorch C++ API existed. See:
9
113
501
@soumithchintala
Soumith Chintala
3 years
Over the past year, I've been doing robotics at FAIR. It's been lots of fun. My personal research goal is to build home robots: cooking, cleaning, etc. (1/x)
11
22
490
@soumithchintala
Soumith Chintala
6 months
Currently, we overfit to exploring AI techniques that work well on NVIDIA GPUs. With the AI Accelerator sanctions on China -- one interesting result might be that China forks silicon and explores a different idea-space in AI techniques than the rest of the world.
23
44
476
@soumithchintala
Soumith Chintala
1 year
This take is so bad, it's hard to comprehend where to start taking it apart! For one, it starts with academic peer-review pathology: "paper too simple, so can't be innovative -- reject". It equates "fundamental innovations" to "architectural innovations" which is like ughhh...
@bengoertzel
Ben Goertzel
1 year
1) ChatGPT is super cool and fun but it's important to recall OpenAI made basically zero fundamental innovations. Actually the basic innovation behind the GPT software was made at Google Brain in Mountain View
133
421
3K
13
23
461
@soumithchintala
Soumith Chintala
7 years
I've always wondered how to optimally read math-heavy papers. This seems like good advice:
5
157
453
@soumithchintala
Soumith Chintala
5 years
Object Detection systems (commercial and academic) are trained on biased data. This disproportionately affects accuracy in lower-income households and continents like Africa & Asia. Work by my colleagues at FAIR using the Dollar Street dataset from GapMinder
Tweet media one
Tweet media two
Tweet media three
8
147
452
@soumithchintala
Soumith Chintala
4 years
. @OpenAI showed a cool code generation demo at #MSBuild2020 of a big language model trained on lots of GitHub repositories. The demo does some non-trivial codegen specific to the context. Eagerly waiting for more details! Video: starting at 28:45
5
155
447
@soumithchintala
Soumith Chintala
1 year
so excited to introduce @PyTorch 2.0, a year in the works. Still early, be gentle :)
@PyTorch
PyTorch
1 year
We just introduced PyTorch 2.0 at the #PyTorchConference , introducing torch.compile! Available in the nightlies today, stable release Early March 2023. Read the full post: 🧵below! 1/5
Tweet media one
23
524
2K
8
50
443
@soumithchintala
Soumith Chintala
29 days
Thanks to @JeffDean and @SingularMattrix for their great leadership today; and @fchollet @dwarak and many others at @GoogleDeepMind for quickly charting a good and aligned path forward together. We can go back to focusing on the unlimited amounts of good work ahead of us. (Jeff,…
16
25
446
@soumithchintala
Soumith Chintala
4 months
It pains me to see a poorly constructed benchmark coming from a credible source. I hope @anyscalecompute fixes things, and also consults other stakeholders before publishing such benchmarks. If I didn't know Anyscale closely, I would have attributed bad faith. Let's dive into…
@anyscalecompute
Anyscale
5 months
📈We’re excited to introduce the LLMPerf leaderboard: the first public and open source leaderboard for benchmarking performance of various LLM inference providers in the market. Our goal with this leaderboard is to equip users and developers with a clear understanding of the…
Tweet media one
10
45
169
8
45
309
@soumithchintala
Soumith Chintala
7 years
Pre-trained Word Embeddings for 90 languages trained using FastText, on Wikipedia. Even has my native Telugu!
Tweet media one
11
210
439
@soumithchintala
Soumith Chintala
5 years
It's incredible to see how far @pytorch has come as a community, while preserving our core values of pushing for simplicity and innovation. (1/x)
@PyTorch
PyTorch
5 years
Thank you to everyone who joined us for the PyTorch Developer Conference today both in San Francisco and via livestream around the world! #PTDC2019
Tweet media one
5
18
137
2
55
427
@soumithchintala
Soumith Chintala
5 months
Yesterday I read an 8-page paper. Breezed through it like a Netflix episode. Clear, concise, and considerate of my time. Somehow we've regressed to writing 30+ page epics (I'm guilty too).
16
7
435
@soumithchintala
Soumith Chintala
3 years
PapersWithCode go brrrrrrr....
@paperswithcode
Papers with Code
3 years
🎉 We've just crossed 5000 Datasets! 🎉 We now index and organize more than 5000 research datasets for machine learning. A huge thanks to the research community for their ongoing contributions. Browse the full catalogue here:
Tweet media one
15
489
2K
0
46
425
@soumithchintala
Soumith Chintala
10 months
Wow, didn't know this was happening. This is huge! scikit-learn added support for PyTorch and GPUs via array dispatch
@dillonniederhut
Dillon Niederhut PhD
10 months
New in @scikit_learn -- experimental support for building models on GPUs with @PyTorch - @thomasjpfan at #SciPy2023
Tweet media one
0
83
375
3
49
430
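A sketch of the experimental array-API dispatch being celebrated above. It assumes a scikit-learn version with array-API support (plus the array-api-compat package) and uses LinearDiscriminantAnalysis, one of the estimators that participates; with dispatch enabled, the computation stays in torch tensors rather than converting to numpy.

```python
# Experimental: fit a supported scikit-learn estimator directly on torch tensors.
import torch
import sklearn
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X = torch.randn(200, 8)
y = (X[:, 0] > 0).to(torch.float32)   # two illustrative classes

with sklearn.config_context(array_api_dispatch=True):
    lda = LinearDiscriminantAnalysis()
    X_trans = lda.fit_transform(X, y)  # stays in torch end to end
    print(type(X_trans))               # a torch.Tensor, not a numpy array
```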
@soumithchintala
Soumith Chintala
1 year
Are you familiar with @NumFOCUS ? It's a non-profit that sponsors numpy, pandas, jupyter, Julia, scikit, matplotlib, and more. They've had decreased funding this year, let's help. Donate to @NumFOCUS today, and I will match all donations up to $10,000 within the next 48 hours!
17
127
420
@soumithchintala
Soumith Chintala
7 years
Wasserstein GANs pretty aptly summarized in this reddit comment:
Tweet media one
3
185
428
@soumithchintala
Soumith Chintala
2 months
Rodney Brooks explains that, according to early AI research, intelligence was "best characterized as the things that highly educated male scientists found challenging", such as chess, symbolic integration, proving mathematical theorems and solving complicated word algebra…
16
52
424
@soumithchintala
Soumith Chintala
4 years
Congratulations Jeff and team! In 2015 TensorFlow pushed framework engineering up a level and pushed everyone forward. JAX seems to be doing the same in 2020, so thanks for continually funding great frameworks out of @GoogleAI
@JeffDean
Jeff Dean (@🏡)
4 years
When we released @TensorFlow as an open source project in Nov. 2015, we hoped external machine learning researchers & practitioners would find it as useful as we had internally at @GoogleAI . Very proud to see us hit 100M downloads! 2015 blog post:
16
149
956
2
20
413
@soumithchintala
Soumith Chintala
5 years
the Cerebras chip is a technological marvel -- a real, working full-wafer chip with 18GB of register file! It's probably one of the first chips where data "feed" will become the bottleneck, even for fairly modern networks. Congrats @CerebrasSystems !
10
110
411
@soumithchintala
Soumith Chintala
4 months
gpt-fast now supports mixtral-8x7B, in addition to gpt/llama. 1000 lines of simple pytorch code blazing it out!
Tweet media one
2
69
404
@soumithchintala
Soumith Chintala
3 months
stuff that I worked on but can't talk about 😊
@ylecun
Yann LeCun
3 months
@nembal Meta has its own AI assistant to help employees. It is trained on internal corporate data and code base. It's called Metamate.
12
14
262
8
9
411
@soumithchintala
Soumith Chintala
7 years
TF goes imperative with eager; PyTorch is getting static optimizations and becoming production-ready with JIT and ONNX. Worlds are slowly converging...
3
104
409
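The "JIT and onnx" path mentioned above, in miniature: trace a module into a static graph and export it to the ONNX interchange format. A minimal sketch using the long-standing torch.jit.trace and torch.onnx.export APIs; the model is a placeholder.

```python
# Trace to a static graph, then export to ONNX.
import torch

model = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU())
example = torch.randn(1, 16)

traced = torch.jit.trace(model, example)          # static graph via the JIT
torch.onnx.export(model, example, "model.onnx")   # interchange-format export
print(traced.graph)                               # inspect the captured graph
```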
@soumithchintala
Soumith Chintala
3 months
nothing short of mind-blowing! holy shit, the future is getting crazy!
@sama
Sam Altman
3 months
here is sora, our video generation model: today we are starting red-teaming and offering access to a limited number of creators. @_tim_brooks @billpeeb @model_mechanic are really incredible; amazing work by them and the team. remarkable moment.
2K
4K
26K
13
14
406
@soumithchintala
Soumith Chintala
1 year
so many mainstreamers pumping AI now. a good filter is to check if they were pumping crypto before lol.
10
37
403
@soumithchintala
Soumith Chintala
1 year
@OpenAI Join the folks in the business of pushing the limits of open-science: @MetaAI , @Stanford , @StabilityAI , @huggingface and others. Help make this happen.
9
22
404
@soumithchintala
Soumith Chintala
6 years
a bird's-eye view into Facebook's datacenter infra
Tweet media one
2
143
404
@soumithchintala
Soumith Chintala
6 years
This Nature Machine Intelligence thing is fine exploitation of researchers. Please let's not make it a thing. It's a decade when sci-hub has to exist hush-hush and Aaron Swartz was legally ambushed because he downloaded a bunch of research. Let that sink in. Signed.
@tdietterich
Thomas G. Dietterich
6 years
Several machine learning researchers have signed a statement regarding the upcoming launch of Nature Machine Intelligence. If you agree, I encourage you to sign this as well.
33
1K
2K
6
94
398
@soumithchintala
Soumith Chintala
6 years
Spot-pricing on TPUs is getting good. We have prototyped TPU-PyTorch support (w. Google engineers), hammering coverage and performance now. Promising times....
@JeffDean
Jeff Dean (@🏡)
6 years
Google Cloud TPUs now offer preemptible pricing at ~70% off the reserved instance pricing. This means, for example, that you can train a ResNet-50 model for ~$7.50 instead of $25, or a Transformer neural translation model for ~$13 instead of $41. See:
Tweet media one
4
184
558
5
85
402