Ehsan Jahangiri
@ehsanjjjjj
Followers
86
Following
157
Media
0
Statuses
55
ML/SW engineer, math lover, multiplier, and always curious. - Principal MLE @ Nvidia, ex-Apple. - All opinions are mine.
San Francisco, CA
Joined June 2024
the face you make when you sell your vscode fork windsmurf for 3b
60
85
4K
Thinking of everyone in the path of Hurricane Milton. Please stay safe!
Hurricane Milton is fast approaching. If you're in Central Florida and have an iPhone 14 or later, you can use Emergency SOS via satellite to reach emergency services even when cell service is down. This feature lets you make calls and send texts through satellite connection.
16
96
1K
@HopfieldJohn and @geoffreyhinton, along with collaborators, have created a beautiful and insightful bridge between physics and AI. They invented neural networks that were not only inspired by the brain, but also by central notions in physics such as energy, temperature, system
BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton "for foundational discoveries and inventions that enable machine learning with artificial neural networks."
18
276
2K
Check out NotebookLM! Create a notebook, upload one or more sources (e.g. PDFs of research papers, your favorite PhD thesis, a newspaper article, etc) then click on 'Generate' to create a podcast of two voices talking about the content you've uploaded. https://t.co/FSCBvsr8tw
blog.google
NotebookLM is releasing Audio Overviews, which turns your sources into an engaging discussion.
94
378
2K
The Llama 3 paper is a must-read for anyone in AI and CS. It's an absolutely accurate and authoritative take on what it takes to build a leading LLM, the tech behind ChatGPT, Gemini, Copilot, and others. The AI part might seem small in comparison to the gargantuan work on *data*
Why do 16k GPU jobs fail? The Llama3 paper has many cool details -- but notably, has a huge infrastructure section that covers how we parallelize, keep things reliable, etc. We hit an overall 90% effective-training-time. https://t.co/hsSIW4bayK
12
289
2K
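The "90% effective-training-time" figure in the quoted tweet is easiest to appreciate with a quick back-of-the-envelope calculation. The sketch below is illustrative only; the run length, interruption count, and restart cost are placeholder assumptions, not figures from the Llama 3 paper.

```python
# Back-of-the-envelope: effective training time on a large GPU cluster.
# All numbers below are illustrative assumptions, not figures from the Llama 3 paper.

total_days = 54                      # hypothetical length of a pre-training run
num_interruptions = 400              # hypothetical hardware/software failures
minutes_lost_per_interruption = 15   # detect the failure + restart from checkpoint

downtime_days = num_interruptions * minutes_lost_per_interruption / (60 * 24)
effective_fraction = (total_days - downtime_days) / total_days

print(f"Downtime: {downtime_days:.1f} days")
print(f"Effective training time: {effective_fraction:.1%}")  # ~92% with these numbers
```

Even with failures every few hours, fast automated checkpoint-restarts keep the lost fraction small, which is why the effective-training-time can stay around 90% at 16k-GPU scale.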
It is remarkable that anyone can now train a 124M parameter LLM in about real-time on a MacBook M3. So easy to experiment. This would have been the stuff of dreams when I was in school. I ❤️ training neural nets, but I really admire the people who build the hardware.
16
29
533
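For context on the "124M parameter" figure, here is a minimal sketch that counts the parameters of a standard GPT-2-small-sized transformer (12 layers, width 768, 50257-token vocabulary), assuming the usual weight tying between the token embedding and the output head; it lands at roughly 124M.

```python
# Rough parameter count for a GPT-2-small-sized model (the ~124M config referenced above).
# Assumes the standard GPT-2 small hyperparameters and a tied embedding / LM head.

def gpt2_param_count(vocab=50257, ctx=1024, d=768, n_layer=12, d_ff=3072):
    tok_emb = vocab * d                     # token embedding (shared with LM head)
    pos_emb = ctx * d                       # learned position embedding
    per_layer = (
        2 * d                               # LayerNorm 1 (gain + bias)
        + d * 3 * d + 3 * d                 # fused QKV projection
        + d * d + d                         # attention output projection
        + 2 * d                             # LayerNorm 2
        + d * d_ff + d_ff                   # MLP up-projection
        + d_ff * d + d                      # MLP down-projection
    )
    final_ln = 2 * d
    return tok_emb + pos_emb + n_layer * per_layer + final_ln

print(f"{gpt2_param_count() / 1e6:.1f}M parameters")  # ~124.4M
```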
25 years ago today I officially started at a small search engine company, wedged into a tiny office space above what's now a T-Mobile store in downtown Palo Alto. Since then, I have had the incredible pleasure of working with awesome colleagues on software used by billions of
196
143
4K
Apple Intelligence is going to unlock a world of new possibilities for our users, and it's thrilling to see our developers begin to build with it. We're excited to see the amazing things they create.
1K
1K
15K
As Apple Intelligence is rolling out to our beta users today, we are proud to present a technical report on our Foundation Language Models that power these features on devices and cloud: https://t.co/TaAdd0fBOp. 🧵
machinelearning.apple.com
We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run…
13
190
708
There are only two hard things in Computer Science: cache invalidation and naming things. -- Phil Karlton
There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors. -- Leon Bambrick
There's two hard problems in computer science: we
1
0
2
Memory matters for LLMs. While everyone is rushing to provide the serverless Llama3-405b model, I want to talk about one key choice that matters a lot, especially for dedicated enterprise deployments when traffic is not very high: memory.
- The normal deployment of a model the
3
40
258
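To make the memory point above concrete, here is a rough, assumption-laden sketch of how much GPU memory the weights of a 405B-parameter dense model occupy at different precisions; the precision options and the 80 GB-per-GPU comparison are illustrative, not taken from the tweet.

```python
# Back-of-the-envelope memory math for serving a large dense LLM.
# The model size and precision choices below are illustrative assumptions.

def weight_memory_gb(params_billion, bytes_per_param):
    # params * bytes per parameter, expressed in GB
    return params_billion * 1e9 * bytes_per_param / 1e9

params_billion = 405  # e.g. Llama 3.1 405B
for name, bytes_per_param in [("bf16", 2), ("fp8", 1), ("int4", 0.5)]:
    gb = weight_memory_gb(params_billion, bytes_per_param)
    print(f"{name}: ~{gb:,.0f} GB just for the weights")

# With ~80 GB per GPU (e.g. an H100), bf16 weights alone are ~810 GB,
# i.e. more than a full 8-GPU node before counting KV cache and activations.
# At low traffic, memory footprint rather than compute often drives the
# deployment size and cost.
```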
Llama 3 is the new PyTorch
@brbcatonfire Mark Z explains that here: https://t.co/5WHf6gnzhL
12
11
133
Exclusive: Meta just released Llama 3.1 405B, the first-ever open-sourced frontier AI model, beating top closed models like GPT-4o across several benchmarks. I sat down with Mark Zuckerberg, diving into why this marks a major moment in AI history. Timestamps: 00:00 Intro
431
1K
9K
If your PhD advisor dresses like this, you don't have to worry about using neural nets in your thesis
13
27
796
Great points. Intelligence and knowledge are related but not the same; we are currently evaluating LLMs for their knowledge (very much tied to memory). Adaptability on the other hand is a big factor for intelligence.
LLM model size competition is intensifying… backwards! My bet is that we'll see models that "think" very well and reliably that are very very small. There is most likely a setting even of GPT-2 parameters for which most people will consider GPT-2 "smart". The reason current
1
0
1