Jay Alammar @JayAlammar profile

Jay Alammar

@JayAlammar

Followers

35,468

Following

1,260

Media

479

Statuses

1,773

Machine learning and language models R&D. Builder. Writer. Visualizing AI, ML, and LLMs one concept at a time. @Cohere .

https://t.co/vxmacFLzIZ

Joined April 2020

Pinned Tweet

Jay Alammar

@JayAlammar

2 months

And here you have it! The cover for Hands-On Large Language Models. And the animal is: *drum roll* The Red Kangaroo! Why the Red Kangaroo? The process of choosing cover animals is a closely guarded secret held deep within the legendary halls of @OReillyMedia . @MaartenGr and I

14

57

467

Jay Alammar

@JayAlammar

4 years

How GPT3 works. A visual thread. A trained language model generates text. We can optionally pass it some text as input, which influences its output. The output is generated from what the model "learned" during its training period where it scanned vast amounts of text. 1/n

31

778

2K

Jay Alammar

@JayAlammar

2 years

The Illustrated Stable Diffusion New post! Over 30 visuals explaining how Stable Diffusion works (diffusion, latent diffusion, CLIP, and a lot more).

31

500

2K

Jay Alammar

@JayAlammar

2 years

pip install scikit-learn It's easy to take for granted, but this single command gives you functionality I'd value at hundreds of thousands of dollars, if not more. Not to mention amazing documentation that beautifully weaves guides and references. Hats off to @scikit_learn

16

135

1K

Jay Alammar

@JayAlammar

2 years

A 🧵looking at DeepMind's Retro Transformer, which at 7.5B parameters is on par with GPT3 and models 25X its size in knowledge-intensive tasks. A big moment for Large Language Models (LLMs) for reasons I'll mention in this thread.

10

197

966

Jay Alammar

@JayAlammar

3 years

Presenting the Explainable AI Cheat Sheet: a high-level map to major categories of ML Explainability. Informed by excellent work by @ChristophMolnar @IAugenstein @sameer_ and others. Plenty of links!

6

203

815

Jay Alammar

@JayAlammar

2 years

AI image generation is the most recent mind-blowing AI capability. #StableDiffusion is a clear milestone in this development because it made a high-performance model available to the masses. This is how it works. 1/n

13

146

695

Jay Alammar

@JayAlammar

2 years

The Illustrated Retrieval Transformer New post! A visual look at language models that perform on par with GPT3 at 4% of the size.

5

127

628

Jay Alammar

@JayAlammar

3 years

Interfaces for Explaining Transformer Language Models A new blog post (with interactive explorables) to make transformers more transparent. It shows input saliency for generated text, and (VASTLY more interesting) neuron activations

4

154

628

Jay Alammar

@JayAlammar

8 months

What makes LLM tokenizers different from each other? GPT4 vs. FlanT5 Vs. Starcoder Vs. BERT and more Tokenizers are one of the key components of Large Language Models (LLMs). One of the best ways to understand what they do is to compare the behavior of different tokenizers. In

14

70

547

Jay Alammar

@JayAlammar

9 months

Our new short course, “Large Language Models with Semantic Search" is now live! In it, you'll learn how to use LLMs to build the next generation of search systems using concepts like embedding and reranking. Hope you enjoy it! What an incredible honor to

Andrew Ng

@AndrewYNg

9 months

We just released "Large Language Models with Semantic Search”, built with @cohere , and taught by @JayAlammar and @SerranoAcademy . Search is a key part of many applications. Say, you need to retrieve documents or products in response to a user query; how can LLMs help? You’ll

37

610

3K

15

71

474

Jay Alammar

@JayAlammar

2 years

How awesome are those visuals in @pandas_dev Getting Started Who did this? Seriously, kudos!

3

86

467

Jay Alammar

@JayAlammar

29 days

If the rise of LLMs caught you by surprise, here's your chance to get a preview of what's likely to be the next monumental jump in AI capabilities: LLM-backed agents that use software tools In this video, I'll walk you through the concepts and code of building an LLM-backed

5

100

465

Jay Alammar

@JayAlammar

3 years

Probing Classifiers: A Gentle Intro (Explainable AI for Deep Learning) New video! Probing Classifiers are an Explainable AI tool used to make sense of the representations that deep neural networks learn for their inputs.

2

92

450

Jay Alammar

@JayAlammar

2 years

Finetuning Text Embedding Models Achieving peak performance in tasks like text classification and semantic search often requires finetuning an embedding model. This is one of the key intuitions one needs to build when using Large Language Models.

3

63

440

Jay Alammar

@JayAlammar

2 years

We're launching @CohereAI Sandbox – open-source libraries to help developers experiment with language AI I've been working on topic modeling using LLMs: -1-

4

83

438

Jay Alammar

@JayAlammar

2 years

Intro to Basic Semantic Search A gentle guide to building simple semantic search features that go beyond keyword search. Uses sentence embeddings and Annoy to build a "similar questions" feature.

2

81

433
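The "similar questions" idea in the tweet above can be sketched in a few lines. This is a minimal illustration, not the code from the guide: the three-dimensional vectors are made up (a real system would get them from a sentence embedding model), and a brute-force cosine scan stands in for the approximate nearest-neighbor index that Annoy provides. The names `cosine` and `most_similar` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query_vec, index, top_n=2):
    """Return the top_n texts whose embeddings are closest to query_vec."""
    scored = [(cosine(query_vec, vec), text) for text, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_n]]

# Made-up embeddings: the two password questions point in a similar direction.
index = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "How can I change my password?": [0.8, 0.2, 0.1],
    "What is the refund policy?": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # assumed embedding of "I forgot my password"
results = most_similar(query, index, top_n=2)
```

Because the comparison happens in embedding space, the query can match questions that share no keywords with it, which is the step beyond keyword search that the guide describes. A library such as Annoy becomes useful when the index is too large to scan exhaustively.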

Jay Alammar

@JayAlammar

3 years

Ecco – See what your NLP language model is “thinking” Ecstatic to release my first open-source project! Interactive visualizations in Jupyter for @huggingface GPT2-based language models.

7

97

423

Jay Alammar

@JayAlammar

1 year

Big update to "The Illustrated Stable Diffusion" post 14 new and updated visuals. The biggest update is that forward diffusion is more precisely explained -- not as a process of steps (that are easy to confuse with de-noising steps). -1-

6

80

412

Jay Alammar

@JayAlammar

4 months

From the various tools that enable building solutions with large language models (LLMs), DSPy stands out to me as one of the most promising tools for building LLM pipelines. I got to speak to @lateinteraction and ask him to introduce DSPy and what he envisions for its future.

6

39

402

Jay Alammar

@JayAlammar

10 months

ChatGPT has Never Seen a Single Word (Despite Reading All of The Internet). Glance at LLM Tokenizers. New Video! It's fascinating that the actual input to language models is not exactly the text we pass them! Learn more about tokenizers, a key component of LLMs. Link in reply

10

70

404

Jay Alammar

@JayAlammar

2 years

When training binary classifiers in @PyTorch , make sure to use the correct binary loss for your network structure. BCEWithLogitsLoss improves numeric stability, but make sure you pass the actual logit output because it will apply the sigmoid itself.

2

47

395
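The loss tip above is easier to see with the arithmetic written out. The sketch below is plain Python, not PyTorch code: it assumes the standard stable formulation of binary cross-entropy on logits, max(z, 0) - z*y + log(1 + exp(-|z|)), and the function names are hypothetical. It shows that the logit version matches "sigmoid, then BCE", and what goes wrong when a probability is passed where a logit is expected.

```python
import math

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(p: float, y: float) -> float:
    """Plain binary cross-entropy on a probability p in (0, 1)."""
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

def bce_with_logits(z: float, y: float) -> float:
    """Binary cross-entropy computed directly from the raw logit z.

    Algebraically equal to bce_from_probability(sigmoid(z), y), but it
    never exponentiates a large positive number, so it cannot overflow.
    """
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

logit, target = 2.0, 1.0
stable = bce_with_logits(logit, target)
two_step = bce_from_probability(sigmoid(logit), target)

# The mistake the tweet warns about: feeding an already-sigmoided value
# into a loss that applies the sigmoid itself.
double_sigmoid = bce_with_logits(sigmoid(logit), target)

# An extreme logit still works in the stable form.
extreme = bce_with_logits(-1000.0, 1.0)
```

Here `stable` and `two_step` agree, while `double_sigmoid` is a different (larger) loss for the same prediction. With a logit of -1000, the naive `sigmoid` above would overflow inside `math.exp`, whereas the logit form returns a finite value.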

Jay Alammar

@JayAlammar

4 years

How GPT3 Works - Visualizations and Animations A compilation of my threads explaining GPT3. I'll still post early drafts here on Twitter, but that post is the proper & final home for them all. 1/n of the second thread

1

112

379

Jay Alammar

@JayAlammar

2 years

The Illustrated Retrieval Transformer New video! Language models are improved by giving them the ability to query a database or search the web for information. Here's a look at one way of doing that.

1

65

355

Jay Alammar

@JayAlammar

6 months

Tokenizers, and self-attention both lie at the heart of the LLM boom. Learn about them and more in the most recent post on the newsletter. LLM Tokenizers, Semantic Search Course, And Book Update #2 The update on attention is a teaser to a chapter

2

46

355

Jay Alammar

@JayAlammar

2 years

Software is eating the world. Machine learning is eating software. Transformers are eating machine learning. Oversimplifications, to be sure, but this trail of utility to economic value is evident and we don't yet understand how drastically it will shift economic value. 1/n

jordiae

@jordiae

2 years

AlphaStar (2019) vs. Gato (2022) architectures:

19

198

1K

5

50

328

Jay Alammar

@JayAlammar

2 years

Intro to Large Language Models with Cohere A high-level look at large language models and some of their applications for language processing. It covers text generation models (like GPT) and representation models (like BERT).

2

73

332

Jay Alammar

@JayAlammar

3 years

Inspecting Neural Networks with Canonical Correlation Analysis - A gentle Intro New Video! Methods like CKA, PWCCA, and SVCCA serve as similarity measures revealing to us insights into how a neural network processes its inputs.

4

71

328

Jay Alammar

@JayAlammar

1 year

Despite the Generative AI craze, one of the most exciting and reliably useful areas of AI is not generative at all. It is search. Learn about Neural Search from @Nils_Reimers , creator of Sentence Transformers, and @CohereAI director of ML/embeddings

6

69

324

Jay Alammar

@JayAlammar

3 years

Einsum is a key method in summing and multiplying tensors. It's implemented in @numpy_team , @TensorFlow , AND @PyTorch . Here's a visual intro to Einstein summation functions. 1/n

2

56

317
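One way to read an einsum string is as a set of nested loops: every index letter is a loop, and any letter missing from the output side is summed over. The sketch below spells that out in plain Python for three common patterns. The function names are hypothetical, and the comments name the `np.einsum` call each one is meant to mirror.

```python
def matmul_ij_jk_ik(a, b):
    """Loop reading of 'ij,jk->ik', i.e. np.einsum('ij,jk->ik', a, b)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(cols):
            for j in range(inner):  # j is absent from the output, so it is summed
                out[i][k] += a[i][j] * b[j][k]
    return out

def row_sums_ij_i(a):
    """Loop reading of 'ij->i': j is dropped, so each row is summed."""
    return [sum(row) for row in a]

def trace_ii(a):
    """Loop reading of 'ii->': the repeated index walks the diagonal."""
    return sum(a[i][i] for i in range(len(a)))

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]

product = matmul_ij_jk_ik(A, B)
sums = row_sums_ij_i(A)
trace = trace_ii(A)
```

The same notation covers transposes, batched products, and outer products by changing only the index string, which is why it is convenient across NumPy, TensorFlow, and PyTorch.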

Jay Alammar

@JayAlammar

4 years

How GPT-3 Works - Easily Explained with Animations A gentle and visual look at how the API/model works under the hood -- including how the model is trained, and how it calculates its predictions. New Video!

How GPT3 Works - Easily Explained with Animations

The GPT3 model from OpenAI is a new AI system that is surprising the world by its ability. This is a gentle and visual look at how it works under the hood --...

www.youtube.com

6

69

316

Jay Alammar

@JayAlammar

3 years

This Intro to Deep Unsupervised Learning is excellent. It's presented by Alec Radford, the first author of papers including GPT, GPT2, DCGAN, and CLIP. Covers word2vec, Glove, RNNs, ELMo, BERT, T5, Electra, and more.

L11 Language Models -- guest instructor: Alec Radford (OpenAI) ---...

Course homepage: https://sites.google.com/view/berkeley-cs294-158-sp20/home Lecture Instructor: Alec Radford (OpenAI) Course Instructors: Pieter Abbeel, Aravind...

www.youtube.com

2

51

318

Jay Alammar

@JayAlammar

2 years

A Visual Guide to Prompt Engineering Large GPT language models are rising in prominence as language processing and generation tools. They can write, paraphrase, and summarize, but they can also classify. This is a gentle starting guide to prompts.

5

67

309

Jay Alammar

@JayAlammar

4 years

How does BERT answer questions? In this explorable, @betty_v_a shows how the layers of BERT successively mutate the representations of input words (question and context) so the correct answer ("bathroom") ends up isolated enough for the model to pick

1

75

308

Jay Alammar

@JayAlammar

1 year

Scatterplots are amazing for exploration. We use them all the time for text (using embeddings). It's the first time I get to explore a music scatter plot -- each point is 3 seconds of music. Fascinating work by @philtgun at

8

43

298

Jay Alammar

@JayAlammar

2 years

Entity Extraction with Large Language Models In this article and notebook, @nickfrosst and I walk you through extracting movie names from r/movies posts using a generative language model.

2

50

293

Jay Alammar

@JayAlammar

2 years

So many exciting things happening in ML these days. DeepMind's Gato is the direction I'm excited about the most. One small-ish model that learns text, images, playing video games, robotic sensors and control. Everything is a sequence! Let's work out how: 1/n

Nando de Freitas 🏳️‍🌈

@NandoDF

2 years

Two years in the making by a talented, collaborative, and fun team, and with enormous help and support from many others at @DeepMind . No better place to be! Congrats @scott_e_reed on this step.

18

30

352

7

47

289

Jay Alammar

@JayAlammar

6 months

The next generation of RAG applications will 1) include a query rewriting step 2) provide citations for its sources. This is an incredible visual guide on how to build it end-to-end. Colab:

Google Colab Notebook

Run, share, and edit Python notebooks

colab.research.google.com

cohere

@cohere

6 months

The Chat endpoint with RAG is easy to use, but it's also customizable. In document mode, the endpoint is highly modular. In this LLM University chapter, learn how to build a RAG-powered chatbot with the Chat, Embed, and Rerank endpoints.

1

27

131

6

61

292

Jay Alammar

@JayAlammar

1 year

New model alert! @CohereAI 's new embedding model supports 100+ languages and delivers 3X better performance than existing open-source models. See the post by @Nils_Reimers and @amrmkayid :

4

38

280

Jay Alammar

@JayAlammar

8 months

I'm writing an updated version of The Illustrated Transformer for the upcoming Hands-On LLMs book I'm co-writing with @MaartenGr . What updates/developments in the past 5 years do you feel should be a definitive addition to an intro to the architecture? Lots of additions to

20

34

272

Jay Alammar

@JayAlammar

5 months

I caught up with @abertsch72 at #NeurIPS2023 , who was presenting Unlimiformer, a retrieval-augmentation method for encoder-decoder models allowing unlimited length inputs. Paper: Unlimiformer: Long-Range Transformers with Unlimited Length Input Work with @urialon1 @gneubig , and

2

44

264

Jay Alammar

@JayAlammar

3 years

Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP) New video! A brief and highly accessible intro to BERT, where you have used it, and the various applications it powers.

2

48

258

Jay Alammar

@JayAlammar

2 years

The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning New video! A brief intro to the classic word embedding method.

The Illustrated Word2vec - A Gentle Intro to Word Embeddings in...

The concept of word embeddings is a central one in language processing (NLP). It's a method of representing words numerically -- as lists of numbers that ...

www.youtube.com

0

47

254

Jay Alammar

@JayAlammar

2 months

This week, we launched Command-R, which crowns @cohere 's stack of RAG-optimized language models. Join me in any of these upcoming dates as I break down this advanced-RAG, multilingual stack: March 12, 3PM: #SXSW2024 , Austin, Texas. Hacks / Hackers (Sign up:

5

23

238

Jay Alammar

@JayAlammar

4 years

I'm still in awe of and how it visually explains statistical concepts in an interactive manner.

5

50

245

Jay Alammar

@JayAlammar

2 years

Large Language Models for Real-World Applications - A Gentle Intro My talk from @PyData London is now online! It covers three top LLM use cases we see at @CohereAI (classification, semantic search, text generation). Here are the five main slides:

2

47

239

Jay Alammar

@JayAlammar

1 year

Remaking Old Computer Graphics With AI Image Generation New post! I take Dream Studio, Midjourney, and DALL-E for a test drive: recreating an old video game cinematic. In the end, I share my current impression of these services.

8

36

242

Jay Alammar

@JayAlammar

3 years

Ecstatic to see "Machine learning research communication via illustrated and interactive web articles" published at @rethinkmlpapers workshop at #ICLR2021 In it, I describe my workflow for communicating ML to millions of readers. Paper: 1/5

6

48

235

Jay Alammar

@JayAlammar

4 years

Just published! My "Visual Intro to Machine Learning and Deep Learning" talk at QCon 2020. A gentle intro to ML for software engineers where I go over 10 foundational concepts, 4 applications, and 3 tools to get you started on your journey.

4

50

229

Jay Alammar

@JayAlammar

4 years

The Narrated Transformer Language Model A new video! A high-level overview of transformer language models. It addresses both the transformer architecture and language modeling (as that makes a simpler intro than machine translation)

The Narrated Transformer Language Model

AI/ML has been witnessing a rapid acceleration in model improvement in the last few years. The majority of the state-of-the-art models in the field are based...

www.youtube.com

6

44

229

Jay Alammar

@JayAlammar

3 years

A Gentle Intro to Transformer language models and how Ecco makes them more transparent My talk at @PydataKhobar is now live! Thanks to the organizers. Colab:

Language Models and Ecco -- PyData Khobar.ipynb

Run, share, and edit Python notebooks

colab.research.google.com

3

38

221

Jay Alammar

@JayAlammar

3 years

Seeing Voices: 1 - Intro to Spectrograms New video! I have been captivated with this method that visualizes sound. It's used in ML for speech recognition, but is also opening the door to better understand animal communication and intelligence.

4

37

222

Jay Alammar

@JayAlammar

2 years

If you're a visual learner, be sure to check out @MeorAmer1 's Visual Intro to Deep Learning. Meor's ability to create visual language explaining ML concepts is absolutely remarkable.

1

34

219

Jay Alammar

@JayAlammar

5 months

Awesome poster presentation by @Muennighoff for the paper "Scaling Data-Constrained Language Models" at #NeurIPS2023 Kudos @srush_nlp @boazbaraktcs @Fluke_Ellington @olapiktus @Nouamanetazi Sampo Pyysalo @Thom_Wolf @colinraffel

2

29

220

Jay Alammar

@JayAlammar

2 years

Hats off to @psuraj28 @pcuenq @natolambert @PatrickPlaten for this great writeup explaining how Stable Diffusion works. The most helpful for me so far. @AICoffeeBreak video is also great

1

45

215

Jay Alammar

@JayAlammar

1 year

I'm going to be honest. I hyperventilated a little when I saw this dataset internally. All of Wikipedia. Embedded. Passage by passage. Not only English, but 9 other languages as well. Ecstatic to get to put it in your hands

cohere

@cohere

1 year

What could you build if you had the embeddings of ALL of wikipedia? The Embedding Archives: Millions of Wikipedia Article Embeddings in Many Languages We’re publishing ~100 million embedding vectors, covering Wikipedia in 10 languages. Get them now!

187

824

5K

7

30

213

Jay Alammar

@JayAlammar

1 year

Language Models and Machine Learning: What a Time for Language Models

What a Time for Language Models

New capabilities, new products, new possible futures

newsletter.languagemodels.co

9

55

212

Jay Alammar

@JayAlammar

2 years

Ecco v0.1.0 is out! Massive update. - Support for T5, T0, DeBERTa, and ability to add other/local models - Feature attribution via Integrated Gradients and many other methods - Support for Beam Search generation

GitHub - jalammar/ecco: Explain, analyze, and visualize NLP language models. Ecco creates interac...

Explain, analyze, and visualize NLP language models. Ecco creates interactive visualizations directly in Jupyter notebooks explaining the behavior of Transformer-based language models (like GPT2, B...

github.com

5

34

211

Jay Alammar

@JayAlammar

11 months

Good morning #ACL2023NLP ! Excited for my first ACL since Gathertown. Would love to say hi if you're here! I'll be tweeting my experience in this thread over the next few days.

6

9

212

Jay Alammar

@JayAlammar

1 month

AI Agents will take the abilities of LLMs to a whole new level. Here's how to build a simple agent that can use software tools like searching the web or writing and running python code (LLMs love to write @matplotlib code for you).

cohere

@cohere

1 month

Automate your enterprise workflows with Cohere's multi-step tool use. Our generative model Command R+ excels at leveraging external tools to execute complex tasks to streamline business operations. Get started today!

0

15

73

4

37

212

Jay Alammar

@JayAlammar

3 years

Behavioral Testing of ML Models (Unit tests for machine learning) New video! Creating unit tests for ML models gives us higher resolution understanding of model performance -- allowing us to better compare models and observe degradation.

4

48

204

Jay Alammar

@JayAlammar

2 years

Applying massive language models in the real world with @CohereAI This is a round up of some of my recent writings and collaborations on applying large language models at Cohere. They contain a bunch of intuitions for problem solving with LLMs.

0

43

202

Jay Alammar

@JayAlammar

1 year

What's the big deal with Generative AI? Is it the future or the present? New post! This is part 1 of reflections on how best to think of the current state of AI products and features, & avoid pitfalls people tend to make with new tech. Four main points:

8

43

203

Jay Alammar

@JayAlammar

5 months

The scale of #NeurIPS2023 is staggering. This is a look at just one of the poster sessions. If only AI could help us explore / understand / browse / better search all this knowledge..

9

25

202

Jay Alammar

@JayAlammar

8 months

Let's look at different tokenizers in action -- explaining so much of how an LLM "sees" text. New Video! (link in response) We have carefully crafted a piece of text that reveals so much about how an LLM parses its input. We pass it to BERT, GPT4, GPT2, Galactica, Starcoder,

3

39

202

Jay Alammar

@JayAlammar

5 months

LLM-backed agents have been some of the most futuristic LLM directions in 2023. The Voyager paper, presented here by coauthor @yuqi_xie5 at a #neurips2023 workshop, was certainly one of the most fascinating. With the right framing, a (text+code only) LLM can successfully

3

32

201

Jay Alammar

@JayAlammar

3 years

Oversimplified example of self-attention, the concept behind a lot of the current progress in AI/ML. Say a model needs to process the sentence: "A robot must obey the orders given 𝗶𝘁 by human beings" Self-attention helps the model resolve which word "𝗶𝘁" refers to.

5

21

191
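The self-attention example above can be made concrete with a toy computation. This is a heavily simplified sketch: it uses a single head, skips the learned query/key/value projections of a real transformer, and the two-dimensional vectors are made up so that "it" happens to sit close to "robot". In a trained model those relationships are learned, not hand-picked.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product attention where queries, keys, and values are the inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Score the query token against every token in the sentence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights = softmax(scores)
        # The output is a weighted mix of all token vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

tokens = ["robot", "orders", "it"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # hypothetical vectors
outputs, weights = self_attention(embeddings)
it_weights = dict(zip(tokens, weights[2]))  # how much "it" attends to each token
```

With these vectors, the row of weights for "it" puts more mass on "robot" than on "orders", which is the kind of signal the tweet describes the model using to resolve the reference.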

Jay Alammar

@JayAlammar

3 years

Favorite AI/ML Books: Intro to ML with Python (Book Review) New Video! I go over the awesome "Intro to ML with Python" by @amuellerml and Sarah Guido. A book that helped me understand many applied ML methods.

4

37

188

Jay Alammar

@JayAlammar

3 years

What are inductive biases? Can models make different predictions when trained on the same data? @RTomMcCoy distills the concept incredibly well in this one graphic. More: Video:

2

38

182

Jay Alammar

@JayAlammar

2 years

Top-k and Top-p are key parameters for controlling the output of GPT models. They are two possible decoding strategies (Or let's call them 'token picking methods') This is a visual look at how they work as the last step in GPT text generation.

2

34

183
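As a rough illustration of the two token-picking methods mentioned above, the sketch below filters a made-up next-token distribution. Top-k keeps a fixed number of candidates; top-p (often called nucleus sampling) keeps the smallest set of top candidates whose cumulative probability reaches p. The function names and the probabilities are hypothetical, and real implementations operate on tensors of logits over the full vocabulary.

```python
import random

def top_k_filter(probs, k):
    """Keep the k most likely tokens, then renormalize."""
    kept = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    kept, cumulative = [], 0.0
    for tok, prob in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept.append((tok, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {tok: prob / total for tok, prob in kept}

def sample(probs, rng):
    """Draw one token according to the (filtered) distribution."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]

next_token_probs = {"the": 0.5, "a": 0.25, "cat": 0.125, "dog": 0.125}

by_k = top_k_filter(next_token_probs, k=2)      # always exactly 2 candidates
by_p = top_p_filter(next_token_probs, p=0.8)    # as many as needed to reach 0.8
picked = sample(by_p, random.Random(0))
```

The practical difference is that top-k always keeps the same number of candidates, while top-p adapts: a peaked distribution keeps few tokens and a flat one keeps many.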

Jay Alammar

@JayAlammar

1 month

LLMs are finally breaking free from short context lengths using methods like Ring Attention. Don't miss this visual explainer by @khshind @simonguozirui @bonniesjli

Kilian Haefeli

@khshind

1 month

How do state-of-the-art LLMs like Gemini 1.5 and Claude 3 scale to long context windows beyond 1M tokens? Well, Ring Attention by @haoliuhl presents a way to split attention calculation across GPUs while hiding the communication overhead in a ring, enabling zero overhead scaling

4

69

315

3

28

187

Jay Alammar

@JayAlammar

4 years

Jay's Visual Intro to AI I made a video introducing AI and some of its key business applications. I talk about the motivation of using AI, and the simple trick that lies at the heart of the majority of AI/ML applications in the real world.

Jay's Visual Intro to AI

A gentle visual introduction to Artificial Intelligence (AI) and some of its key commercial applications. In this first video, we explain the simple trick th...

www.youtube.com

5

57

183

Jay Alammar

@JayAlammar

4 years

I like this graphic from a @huggingface notebook on tokenization. It shows three tokenization schemes with examples, and how vocabulary size increases across different schemes. GPT's tokenization is similar to the one in the middle.

1

44

184
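To make the three schemes concrete, here is a small sketch that splits the same text by character, by subword, and by word. The subword splitter is a greedy longest-match stand-in with a tiny hand-written vocabulary. It is not the byte-pair encoding that GPT models use, and `SUBWORD_VOCAB` is purely hypothetical.

```python
def char_tokenize(text):
    """Smallest vocabulary, longest sequences."""
    return list(text)

def word_tokenize(text):
    """Largest vocabulary, shortest sequences."""
    return text.split()

def subword_tokenize(text, vocab):
    """Greedy longest-match split of each word into pieces found in vocab."""
    pieces = []
    for word in text.split():
        start = 0
        while start < len(word):
            end = len(word)
            # Shrink the candidate piece until it is in the vocabulary.
            while end > start and word[start:end] not in vocab:
                end -= 1
            if end == start:
                end = start + 1  # unknown piece: fall back to one character
            pieces.append(word[start:end])
            start = end
    return pieces

SUBWORD_VOCAB = {"token", "ization", "is", "fun"}  # made-up vocabulary
text = "tokenization is fun"

chars = char_tokenize(text)
subwords = subword_tokenize(text, SUBWORD_VOCAB)
words = word_tokenize(text)
```

The trade-off is visible in the lengths: characters give the longest sequence, words the shortest, and subwords sit in between while still being able to represent words that were never seen whole.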

Jay Alammar

@JayAlammar

8 months

One of the best investments you can make in your AI Engineering skillset is to be comfortable with the ideas of using language models for search. In "Using LLMs for Search with Dense Retrieval and Reranking", @SerranoAcademy and I give you the key intuitions for building this

cohere

@cohere

8 months

In our latest blog post, Cohere's Head of Developer Relations @SerranoAcademy and Engineering Director @JayAlammar provide a comprehensive overview of how to use LLMs to power state-of-the-art search.

93

150

1K

2

34

186

Jay Alammar

@JayAlammar

3 months

Guess the animal on the cover of our upcoming Hands-On Large Language Models book for a chance to win a free copy! There's a secret method that assigns the animals of @OReillyMedia books. Even @MaartenGr and I, as authors, don't know what the animal is until it is assigned.

138

14

180

Jay Alammar

@JayAlammar

3 years

Favorite python books: Effective Python New video! I go over @haxor 's excellent advanced python book with recommendations on how to make your code more pythonic.

Favorite Python Books: Effective Python

Effective Python is a great book if you know a bit of python and want to take your skills to the next level. It contains tens of tips indicating better ways ...

www.youtube.com

3

21

175

Jay Alammar

@JayAlammar

3 years

Finding the Words to Say: Hidden State Visualizations for Language Models New post! Visualizations glancing at the "thought process" of language models & how it evolves between layers. Builds on awesome work by @nostalgebraist @lena_voita @tallinzen . 1/n

1

38

174

Jay Alammar

@JayAlammar

3 months

AI Agents are some of the most drastic technological changes on the horizon. I asked CMU professor @gneubig about how best to define the current crop of AI agents and where he sees them going. Links to our full conversation are in a reply. We discussed LLM evaluations, new

3

30

164

Jay Alammar

@JayAlammar

3 years

So many fascinating ideas at yesterday's #blackboxNLP workshop at #emnlp2020 . Too many bookmarked papers. Some takeaways: 1- There's more room to adopt input saliency methods in NLP. With Grad*input and Integrated Gradients being key gradient-based methods.

2

37

156

Jay Alammar

@JayAlammar

3 years

Self-attention is an important component of the transformer, but not the only one. Some might misunderstand "Attention is all you need" to mean that all the key computation happens in attention layers. In reality, it's more like "Attention can replace recurrence/convolutions"

3

13

160

Jay Alammar

@JayAlammar

2 years

Combing For Insight in 10,000 Hacker News Posts With Text Clustering New blog post! I embedded and clustered the top HN posts looking for insight on personal/career development. I built an interactive map and found ~700 posts that fit the bill. 1/n

7

40

158

Jay Alammar

@JayAlammar

1 year

AI Art Explained: How AI Generates Images New video! If you want to know how AI generation works and how it's trained, this video is for you! With tens of original figures explaining the internal mechanics of diffusion models.

6

37

160

Jay Alammar

@JayAlammar

3 years

I just learned that the creator of the excellent sklearn cheat sheet is @amuellerml . This comes a day after I shot a video about his excellent ML Intro book which REALLY helped me learn ML when I started out. Technical communication wizard. Coming up next on the YouTube channel

1

22

153

Jay Alammar

@JayAlammar

3 years

The covariance matrix is an essential tool for analyzing relationships in data. In numpy, you can use the np.cov() function to calculate it. Here's a shot at visualizing the elements of the covariance matrix and what they mean: 1/5

2

22

156
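For readers who want to see what each element of the covariance matrix holds, the sketch below computes it by hand in plain Python. It assumes the same conventions as `np.cov` with default arguments: one row per variable, and normalization by N - 1. The variable names and the data are made up.

```python
def covariance_matrix(variables):
    """variables: list of rows, one row of observations per variable."""
    n_obs = len(variables[0])
    means = [sum(row) / n_obs for row in variables]
    size = len(variables)
    cov = [[0.0] * size for _ in range(size)]
    for a in range(size):
        for b in range(size):
            # Sum of products of deviations from each variable's mean.
            total = sum(
                (variables[a][t] - means[a]) * (variables[b][t] - means[b])
                for t in range(n_obs)
            )
            cov[a][b] = total / (n_obs - 1)  # sample covariance
    return cov

height = [1.0, 2.0, 3.0, 4.0]
weight = [2.0, 4.0, 6.0, 8.0]  # exactly 2 * height
cov = covariance_matrix([height, weight])
```

The diagonal entries `cov[0][0]` and `cov[1][1]` are the variances of each variable, and the off-diagonal entries are the covariance between the pair, so the matrix is symmetric. Here the off-diagonal value is positive because the two variables rise together.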

Jay Alammar

@JayAlammar

3 years

If you're curious how Github Copilot works, this is a gentle intro to GPT3 (the ancestor of Codex, which powers Copilot)

Jay Alammar

@JayAlammar

4 years

How GPT3 works. A visual thread. A trained language model generates text. We can optionally pass it some text as input, which influences its output. The output is generated from what the model "learned" during its training period where it scanned vast amounts of text. 1/n

31

778

2K

4

36

151

Jay Alammar

@JayAlammar

2 years

A Generalist Agent (Gato) - DeepMind's single model learns 600+ tasks New video! Gato's tokenization method maps tasks from text, vision, and control to token sequences learned by a single 1.18B param GPT model.

3

35

153

Jay Alammar

@JayAlammar

4 years

On the transformer side of #acl2020nlp , three works stood out to me as relevant if you've followed the Illustrated Transformer/BERT series on my blog: 1- SpanBERT 2- BART 3- Quantifying Attention Flow (1/n)

2

25

149

Jay Alammar

@JayAlammar

2 years

I had the pleasure of hosting @MaartenGr to speak about BERTopic, and discuss topic modeling, visualization, API design, modularity, and other topics. Watch it now! Episode #1 of Talking Language AI: Overview blogpost:

BERTopic for Topic Modeling - Maarten Grootendorst - Talking Language...

Go in-depth into BERTopic with creator Maarten Grootendorst. We explore three important pillars of the package, modularity, variations, and visualizations. E...

www.youtube.com

4

37

148

Jay Alammar

@JayAlammar

3 years

Ecstatic and honored that Ecco was published as an #ACL2021NLP demo paper! Ecco: An Open Source Library for the Explainability of Transformer Language Models v0.0.15 is out now!

1

25

148

Jay Alammar

@JayAlammar

2 years

My talk: Large Language Models for Real-World Applications - A Gentle Intro You may wonder what Kermit The Frog has to do with it..

Jay Alammar - Large Language Models for Real-World Applications - A...

Jay Alammar Presents: Large Language Models for Real-World Applications - A Gentle Intro. Machine language understanding and generation has been undergoing rapi...

www.youtube.com

1

17

148

Jay Alammar

@JayAlammar

2 years

A language model thinks this Dune review is negative: "I have a well-documented weakness for sci-fi and expected Dune to feed my soul. I didn't expect it to entirely blow my mind." Which input words lead to this prediction? These. Darker is more important.

6

16

142

Jay Alammar

@JayAlammar

3 years

Be sure to check the awesome NLP course by @lena_voita . It's highly visual, well animated, and even has interactive explorables (scroll down to 'Sampling with temperature' to get the intuition for the 'temperature' parameter in language models).

Alexis Perrier

@alexip

3 years

Just stumbled upon this fantastic NLP course by @lena_voita Includes embeddings, language modeling, Seq2seq and Attention and more

0

21

101

1

28

141
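The 'temperature' parameter mentioned above can be illustrated with a few lines of arithmetic. In the usual formulation, the logits are divided by the temperature before the softmax, so low temperatures sharpen the distribution and high temperatures flatten it. The logits below are made up and the function name is hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a stable softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens

sharp = softmax_with_temperature(logits, 0.5)    # more deterministic
neutral = softmax_with_temperature(logits, 1.0)  # plain softmax
flat = softmax_with_temperature(logits, 2.0)     # more diverse
```

The top token's probability shrinks as the temperature rises, which is why higher temperatures produce more varied (and less predictable) generations.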

Jay Alammar

@JayAlammar

2 months

LLM Developers loved Command R, some called it the RAG King, well, hang on till you meet Command R+. Out now! Open weights. Much, much more capable: - Multi-hop RAG: It takes RAG capabilities to a whole new level, when dealing with complex questions, it’s able to search for

Aidan Gomez

@aidangomez

2 months

⌘R+ Welcoming Command R+, our latest model focused on scalability, RAG, and Tool Use. Like last time, we're releasing the weights for research use, we hope they're useful to everyone!

26

190

984

2

21

140

Jay Alammar

@JayAlammar

4 years

I've been enjoying learning the Trax deep learning library. I've created an intro notebook to the Transformer Language Model (on which GPT is based). It's a great way to start learning how transformer models are built. 1/n

1

28

137

Jay Alammar

@JayAlammar

3 years

We live in an AWESOME age of enlightenment. Oh, you wanna learn about SVD? Have @luis_likes_math break it down for you[1]. Or have @3blue1brown show you how to bend space with your mind (& linear algebra) [2]. Or just attend the whole MIT course [3]. We're incredibly blessed.

6

13

138

Jay Alammar

@JayAlammar

3 years

The Unreasonable Effectiveness of RNNs (Article and Visualization Commentary) New Video! I comment on one of my favorite ML articles which helped me break into ML and NLP. We take a look at its visualizations of neuron firings.

2

13

131

Jay Alammar

@JayAlammar

10 months

Good morning #ACL2023NLP day #3 ! I'll be sharing more notes from the conference in this thread, but also.. POSTER PRESENTATION VIDEOS! If you're here, stop by the @cohere & @forai_ml booth and say hello to @max_nlp @Nils_Reimers @PSH_Lewis @sarahookr @SerranoAcademy

3

13

131

Jay Alammar

@JayAlammar

2 years

Great break down of TF-IDF by @c_brinton and @davidinouye1

4

19

129

Jay Alammar

@JayAlammar

5 months

Hello #NeurIPS2023 ! Looking forward to meeting everybody. Drop by booth 1109 and meet @cohere and @CohereForAI folks and discuss everyone's work ! [noticing @CShorten30 and @ecardenas300 walk-by in the end and, on cue, say hi]

5

14

129

Jay Alammar

@JayAlammar

1 year

#EMNLP2022 here we go!

2

13

126